Thanks! I am in the process of producing a report with benchmarks on several videos, not just the Prometheus trailer. I also have to clean up more patches and commit them to my tree.
Rest assured, I found plenty of stuff to continue working on in Altivec, so I'm not going to stop regardless of the bounty (for the record, I'm at 32/35 hours). Of course, nothing would please me more than to continue working on Altivec on a more permanent basis :)
Here are some benchmarks comparing the "unpatched" version (as in the one I forked from on github) with mine (up to rev 893065a), using 30 second clips (-t 30 on the command line). I have many more patches to apply but I have to clean them up first.
So, conclusions so far: not surprisingly, even though there are some obvious gains, it's going to be very hard to achieve proper h264 1080p playback on low-end machines. I still have some patches that may give an extra few %, but at least on my system (G4 @ 1GHz) I don't expect that to be enough. Maybe with more optimizations 720p will be playable, but I'm not really hopeful for 1080p. On the other hand, my machine is really low end, with a very slow memory bus.
However, I'll keep working on this regardless of the bounty, as it's extremely intriguing, so it can only get better I think :)
I'll be posting updates here as usual, and on my blog soon.
I know that the goal was not guaranteed, but given your results, is there any chance we will see "Prometheus h 264 1920x1080 works on a X1000", considering that the X1000 runs at 1.8 GHz?
It depends on the codec parameters and on the movie itself, and it actually changes within the same movie as well. For example, Prometheus is decoded at ~30fps in the beginning, as it's a plain green background with some letters. The more action-packed a scene is, the harder it is to compress/decompress, and hence the lower the FPS.
@zzd10h
What's the current fps on that one? (Don't forget that ffmpeg is benchmarking the decoding only, not the display itself, so you have extra overhead there.) Right now, on that particular movie, the gain is 4%. I expect it can get up to 6% with some more twiddling, but not much more. OTOH, there are other aspects that can be optimized as well, like ac3/aac decoding, better memory prefetching (as suggested by corto in a personal email) and general optimizations.
I'm sorry if that is not sufficient. The fact that my powerpc system is slow does not help the pace of development. I am thinking of getting a cheap old iMac G5 to help with that, as building stuff on a G4, even one upgraded with a SATA SSD, does not really help compile times :-/
@zzd10h Here it runs at 21 fps with RadeonHD 2.4+FE+RadeonHD6870. There are 4 fps missing. Note that there is room to optimize the RadeonHD 2.4 driver. I think it is possible in a few months.
21 fps is ~15% slower than needed. We've got 4-5% covered, so we need an extra 10% improvement, which is not trivial, but it's not unheard of either. I'm always saying "there is _always_ room for improvement", but it's unclear how much.
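For anyone who wants to redo that arithmetic with their own numbers, here is a rough Python sketch. It assumes a target of 25 fps (the 21 fps measured plus the 4 fps reported missing) and a 5% decoder gain. The function names are made up for illustration, and it ignores display overhead entirely.

```python
# Rough sketch only: the target rate and the gain are assumptions taken
# from the numbers quoted in this thread, not measured values.

def shortfall(current_fps, target_fps):
    """Fraction by which current_fps falls short of target_fps."""
    return 1.0 - current_fps / target_fps


def required_speedup(current_fps, target_fps):
    """Fractional speedup needed to get from current_fps to target_fps."""
    return target_fps / current_fps - 1.0


def remaining_speedup(current_fps, target_fps, gain):
    """Speedup still needed after applying a fractional gain to current_fps."""
    return required_speedup(current_fps * (1.0 + gain), target_fps)


TARGET_FPS = 25.0   # assumption: 21 fps measured + 4 fps missing
MEASURED_FPS = 21.0
DECODER_GAIN = 0.05  # assumption: upper end of the 4-5% mentioned above

print(f"shortfall:         {shortfall(MEASURED_FPS, TARGET_FPS):.1%}")
print(f"speedup needed:    {required_speedup(MEASURED_FPS, TARGET_FPS):.1%}")
print(f"still needed after gain: "
      f"{remaining_speedup(MEASURED_FPS, TARGET_FPS, DECODER_GAIN):.1%}")
```

Note that "slower than needed" and "speedup needed" are not the same percentage: with these assumed inputs the shortfall is about 16%, the speedup needed is about 19%, and roughly 13% would still be missing after a 5% gain. The figures shift if the real target rate or the measured fps differ.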
Anyway, I'll post the remaining patches and new numbers soon.
Here it runs at 21 fps with RadeonHD 2.4+FE+RadeonHD6870. There are 4 fps missing. Note that there is room to optimize the RadeonHD 2.4 driver. I think it is possible in a few months.
No. It has already been optimized as far as it can go by Hans. This is the best the driver can achieve in this domain.
Correct. While I can never say that it's perfectly optimal, the composited video code is about as optimal as I can make it.
More importantly, if you look at the benchmarks then the driver overhead is tiny compared to everything else. You won't be getting any extra fps from driver optimizations.
I checked the Prometheus trailer again. It runs here at 19 fps, not 21 fps. My fault. I read somewhere that there is room for optimization in graphics.library and the RadeonHD driver. Maybe I am wrong.