Download and extract the above archive and run the rgbconv program. Do not do anything else on the computer while the test is running, as that will disturb the results. Then post the results in this thread.
Please try it using different WB depths (i.e. 16/32-bit, not 8-bit though). Also make sure that there is free video memory before running the test (e.g. by closing all unnecessary windows).
So far it seems that the assembler routines are somewhat faster on the Sam440ep while the difference is much less noticeable on the µA1.
deleted old results to make it easier on your eyes
Edited by Slayer on 2009/4/17 10:11:01
~Yes I am a Kiwi, No, I did not appear as an extra in 'Lord of the Rings'~ 1x AmigaOne X5000 2.0GHz 2gM RadeonR9280X AOS4.x 3x AmigaOne X1000 1.8GHz 2gM RadeonHD7970 AOS4.x
The size of the screen shouldn't matter to the results.
These are results from my Sam440ep:
16-bit R5G6B5:
BltBitMap... 12 fps 8 s 137292 us
16-bit lookup table... 17 fps 6 s 33373 us
RGB565 optimised C-code (LUT version)... 19 fps 5 s 235484 us
RGB565 optimised C-code (macro version)... 18 fps 5 s 619877 us
RGB565 optimised ASM-code... 18 fps 5 s 618827 us
16-bit LUT ASM-code... 21 fps 4 s 807921 us

32-bit A8R8G8B8:
BltBitMap... 12 fps 8 s 34220 us
ARGB8888 optimised ASM-code... 13 fps 7 s 649436 us
BTW it appears that the 32-bit ASM routine is buggy (the checksum isn't the same as for BltBitMap). I'll upload a fixed version immediately.
I've uploaded a fixed version of the test now (the download URL is the same). I also removed some of the debug output because it was unnecessary and taking up space.
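For anyone wanting to do the same kind of sanity check on their own routine, here is a minimal sketch of comparing a converter's output against a reference buffer by checksum. The checksum algorithm rgbconv actually uses is not stated in this thread, so `buffer_checksum` (a simple rotate-and-add) and `outputs_match` are hypothetical names and an assumed method, not the test's real code.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical checksum (NOT necessarily what rgbconv uses):
 * rotate the running sum left by one bit, then add the next byte.
 * The rotation makes the result sensitive to byte order. */
static uint32_t buffer_checksum(const uint8_t *buf, size_t size)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < size; i++)
        sum = ((sum << 1) | (sum >> 31)) + buf[i];
    return sum;
}

/* Returns 1 when the reference output (e.g. from BltBitMap) and the
 * output of the routine under test have the same checksum, else 0. */
static int outputs_match(const uint8_t *ref, const uint8_t *test, size_t size)
{
    return buffer_checksum(ref, size) == buffer_checksum(test, size);
}
```

A matching checksum only suggests the two buffers are identical; a byte-for-byte compare (`memcmp`) is the stricter check if both buffers are still in memory.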
BltBitMap... 12 fps 8 s 403757 us Size: 768000 Checksum: B12566E8
16-bit lookup table... 22 fps 4 s 516908 us Size: 768000 Checksum: B12566E8
RGB565 optimised C-code (LUT version)... 23 fps 4 s 294906 us Size: 768000 Checksum: B12566E8
RGB565 optimised C-code (macro version)... 23 fps 4 s 346118 us Size: 768000 Checksum: B12566E8
RGB565 optimised ASM-code... 23 fps 4 s 335930 us Size: 768000 Checksum: B12566E8
000BB800 / 000BB800
16-bit LUT ASM-code... 23 fps 4 s 422492 us Size: 768000 Checksum: B12566E8
Radeon M9: 1680x1050 ARGB32
BltBitMap... 12 fps 8 s 305781 us Size: 768000 Checksum: 7E91CCA5
ARGB8888 optimised ASM-code... 13 fps 7 s 643724 us Size: 768000 Checksum: 7E91CCA5
000FA000 / 000FA000
It's a little weird that for you the LUT versions are slower than the non-LUT ones. For me the LUT versions are much faster. Are you both also using the 667MHz Sam440ep?
Please specify the CPU as well as the motherboard (µA1, Sam440ep, A1-XE, ...) when reporting test results.
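To make the LUT vs. macro distinction concrete, here is a rough sketch of the two approaches for packing pixels into RGB565. This is not the code from the rgbconv archive: the source pixel layout (packed 8-bit R, G, B bytes) and all names (`RGB888_TO_565`, `init_luts`, `convert_macro`, `convert_lut`) are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Macro version: pack 8-bit R, G, B into RGB565 with masks and shifts. */
#define RGB888_TO_565(r, g, b) \
    ((uint16_t)((((r) & 0xF8) << 8) | (((g) & 0xFC) << 3) | ((b) >> 3)))

/* LUT version: one 256-entry table per channel holding that channel's
 * already-shifted contribution, so the inner loop is three loads and
 * two ORs instead of masks and shifts. */
static uint16_t lut_r[256], lut_g[256], lut_b[256];

static void init_luts(void)
{
    for (int i = 0; i < 256; i++) {
        lut_r[i] = (uint16_t)((i & 0xF8) << 8);
        lut_g[i] = (uint16_t)((i & 0xFC) << 3);
        lut_b[i] = (uint16_t)(i >> 3);
    }
}

/* src is assumed to hold packed R,G,B bytes (3 per pixel);
 * dst receives one 16-bit value per pixel in host byte order. */
static void convert_macro(const uint8_t *src, uint16_t *dst, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++, src += 3)
        dst[i] = RGB888_TO_565(src[0], src[1], src[2]);
}

static void convert_lut(const uint8_t *src, uint16_t *dst, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++, src += 3)
        dst[i] = (uint16_t)(lut_r[src[0]] | lut_g[src[1]] | lut_b[src[2]]);
}
```

Which one wins depends on whether the extra table loads are cheaper than the shifts and masks on a given CPU and cache, which may be why results differ between machines.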
Yip, 667MHz, with a complete reset between shell/rgbconv executions
edit:
ran it again 16bit only
BltBitMap... 12 fps 8 s 396247 us Size: 768000 Checksum: E18B2970
16-bit lookup table... 23 fps 4 s 320143 us Size: 768000 Checksum: E18B2970
RGB565 optimised C-code (LUT version)... 23 fps 4 s 297795 us Size: 768000 Checksum: E18B2970
RGB565 optimised C-code (macro version)... 23 fps 4 s 346315 us Size: 768000 Checksum: E18B2970
RGB565 optimised ASM-code... 23 fps 4 s 334482 us Size: 768000 Checksum: E18B2970
000BB800 / 000BB800
16-bit LUT ASM-code... 23 fps 4 s 278232 us Size: 768000 Checksum: E18B2970