@Hans
First of all, thanks for the great compilation and for the script, which works very well.
Here are my results under Qemu as well; for the hardware, please see my signature.
CPUBench v1.0
RageMem:
RAGEMEM v0.37 - compiled 11/06/2010
CPU: Motorola MPC 7447/7457 Apollo 1.2 @ 1533 Mhz
Caches Sizes: L1: 32 KB - L2: 512 KB - L3: none
Cache Line: 32
---> CPU <---
MAX MIPS:  9845
---> L1 <---
READ32:  5848 MB/Sec
READ64:  8829 MB/Sec
WRITE32: 6469 MB/Sec
WRITE64: 10205 MB/Sec
---> L2 <---
READ32:  5812 MB/Sec
READ64:  8822 MB/Sec
WRITE32: 6452 MB/Sec
WRITE64: 10234 MB/Sec
---> RAM <---
READ32:  5172 MB/Sec
READ64:  7506 MB/Sec
WRITE32: 5408 MB/Sec
WRITE64: 10191 MB/Sec
WRITE: 2943 MB/Sec (Tricky)
---> VIDEO BUS <---
READ:  8767 MB/Sec
WRITE: 10102 MB/Sec
SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
 1 K Element :  3091.48 MB/sec
 2 K Element :  3103.53 MB/sec
 4 K Element :  3110.91 MB/sec
 8 K Element :  3102.89 MB/sec
16 K Element :  3110.96 MB/sec
32 K Element :  3109.41 MB/sec
BogoMIPS:
Calibrating delay loop..
Ok - 656.00 BogoMips
Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute
Execution starts, 50000000 runs through Dhrystone
Execution ends
Duration in seconds:                        28.1 
Microseconds for one run through Dhrystone: 0.6 
Dhrystones per Second:                      1778409.9 
Dhrystone MIPS (DMIPS)                      1012 
Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times
Unsorted array: 
Total time taken by CPU: 2.44
Sieve:
   Sieve of Eratosthenes (Scaled to 10 Iterations)
   Version 1.2, 03 April 1992
   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.000684   0.000684  2425.1
      10000       2261        19997   0.000835   0.000879  2311.8
      20000       4202        39989   0.001669   0.001758  2341.8
      40000       7836        79999   0.003339   0.003515  2370.4
      80000      14683       160001   0.006677   0.007813  2157.9
     160000      27607       319993   0.013354   0.015624  2181.8
     320000      52073       639997   0.026709   0.031252  2204.1
     640000      98609      1279997   0.053418   0.056248  2473.5
    1280000     187133      2559989   0.106836   0.125008  2246.8
    2560000     356243      5119997   0.213671   0.250015  2267.0
    5120000     679460     10239989   0.427343   0.499954  2286.8
   10240000    1299068     20479999   0.854686   1.399994  1646.7
   20480000    2488465     40960001   1.709372   4.400024  1056.1
   Relative to 10 Iterations and the 8191 Array Size:
   Average RunTime = 0.000865 (sec)
   High  MIPS      =   2473.5
   Low   MIPS      =   1056.1
Whetstone:
Please wait...
Loops: 50000 Iterations: 1 Duration: 9.6 seconds
C Converted Double Precision Whetstones: 523.2 MIPS
All tests done.
Now, can I please get a 32-bit driver for AmigaOs4.1 so we can move on to the GPU benchmark?

Don't take that seriously, it's just a joke...
Thanks again for the script and merging the CPU benchmarks.
I am pleasantly surprised that these results are so different. It shows once again how strong this ARM CPU is, but also that it has weaknesses in some areas.
---> VIDEO BUS <---
READ:  8767 MB/Sec
WRITE: 10102 MB/Sec
That impressed me the most. How does that come about? Maybe it is because the main memory and the graphics card memory are used as shared memory.
From the benchmark you can imagine how well AmigaOs4.1 runs under Qemu on my machine. That is why I chose this emulation: it gives me a good feeling to be able to use AmigaOs4.1 under it. Thanks to Qemu, AmigaOs4.1 runs as my second operating system, albeit with some limitations.
Numbers can be manipulated; with videos that becomes much harder. If you like, I can make a video about it again.
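
As a quick plausibility check on the Dhrystone figures above: the DMIPS value is conventionally the Dhrystones-per-second score divided by 1757, the score of the VAX 11/780 reference machine. Here is a minimal Python sketch, assuming this build of the benchmark follows that convention (the function and constant names are my own, not part of CPUBench):

```python
# Dhrystones per second of the VAX 11/780, which is defined as 1 DMIPS.
# Assumption: the benchmark build used above applies this usual divisor.
VAX_DHRYSTONES_PER_SECOND = 1757


def dmips(dhrystones_per_second):
    """Convert a Dhrystones-per-second score into Dhrystone MIPS."""
    return dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND


def duration_seconds(runs, dhrystones_per_second):
    """Total run time implied by the number of runs and the score."""
    return runs / dhrystones_per_second


def microseconds_per_run(dhrystones_per_second):
    """Time for a single pass through Dhrystone, in microseconds."""
    return 1_000_000 / dhrystones_per_second


# Values copied from the Dhrystone output in this post.
runs = 50_000_000
score = 1778409.9

print(round(dmips(score)))                      # reported: 1012
print(round(duration_seconds(runs, score), 1))  # reported: 28.1
print(round(microseconds_per_run(score), 1))    # reported: 0.6
```

All three derived values match what the benchmark printed, so at least the Dhrystone block is consistent with itself.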