QEMU Emulation vs Hardware CPU Benchmarks


Hans

Posted on: 2023/5/24 13:18 #1

Home away from home

I'm interested to see the raw CPU performance of QEMU on various machines vs real AmigaOS 4.x hardware. So, I've put together a small script with a set of CPU benchmarks.

EDIT: You can download the benchmarks here.

Please post the results here (deleting the unneeded extra output in the Dhrystone & Quicksort tests), together with your system's specs.

Hans

Edited by Hans on 2023/5/28 2:40:03

Join Kea Campus' Amiga Corner and support Amiga content creation
https://keasigmadelta.com/ - see more of my work

Hans

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 13:21 #2

Home away from home

Here are my results:
System: QEMU 8.0.0 running on a Core i7 @ 2.60GHz

CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Motorola MPC 7447/7457 Apollo 1.1 @ 1533 Mhz
Caches Sizes: L1: 32 KB - L2: 512 KB - L3: none
Cache Line: 32

---> CPU <---
MAX MIPS: 3119

---> L1 <---
READ32: 2034 MB/Sec
READ64: 2930 MB/Sec
WRITE32: 1881 MB/Sec
WRITE64: 3583 MB/Sec

---> L2 <---
READ32: 1950 MB/Sec
READ64: 2689 MB/Sec
WRITE32: 1602 MB/Sec
WRITE64: 3539 MB/Sec

---> RAM <---
READ32: 1433 MB/Sec
READ64: 2280 MB/Sec
WRITE32: 348 MB/Sec
WRITE64: 2523 MB/Sec
WRITE: 789 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 2105 MB/Sec
WRITE: 1773 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 1007.81 MB/sec
2 K Element : 998.39 MB/sec
4 K Element : 954.54 MB/sec
8 K Element : 982.14 MB/sec
16 K Element : 968.40 MB/sec
32 K Element : 981.72 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 240.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 100.3
Microseconds for one run through Dhrystone: 2.0
Dhrystones per Second: 498517.2
Dhrystone MIPS (DMIPS) 283

Quicksort:
Total time taken by CPU: 7.84

Sieve:

Sieve of Eratosthenes (Scaled to 10 Iterations)
Version 1.2, 03 April 1992

Array Size Number Last Prime Linear RunTime MIPS
(Bytes) of Primes Time(sec) (Sec)
8191 1899 16381 0.002365 0.002365 701.0
10000 2261 19997 0.002887 0.003128 649.5
20000 4202 39989 0.005775 0.005875 700.7
40000 7836 79999 0.011550 0.011749 709.2
80000 14683 160001 0.023100 0.027161 620.7
160000 27607 319993 0.046199 0.054932 620.5
320000 52073 639997 0.092398 0.114746 600.3
640000 98609 1279997 0.184797 0.244141 569.9
1280000 187133 2559989 0.369594 0.786133 357.3
2560000 356243 5119997 0.739188 1.230469 460.6
5120000 679460 10239989 1.478376 2.656250 430.4
10240000 1299068 20479999 2.956751 8.632812 267.0
20480000 2488465 40960001 5.913503 44.218750 105.1

Relative to 10 Iterations and the 8191 Array Size:
Average RunTime = 0.004554 (sec)
High MIPS = 709.2
Low MIPS = 105.1

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 43.7 seconds
C Converted Double Precision Whetstones: 114.4 MIPS

All tests done.
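
For anyone comparing the Dhrystone lines across posts: the DMIPS figure is conventionally the Dhrystones-per-second score divided by 1757, the score of the VAX 11/780 reference machine. Here is a minimal Python sketch of that conversion. The truncation to a whole number is an assumption inferred from the figures posted in this thread, not taken from the benchmark source, and the function name `dmips` is just illustrative.

```python
# Conventional Dhrystone score of the VAX 11/780 reference machine (1 DMIPS).
VAX_11_780_DHRYSTONES_PER_SECOND = 1757


def dmips(dhrystones_per_second: float) -> int:
    """Convert a Dhrystones/second score to DMIPS.

    Truncating to an integer is an assumption that matches the
    outputs posted in this thread.
    """
    return int(dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND)


# Score posted above for QEMU 8.0.0 on a Core i7.
print(dmips(498517.2))
```

This reproduces the "Dhrystone MIPS (DMIPS)" line from the "Dhrystones per Second" line, so only one of the two is needed when tabulating results.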


Hans

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 13:24 #3

Home away from home

Machine: A1222

CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Freescale P10XX (E500 core) 5.1 @ 1199 Mhz
Caches Sizes: L1: 32 KB - L2: 256 KB - L3: none
Cache Line: 64

---> CPU <---
MAX MIPS: 2398

---> L1 <---
READ32: 4402 MB/Sec
READ64: 330 MB/Sec
WRITE32: 4411 MB/Sec
WRITE64: 322 MB/Sec

---> L2 <---
READ32: 630 MB/Sec
READ64: 182 MB/Sec
WRITE32: 969 MB/Sec
WRITE64: 321 MB/Sec

---> RAM <---
READ32: 606 MB/Sec
READ64: 181 MB/Sec
WRITE32: 924 MB/Sec
WRITE64: 317 MB/Sec
WRITE: 317 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 7 MB/Sec
WRITE: 158 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 2210.71 MB/sec
2 K Element : 2226.00 MB/sec
4 K Element : 2221.14 MB/sec
8 K Element : 2207.53 MB/sec
16 K Element : 565.15 MB/sec
32 K Element : 456.96 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 192.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 41.6
Microseconds for one run through Dhrystone: 0.8
Dhrystones per Second: 1201938.1
Dhrystone MIPS (DMIPS) 684

Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times
Total time taken by CPU: 4.62

Sieve:

Sieve of Eratosthenes (Scaled to 10 Iterations)
Version 1.2, 03 April 1992

Array Size Number Last Prime Linear RunTime MIPS
(Bytes) of Primes Time(sec) (Sec)
8191 1899 16381 0.001367 0.001367 1212.7
10000 2261 19997 0.001669 0.001758 1155.8
20000 4202 39989 0.003338 0.003515 1170.9
40000 7836 79999 0.006677 0.013672 609.5
80000 14683 160001 0.013353 0.045313 372.1
160000 27607 319993 0.026706 0.114064 298.8
320000 52073 639997 0.053412 0.246873 279.0
640000 98609 1279997 0.106824 0.524998 265.0
1280000 187133 2559989 0.213648 1.112499 252.5
2560000 356243 5119997 0.427296 2.900009 195.4
5120000 679460 10239989 0.854593 6.949997 164.5
10240000 1299068 20479999 1.709186 15.399933 149.7
20480000 2488465 40960001 3.418371 33.999939 136.7

Relative to 10 Iterations and the 8191 Array Size:
Average RunTime = 0.006461 (sec)
High MIPS = 1212.7
Low MIPS = 136.7

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 414.5 seconds
C Converted Double Precision Whetstones: 12.1 MIPS

All tests done.


Hans

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 13:26 #4

Home away from home

Machine: AmigaOne X5000/40

CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Freescale P5040 (E5500 core) 1.2 @ 2200 Mhz
Caches Sizes: L1: 32 KB - L2: 512 KB - L3: none
Cache Line: 64

---> CPU <---
MAX MIPS: 4397

---> L1 <---
READ32: 8183 MB/Sec
READ64: 16342 MB/Sec
WRITE32: 8184 MB/Sec
WRITE64: 16346 MB/Sec

---> L2 <---
READ32: 4660 MB/Sec
READ64: 8378 MB/Sec
WRITE32: 5494 MB/Sec
WRITE64: 9634 MB/Sec

---> RAM <---
READ32: 615 MB/Sec
READ64: 1055 MB/Sec
WRITE32: 1433 MB/Sec
WRITE64: 1438 MB/Sec
WRITE: 2455 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 21 MB/Sec
WRITE: 539 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 4129.14 MB/sec
2 K Element : 4145.09 MB/sec
4 K Element : 4150.98 MB/sec
8 K Element : 4155.51 MB/sec
16 K Element : 3730.50 MB/sec
32 K Element : 3625.49 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 352.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 23.1
Microseconds for one run through Dhrystone: 0.5
Dhrystones per Second: 2165278.0
Dhrystone MIPS (DMIPS) 1232

Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times
Total time taken by CPU: 2.47

Sieve:

Sieve of Eratosthenes (Scaled to 10 Iterations)
Version 1.2, 03 April 1992

Array Size Number Last Prime Linear RunTime MIPS
(Bytes) of Primes Time(sec) (Sec)
8191 1899 16381 0.000781 0.000781 2121.7
10000 2261 19997 0.000954 0.000976 2081.1
20000 4202 39989 0.001908 0.001757 2342.6
40000 7836 79999 0.003816 0.003517 2369.6
80000 14683 160001 0.007632 0.007811 2158.5
160000 27607 319993 0.015264 0.017185 1983.5
320000 52073 639997 0.030528 0.034370 2004.1
640000 98609 1279997 0.061056 0.100021 1391.0
1280000 187133 2559989 0.122111 0.312500 898.8
2560000 356243 5119997 0.244223 1.224976 462.7
5120000 679460 10239989 0.488445 3.999939 285.8
10240000 1299068 20479999 0.976890 9.799805 235.2
20480000 2488465 40960001 1.953781 21.799316 213.2

Relative to 10 Iterations and the 8191 Array Size:
Average RunTime = 0.002749 (sec)
High MIPS = 2369.6
Low MIPS = 213.2

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 7.6 seconds
C Converted Double Precision Whetstones: 654.5 MIPS

All tests done.


eliyahu

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 14:04 #5

Not too shy to talk

@Hans

Great thread! My A1222 showed virtually identical results, except for the Whetstone results:


Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 59.5 seconds
C Converted Double Precision Whetstones: 84.1 MIPS

One other point is that the RageMem numbers may be off; the video bandwidth numbers, for example, are absurdly low. I suspect RageMem may be hitting the FPU emulation layer, in which case the numbers are both correct and incorrect: correct insofar as they show the impact of floating-point emulation, but incorrect as a reflection of actual hardware bandwidth.

I'll go ahead and try to verify or disprove that hypothesis shortly.

-- eliyahu

"Physical reality is consistent with universal laws. When the laws do not operate, there is no reality. All of this is unreal."

joerg

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 14:22 #6

Home away from home

@eliyahu
Quote:

One other point is that the RageMem numbers may be off; the video bandwidth numbers, for example, are absurdly low.

No, CPU access to VRAM is extremely slow on all systems. Both the read and write results of the A1222 are about 1/3 of the X5000/40 ones.
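
The "about 1/3" figure can be checked directly from the VIDEO BUS lines posted earlier in the thread. This is a small Python sketch of the arithmetic; the dictionary layout and the `ratio` helper are illustrative only, and the numbers are copied from the A1222 results in post #3 and the P5040 results in post #4.

```python
# VIDEO BUS figures in MB/s, copied from the RageMem outputs in posts #3 and #4.
video_bus = {
    "A1222": {"read": 7, "write": 158},
    "X5000/40": {"read": 21, "write": 539},
}


def ratio(metric: str) -> float:
    """A1222 bandwidth as a fraction of the X5000/40 bandwidth."""
    return video_bus["A1222"][metric] / video_bus["X5000/40"][metric]


for metric in ("read", "write"):
    print(f"{metric}: {ratio(metric):.2f}")
```

Both ratios come out close to one third (0.33 for reads, 0.29 for writes), which is consistent with the slow video bus figures being a hardware characteristic rather than an emulation artifact.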

eliyahu

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 14:27 #7

Not too shy to talk

@Hans

As a quick test I added RageMem to the LTE blacklist, and wow, was it slow to finish. That alone tells me that FPU emulation is impacting the numbers; with LTE disabled, the results were substantially worse, as expected:


7.Scratch:CPUBench/RageMem> ragemem

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Freescale P10XX (E500 core) 5.1 @ 1199 Mhz
Caches Sizes: L1: 32 KB - L2: 256 KB - L3: none
Cache Line: 64

---> CPU <---
MAX MIPS:  2399

---> L1 <---
READ32:  4411 MB/Sec
READ64:  22 MB/Sec
WRITE32: 4416 MB/Sec
WRITE64: 22 MB/Sec

---> L2 <---
READ32:  601 MB/Sec
READ64:  21 MB/Sec
WRITE32: 930 MB/Sec
WRITE64: 23 MB/Sec

---> RAM <---
READ32:  585 MB/Sec
READ64:  21 MB/Sec
WRITE32: 889 MB/Sec
WRITE64: 23 MB/Sec
WRITE: 23 MB/Sec (Tricky)

---> VIDEO BUS <---
READ:  5 MB/Sec
WRITE: 23 MB/Sec

So I wonder if we can get Crisot to build us a version using unsigned integers rather than floats, if possible, for an A1222 test. That should be possible since for some CPUs he already does that. This doesn't matter for the other benchmarks, since they are supposed to measure workload performance, and if FPU emulation kills the result, so be it. But for something that should be an indicator of relative HW bandwidth, we should take the emulation issue out of the equation if possible for a more accurate comparison. IMO, of course.
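
To put a number on the emulation cost, the 64-bit figures from the two A1222 runs can be compared line by line. This is a rough Python sketch; the "LTE on" values come from the A1222 results in post #3, which were taken on a different machine, so treat the factors as approximate. The `results` table and `slowdown` helper are illustrative names only.

```python
# 64-bit RageMem figures in MB/s for the A1222, copied from this thread.
# "lte_on" is from the A1222 results in post #3 (a different machine, so
# the comparison is approximate); "lte_off" is the blacklisted run above.
results = {
    "L1 READ64": {"lte_on": 330, "lte_off": 22},
    "L1 WRITE64": {"lte_on": 322, "lte_off": 22},
    "L2 READ64": {"lte_on": 182, "lte_off": 21},
    "L2 WRITE64": {"lte_on": 321, "lte_off": 23},
    "RAM READ64": {"lte_on": 181, "lte_off": 21},
    "RAM WRITE64": {"lte_on": 317, "lte_off": 23},
}


def slowdown(test: str) -> float:
    """How many times slower the test ran with LTE disabled."""
    row = results[test]
    return row["lte_on"] / row["lte_off"]


for test in results:
    print(f"{test}: {slowdown(test):.1f}x slower with LTE disabled")
```

The 32-bit lines barely move between the two runs (for example L1 READ32 is 4402 vs 4411 MB/Sec), which is why only the 64-bit tests are listed here.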

-- eliyahu


eliyahu

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 14:49 #8

Not too shy to talk

@Hans

... and lastly, here are the full results from my X5000/20 system:


CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Freescale P5020 (E5500 core) 1.2 @ 1995 Mhz
Caches Sizes: L1: 32 KB - L2: 512 KB - L3: none
Cache Line: 64

---> CPU <---
MAX MIPS:  3989

---> L1 <---
READ32:  7537 MB/Sec
READ64:  15050 MB/Sec
WRITE32: 7538 MB/Sec
WRITE64: 15054 MB/Sec

---> L2 <---
READ32:  4288 MB/Sec
READ64:  7726 MB/Sec
WRITE32: 5022 MB/Sec
WRITE64: 8825 MB/Sec

---> RAM <---
READ32:  677 MB/Sec
READ64:  1228 MB/Sec
WRITE32: 1469 MB/Sec
WRITE64: 1473 MB/Sec
WRITE: 2323 MB/Sec (Tricky)

---> VIDEO BUS <---
READ:  29 MB/Sec
WRITE: 541 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
 1 K Element :  3729.67 MB/sec
 2 K Element :  3755.91 MB/sec
 4 K Element :  3763.17 MB/sec
 8 K Element :  3748.58 MB/sec
16 K Element :  3380.38 MB/sec
32 K Element :  3277.59 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 320.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Final values of the variables used in the benchmark:

Int_Glob:            5
        should be:   5
Bool_Glob:           1
        should be:   1
Ch_1_Glob:           A
        should be:   A
Ch_2_Glob:           B
        should be:   B
Arr_1_Glob[8]:       7
        should be:   7
Arr_2_Glob[8][7]:    50000010
        should be:   Number_Of_Runs + 10
Ptr_Glob->
  Ptr_Comp:          1434992256
        should be:   (implementation-dependent)
  Discr:             0
        should be:   0
  Enum_Comp:         2
        should be:   2
  Int_Comp:          17
        should be:   17
  Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
        should be:   DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
  Ptr_Comp:          1434992256
        should be:   (implementation-dependent), same as above
  Discr:             0
        should be:   0
  Enum_Comp:         1
        should be:   1
  Int_Comp:          18
        should be:   18
  Str_Comp:          DHRYSTONE PROGRAM, SOME STRING
        should be:   DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc:           5
        should be:   5
Int_2_Loc:           13
        should be:   13
Int_3_Loc:           7
        should be:   7
Enum_Loc:            1
        should be:   1
Str_1_Loc:           DHRYSTONE PROGRAM, 1'ST STRING
        should be:   DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc:           DHRYSTONE PROGRAM, 2'ND STRING
        should be:   DHRYSTONE PROGRAM, 2'ND STRING

Duration in seconds:                        25.5
Microseconds for one run through Dhrystone: 0.5
Dhrystones per Second:                      1959940.6
Dhrystone MIPS (DMIPS)                      1115

Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times

Total time taken by CPU: 2.73

Sieve:

   Sieve of Eratosthenes (Scaled to 10 Iterations)
   Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.000877   0.000877  1889.7
      10000       2261        19997   0.001071   0.001068  1902.2
      20000       4202        39989   0.002142   0.002136  1926.9
      40000       7836        79999   0.004285   0.004349  1916.2
      80000      14683       160001   0.008569   0.008545  1973.0
     160000      27607       319993   0.017138   0.018921  1801.6
     320000      52073       639997   0.034277   0.040283  1710.0
     640000      98609      1279997   0.068554   0.100098  1389.9
    1280000     187133      2559989   0.137107   0.275879  1018.1
    2560000     356243      5119997   0.274215   1.123047   504.7
    5120000     679460     10239989   0.548430   3.896484   293.4
   10240000    1299068     20479999   1.096859   9.609375   239.9
   20480000    2488465     40960001   2.193719  21.796875   213.2

   Relative to 10 Iterations and the 8191 Array Size:
   Average RunTime = 0.002744 (sec)
   High  MIPS      =   1973.0
   Low   MIPS      =    213.2

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 8.5 seconds
C Converted Double Precision Whetstones: 591.1 MIPS

Hope this helps!

-- eliyahu


joerg

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 15:04 #9

Home away from home

@eliyahu
Quote:

So I wonder if we can get Crisot to build us a version using unsigned integers rather than floats, if possible, for an A1222 test.

Not possible, since 32-bit CPUs don't support 64-bit integer loads/stores. 64-bit accesses are only supported for doubles (and 128-bit accesses for vectors on some CPUs).
64-bit C integer loads/stores have to be executed as two independent 32-bit assembler loads/stores, and you already have the results for 32-bit integer accesses in the READ32/WRITE32 results.
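
As a rough illustration of the point about 64-bit integers, the Python sketch below shows that one 64-bit big-endian store and two consecutive 32-bit big-endian stores leave the same bytes in memory. It only models the data layout; it does not model the PowerPC instructions themselves, and the helper name `split_u64` is made up for this example.

```python
import struct


def split_u64(value: int) -> tuple[int, int]:
    """Split a 64-bit unsigned integer into (high, low) 32-bit words."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


value = 0x0123456789ABCDEF
high, low = split_u64(value)

# One 64-bit big-endian store versus two consecutive 32-bit big-endian stores.
as_one_store = struct.pack(">Q", value)
as_two_stores = struct.pack(">I", high) + struct.pack(">I", low)

print(as_one_store == as_two_stores)
```

Because the bytes end up identical, an integer-only 64-bit test on a 32-bit core would just repeat what READ32/WRITE32 already measure.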

ddni

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 15:12 #10

Just can't stay away

A1-X1000

CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: P.A. Semi PWRficient PA6T-1682M B1 @ 1800 Mhz
Caches Sizes: L1: 64 KB - L2: 2048 KB - L3: none
Cache Line: 64

---> CPU <---
MAX MIPS: 3084

---> L1 <---
READ32: 6851 MB/Sec
READ64: 13610 MB/Sec
WRITE32: 6851 MB/Sec
WRITE64: 13688 MB/Sec

---> L2 <---
READ32: 3273 MB/Sec
READ64: 5007 MB/Sec
WRITE32: 2532 MB/Sec
WRITE64: 4093 MB/Sec

---> RAM <---
READ32: 2842 MB/Sec
READ64: 4006 MB/Sec
WRITE32: 2732 MB/Sec
WRITE64: 3392 MB/Sec
WRITE: 350 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 132 MB/Sec
WRITE: 161 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 1763.99 MB/sec
2 K Element : 1766.06 MB/sec
4 K Element : 1768.30 MB/sec
8 K Element : 1769.26 MB/sec
16 K Element : 1753.39 MB/sec
32 K Element : 1532.57 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 192.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 38.8
Microseconds for one run through Dhrystone: 0.8
Dhrystones per Second: 1289483.0
Dhrystone MIPS (DMIPS) 733

Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times
Total time taken by CPU: 4.09

Sieve:

Sieve of Eratosthenes (Scaled to 10 Iterations)
Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.001163   0.001163  1425.0
      10000       2261        19997   0.001420   0.001469  1383.4
      20000       4202        39989   0.002841   0.002747  1498.7
      40000       7836        79999   0.005682   0.005875  1418.5
      80000      14683       160001   0.011364   0.012360  1364.1
     160000      27607       319993   0.022727   0.031433  1084.4
     320000      52073       639997   0.045454   0.065308  1054.7
     640000      98609      1279997   0.090908   0.137939  1008.6
    1280000     187133      2559989   0.181816   0.275879  1018.1
    2560000     356243      5119997   0.363633   0.576172   983.7
    5120000     679460     10239989   0.727265   1.591797   718.2
   10240000    1299068     20479999   1.454531   3.710938   621.2
   20480000    2488465     40960001   2.909062   8.007812   580.3

Relative to 10 Iterations and the 8191 Array Size:
Average RunTime = 0.001795 (sec)
High MIPS = 1498.7
Low MIPS = 580.3

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 8.2 seconds
C Converted Double Precision Whetstones: 611.6 MIPS

All tests done.

AmigaOne X1000.
Radeon RX550

http://www.tinylife.org.uk/

tekmage

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 16:18 #11

Just popping in

X5000/20:

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Freescale P5020 (E5500 core) 1.2 @ 1995 Mhz
Caches Sizes: L1: 32 KB - L2: 512 KB - L3: none
Cache Line: 64

---> CPU <---
MAX MIPS: 3988

---> L1 <---
READ32: 7532 MB/Sec
READ64: 15041 MB/Sec
WRITE32: 7533 MB/Sec
WRITE64: 15045 MB/Sec

---> L2 <---
READ32: 4286 MB/Sec
READ64: 7708 MB/Sec
WRITE32: 5020 MB/Sec
WRITE64: 8813 MB/Sec

---> RAM <---
READ32: 732 MB/Sec
READ64: 1341 MB/Sec
WRITE32: 1748 MB/Sec
WRITE64: 1754 MB/Sec
WRITE: 2320 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 22 MB/Sec
WRITE: 535 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 3720.51 MB/sec
2 K Element : 3696.03 MB/sec
4 K Element : 3682.75 MB/sec
8 K Element : 3761.76 MB/sec
16 K Element : 3316.98 MB/sec
32 K Element : 3272.27 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 320.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Final values of the variables used in the benchmark:

Int_Glob: 5
should be: 5
Bool_Glob: 1
should be: 1
Ch_1_Glob: A
should be: A
Ch_2_Glob: B
should be: B
Arr_1_Glob[8]: 7
should be: 7
Arr_2_Glob[8][7]: 50000010
should be: Number_Of_Runs + 10
Ptr_Glob->
Ptr_Comp: 1405485704
should be: (implementation-dependent)
Discr: 0
should be: 0
Enum_Comp: 2
should be: 2
Int_Comp: 17
should be: 17
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
Ptr_Comp: 1405485704
should be: (implementation-dependent), same as above
Discr: 0
should be: 0
Enum_Comp: 1
should be: 1
Int_Comp: 18
should be: 18
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: 5
should be: 5
Int_2_Loc: 13
should be: 13
Int_3_Loc: 7
should be: 7
Enum_Loc: 1
should be: 1
Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING
should be: DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING
should be: DHRYSTONE PROGRAM, 2'ND STRING

Duration in seconds: 26.0
Microseconds for one run through Dhrystone: 0.5
Dhrystones per Second: 1923210.4
Dhrystone MIPS (DMIPS) 1094

Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times
Unsorted array:
<clip>
Sorted array:
<clip>
Total time taken by CPU: 2.76

Sieve:

Sieve of Eratosthenes (Scaled to 10 Iterations)
Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.000877   0.000877  1889.7
      10000       2261        19997   0.001071   0.001068  1902.2
      20000       4202        39989   0.002142   0.001945  2115.8
      40000       7836        79999   0.004285   0.004730  1761.7
      80000      14683       160001   0.008569   0.009308  1811.3
     160000      27607       319993   0.017138   0.018921  1801.6
     320000      52073       639997   0.034277   0.040283  1710.0
     640000      98609      1279997   0.068554   0.100098  1389.9
    1280000     187133      2559989   0.137107   0.275879  1018.1
    2560000     356243      5119997   0.274215   1.098633   515.9
    5120000     679460     10239989   0.548430   3.554688   321.6
   10240000    1299068     20479999   1.096859   8.789062   262.3
   20480000    2488465     40960001   2.193719  19.804688   234.6

Relative to 10 Iterations and the 8191 Array Size:
Average RunTime = 0.002590 (sec)
High MIPS = 2115.8
Low MIPS = 234.6

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 8.5 seconds
C Converted Double Precision Whetstones: 590.7 MIPS

All tests done.

derfs

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 16:49 #12

Not too shy to talk

@Hans

System: QEMU 8.0.0 Windows 10 x64 Ryzen 5600


CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Motorola MPC 7447/7457 Apollo 1.2 @ 1533 Mhz
Caches Sizes: L1: 32 KB - L2: 512 KB - L3: none
Cache Line: 32

---> CPU <---
MAX MIPS: 7972

---> L1 <---
READ32: 6509 MB/Sec
READ64: 11550 MB/Sec
WRITE32: 3099 MB/Sec
WRITE64: 5546 MB/Sec

---> L2 <---
READ32: 6445 MB/Sec
READ64: 11284 MB/Sec
WRITE32: 3063 MB/Sec
WRITE64: 5389 MB/Sec

---> RAM <---
READ32: 4953 MB/Sec
READ64: 7477 MB/Sec
WRITE32: 2575 MB/Sec
WRITE64: 4047 MB/Sec
WRITE: 2200 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 11034 MB/Sec
WRITE: 5330 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 3346.00 MB/sec
2 K Element : 3374.88 MB/sec
4 K Element : 3378.58 MB/sec
8 K Element : 3286.33 MB/sec
16 K Element : 3350.50 MB/sec
32 K Element : 3355.24 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 656.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 33.5
Microseconds for one run through Dhrystone: 0.7
Dhrystones per Second: 1491544.1
Dhrystone MIPS (DMIPS) 848

Quicksort:
Total time taken by CPU: 2.55

Sieve:

   Sieve of Eratosthenes (Scaled to 10 Iterations)
   Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.000687   0.000687  2414.6
      10000       2261        19997   0.000838   0.000877  2315.8
      20000       4202        39989   0.001677   0.001945  2115.8
      40000       7836        79999   0.003353   0.004272  1950.4
      80000      14683       160001   0.006706   0.008698  1938.4
     160000      27607       319993   0.013413   0.017090  1994.6
     320000      52073       639997   0.026825   0.037231  1850.1
     640000      98609      1279997   0.053651   0.069580  1999.5
    1280000     187133      2559989   0.107301   0.148926  1886.0
    2560000     356243      5119997   0.214603   0.800781   707.8
    5120000     679460     10239989   0.429206   1.103516  1036.0
   10240000    1299068     20479999   0.858412   3.496094   659.4
   20480000    2488465     40960001   1.716823   3.398438  1367.4

   Relative to 10 Iterations and the 8191 Array Size:
   Average RunTime = 0.001240 (sec)
   High  MIPS      =   2414.6
   Low   MIPS      =    659.4

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 14.5 seconds
C Converted Double Precision Whetstones: 344.7 MIPS

All tests done.

jabirulo

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 16:55 #13

Just can't stay away

@Hans

SAM460ex


CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: AMCC PPC460EX 1.2 @ 1155 Mhz
Caches Sizes: L1: 32 KB - L2: 256 KB - L3: none
Cache Line: 32

---> CPU <---
MAX MIPS: 2308

---> L1 <---
READ32: 4386 MB/Sec
READ64: 8748 MB/Sec
WRITE32: 4362 MB/Sec
WRITE64: 8699 MB/Sec

---> L2 <---
READ32: 1033 MB/Sec
READ64: 1032 MB/Sec
WRITE32: 448 MB/Sec
WRITE64: 448 MB/Sec

---> RAM <---
READ32: 277 MB/Sec
READ64: 277 MB/Sec
WRITE32: 448 MB/Sec
WRITE64: 448 MB/Sec
WRITE: 798 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 41 MB/Sec
WRITE: 266 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 1700.54 MB/sec
2 K Element : 1706.85 MB/sec
4 K Element : 1697.80 MB/sec
8 K Element : 1650.10 MB/sec
16 K Element : 472.58 MB/sec
32 K Element : 390.23 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 192.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 49.9
Microseconds for one run through Dhrystone: 1.0
Dhrystones per Second: 1002608.4
Dhrystone MIPS (DMIPS) 570

Quicksort:
Total time taken by CPU: 5.35

Sieve:

   Sieve of Eratosthenes (Scaled to 10 Iterations)
   Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.001984   0.001984   835.8
      10000       2261        19997   0.002422   0.002327   873.2
      20000       4202        39989   0.004843   0.004883   843.0
      40000       7836        79999   0.009687   0.016785   496.5
      80000      14683       160001   0.019374   0.075073   224.6
     160000      27607       319993   0.038748   0.207520   164.3
     320000      52073       639997   0.077496   0.460205   149.7
     640000      98609      1279997   0.154991   1.054688   131.9
    1280000     187133      2559989   0.309982   2.373047   118.4
    2560000     356243      5119997   0.619964   5.273438   107.5
    5120000     679460     10239989   1.239928  11.503906    99.4
   10240000    1299068     20479999   2.479856  24.687500    93.4
   20480000    2488465     40960001   4.959712  57.578125    80.7

   Relative to 10 Iterations and the 8191 Array Size:
   Average RunTime = 0.011243 (sec)
   High  MIPS      =    873.2
   Low   MIPS      =     80.7

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 51.1 seconds
C Converted Double Precision Whetstones: 97.9 MIPS

All tests done.

derfs

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 17:03 #14

Not too shy to talk

@derfs

System: WinUAE Windows x64 Ryzen 5600


CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: 604e 9.516 @ 264 Mhz
Caches Sizes: L1: 32 KB - L2: none - L3: none
Cache Line: 32

---> CPU <---
MAX MIPS: 8034

---> L1 <---
READ32: 6510 MB/Sec
READ64: 11900 MB/Sec
WRITE32: 5930 MB/Sec
WRITE64: 10837 MB/Sec

---> RAM <---
READ32: 5222 MB/Sec
READ64: 8071 MB/Sec
WRITE32: 4755 MB/Sec
WRITE64: 7595 MB/Sec
WRITE: 2285 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 51 MB/Sec
WRITE: 49 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 2718.47 MB/sec
2 K Element : 2704.66 MB/sec
4 K Element : 2713.30 MB/sec
8 K Element : 2715.93 MB/sec
16 K Element : 2703.18 MB/sec
32 K Element : 2717.73 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 464.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Final values of the variables used in the benchmark:

Duration in seconds: 33.1
Microseconds for one run through Dhrystone: 0.7
Dhrystones per Second: 1512390.2
Dhrystone MIPS (DMIPS) 860

Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times
Total time taken by CPU: 2.67

Sieve:

   Sieve of Eratosthenes (Scaled to 10 Iterations)
   Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.000763   0.000763  2173.1
      10000       2261        19997   0.000931   0.000954  2130.5
      20000       4202        39989   0.001863   0.002213  1860.5
      40000       7836        79999   0.003726   0.004272  1950.4
      80000      14683       160001   0.007451   0.009460  1782.1
     160000      27607       319993   0.014903   0.018311  1861.6
     320000      52073       639997   0.029806   0.037842  1820.3
     640000      98609      1279997   0.059612   0.075684  1838.3
    1280000     187133      2559989   0.119224   0.312500   898.8
    2560000     356243      5119997   0.238448   1.376953   411.6
    5120000     679460     10239989   0.476895   3.359375   340.3
   10240000    1299068     20479999   0.953791   7.890625   292.2
   20480000    2488465     40960001   1.907581  17.968750   258.6

   Relative to 10 Iterations and the 8191 Array Size:
   Average RunTime = 0.002496 (sec)
   High  MIPS      =   2173.1
   Low   MIPS      =    258.6

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 18.8 seconds
C Converted Double Precision Whetstones: 266.5 MIPS

All tests done.

mr2

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 21:41 #15

Not too shy to talk

SAM440Flex 800MHz

CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: AMCC PPC440EP 1.3 @ 799 Mhz
Caches Sizes: L1: 32 KB - L2: none - L3: none
Cache Line: 32

---> CPU <---
MAX MIPS: 1599

---> L1 <---
READ32: 3023 MB/Sec
READ64: 6027 MB/Sec
WRITE32: 3024 MB/Sec
WRITE64: 6026 MB/Sec

---> RAM <---
READ32: 320 MB/Sec
READ64: 320 MB/Sec
WRITE32: 200 MB/Sec
WRITE64: 200 MB/Sec
WRITE: 887 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 27 MB/Sec
WRITE: 39 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 1185.90 MB/sec
2 K Element : 1183.31 MB/sec
4 K Element : 1183.96 MB/sec
8 K Element : 1171.92 MB/sec
16 K Element : 375.57 MB/sec
32 K Element : 298.67 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 128.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 71.9
Microseconds for one run through Dhrystone: 1.4
Dhrystones per Second: 695362.4
Dhrystone MIPS (DMIPS) 395

Quicksort:

Total time taken by CPU: 7.68

Sieve:

Sieve of Eratosthenes (Scaled to 10 Iterations)
Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.002708   0.002708   612.1
      10000       2261        19997   0.003307   0.003433   591.8
      20000       4202        39989   0.006613   0.006866   599.5
      40000       7836        79999   0.013226   0.023804   350.1
      80000      14683       160001   0.026453   0.117188   143.9
     160000      27607       319993   0.052906   0.284424   119.8
     320000      52073       639997   0.105811   0.662842   103.9
     640000      98609      1279997   0.211622   1.425781    97.6
    1280000     187133      2559989   0.423245   3.173828    88.5
    2560000     356243      5119997   0.846489   7.001953    80.9
    5120000     679460     10239989   1.692979  14.960938    76.4
   10240000    1299068     20479999   3.385957  31.406250    73.4
   20480000    2488465     40960001   6.771914  68.437500    67.9

Relative to 10 Iterations and the 8191 Array Size:
Average RunTime = 0.014933 (sec)
High MIPS = 612.1
Low MIPS = 67.9

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 72.5 seconds
C Converted Double Precision Whetstones: 68.9 MIPS

A1000/CDTV/CDTV+8MB+IDE/CD32/A500/A600+xT+RGB2HDMI/A600+Furia+IndiECS/A1200+TF1260+IndiAGA/A4000D+A2320+PiccoloSD64/Sam440 flex 800MHz RAM 1GB HD7750 128MB OS4.1 SBLive! ->

Maijestro

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/24 22:20 #16

Just can't stay away

@Hans

First of all, thanks for the great compilation and the script, which works very well.

Here is my result under QEMU as well; please take the hardware from my signature.

CPUBench v1.0

RageMem:

RAGEMEM v0.37 - compiled 11/06/2010

CPU: Motorola MPC 7447/7457 Apollo 1.2 @ 1533 Mhz
Caches Sizes: L1: 32 KB - L2: 512 KB - L3: none
Cache Line: 32

---> CPU <---
MAX MIPS: 9845

---> L1 <---
READ32: 5848 MB/Sec
READ64: 8829 MB/Sec
WRITE32: 6469 MB/Sec
WRITE64: 10205 MB/Sec

---> L2 <---
READ32: 5812 MB/Sec
READ64: 8822 MB/Sec
WRITE32: 6452 MB/Sec
WRITE64: 10234 MB/Sec

---> RAM <---
READ32: 5172 MB/Sec
READ64: 7506 MB/Sec
WRITE32: 5408 MB/Sec
WRITE64: 10191 MB/Sec
WRITE: 2943 MB/Sec (Tricky)

---> VIDEO BUS <---
READ: 8767 MB/Sec
WRITE: 10102 MB/Sec

SortBench:
-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 3091.48 MB/sec
2 K Element : 3103.53 MB/sec
4 K Element : 3110.91 MB/sec
8 K Element : 3102.89 MB/sec
16 K Element : 3110.96 MB/sec
32 K Element : 3109.41 MB/sec

BogoMIPS:
Calibrating delay loop..

Ok - 656.00 BogoMips

Dhrystone:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute

Execution starts, 50000000 runs through Dhrystone
Execution ends

Duration in seconds: 28.1
Microseconds for one run through Dhrystone: 0.6
Dhrystones per Second: 1778409.9
Dhrystone MIPS (DMIPS) 1012

Quicksort:
Elaborating quicksort of 1000 numbers repeated for 10 times
Unsorted array:

Total time taken by CPU: 2.44

Sieve:

Sieve of Eratosthenes (Scaled to 10 Iterations)
Version 1.2, 03 April 1992

   Array Size   Number   Last Prime    Linear     RunTime    MIPS
    (Bytes)   of Primes               Time(sec)    (Sec)
       8191       1899        16381   0.000684   0.000684  2425.1
      10000       2261        19997   0.000835   0.000879  2311.8
      20000       4202        39989   0.001669   0.001758  2341.8
      40000       7836        79999   0.003339   0.003515  2370.4
      80000      14683       160001   0.006677   0.007813  2157.9
     160000      27607       319993   0.013354   0.015624  2181.8
     320000      52073       639997   0.026709   0.031252  2204.1
     640000      98609      1279997   0.053418   0.056248  2473.5
    1280000     187133      2559989   0.106836   0.125008  2246.8
    2560000     356243      5119997   0.213671   0.250015  2267.0
    5120000     679460     10239989   0.427343   0.499954  2286.8
   10240000    1299068     20479999   0.854686   1.399994  1646.7
   20480000    2488465     40960001   1.709372   4.400024  1056.1

Relative to 10 Iterations and the 8191 Array Size:
Average RunTime = 0.000865 (sec)
High MIPS = 2473.5
Low MIPS = 1056.1

Whetstone:

Please wait...

Loops: 50000 Iterations: 1 Duration: 9.6 seconds
C Converted Double Precision Whetstones: 523.2 MIPS

All tests done.

Now can I please get a 32-bit driver for AmigaOS 4.1 so we can move on to the GPU benchmark?

Don't take it seriously, it's just a joke...

Thanks again for the script and for merging the CPU benchmarks.

I am somewhat pleasantly surprised that these results are so different. It shows once again how strong this ARM CPU is, but it also shows its weaknesses in some areas.

---> VIDEOBUS <---
READ: 8767 MB/S.
WRITE: 10102 MB/sec.

That impressed me the most. How does that come about? Maybe it's because the main memory and the graphics card memory are used as shared memory.

With the benchmark you can imagine how well AmigaOS 4.1 runs under QEMU on my machine. That is why I chose this emulation: it gives me a good feeling to be able to use AmigaOS 4.1 under it. Thanks to QEMU, AmigaOS 4.1 runs as my second operating system, albeit somewhat limited.

Numbers can be manipulated; with videos that becomes very difficult. If you would like, I can also create a video about it again.

Edited by Maijestro on 2023/5/24 22:43:54
Edited by Maijestro on 2023/5/24 22:49:05
Edited by Maijestro on 2023/5/24 22:51:53
Edited by Maijestro on 2023/5/24 23:09:01

MacStudio ARM M1 Max Qemu//Pegasos2 AmigaOs4.1 FE / AmigaOne x5000/40 AmigaOs4.1 FE

Hans

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/25 0:17 #17

Home away from home

@eliyahu

Yes, the FPU emulation gets in the way of some of the copy benchmarks. There's clearly something wrong on my system with the Whetstone benchmark. Maybe I haven't kept my system up-to-date enough...
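For anyone wanting to sanity-check a Whetstone figure, here is a minimal sketch. It assumes this build scores the way the classic whetstone.c does (100 x Loops x Iterations, divided by the duration, reported in thousands of KIPS); the benchmark source isn't shown in this thread, so treat the formula as an assumption. The durations and MIPS values are copied from results posted above.

```python
# Durations (seconds) and reported MIPS copied from posts in this thread.
LOOPS = 50000
ITERATIONS = 1

reported = {
    "A1-X1000": (8.2, 611.6),
    "X5000/20": (8.5, 590.7),
    "QEMU, Ryzen 5600": (14.5, 344.7),
    "SAM460ex": (51.1, 97.9),
    "QEMU, M1 Max": (9.6, 523.2),
}


def whetstone_mips(duration_s, loops=LOOPS, iterations=ITERATIONS):
    """Assumed classic whetstone.c scoring: KIPS = 100 * loops * iterations / seconds."""
    kips = 100.0 * loops * iterations / duration_s
    return kips / 1000.0


for name, (duration, mips) in reported.items():
    estimate = whetstone_mips(duration)
    print(f"{name:18s} reported {mips:6.1f}  recomputed {estimate:6.1f}")
```

The recomputed values differ slightly from the reported ones because the posted durations are rounded to one decimal place. If a result is far off this relationship, the timing itself is probably suspect rather than the FPU.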

@joerg
Quote:

Not possible since 32 bit CPUs don't support 64 bit integer loads/stores, 64 bit accesses are only supported for double (and 128 bit accesses for vector on some CPUs).
64 bit C integer loads/stores have to be executed as two independent 32 bit assembler loads/stores, and you have the results for 32 bit integer accesses in the READ32/WRITE32 results already.

You're clearly unfamiliar with the unusual P1022 CPU in the A1222. It doesn't have a proper PowerPC FPU, but does have 64-bit GPRs which can be used via its Signal Processing Engine (SPE). The RadeonHD & RX drivers both have SPE-optimized copy routines in them.
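To illustrate the point in the quote about 64-bit integer accesses on a 32-bit CPU, here is a small sketch of the memory layout only. It says nothing about instruction timing; the example value is arbitrary, and big-endian byte order is used because PowerPC is big-endian.

```python
import struct

value = 0x0123456789ABCDEF  # arbitrary 64-bit example value

# One 64-bit big-endian store, as a CPU with 64-bit registers could do in a single access.
as_one = struct.pack(">Q", value)

# Two independent 32-bit big-endian stores: high word first, then low word.
high = (value >> 32) & 0xFFFFFFFF
low = value & 0xFFFFFFFF
as_two = struct.pack(">I", high) + struct.pack(">I", low)

# Both approaches leave the same eight bytes in memory.
same_bytes = as_one == as_two
print(same_bytes, as_one.hex())
```

The bytes end up identical either way; the difference is only in how many accesses it takes, which is why a 64-bit copy path (via FPU doubles, AltiVec, or SPE on the P1022) can show up as a separate bandwidth figure.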

Hans


Hans

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/25 0:23 #18

Home away from home

@Maijestro

Quote:

I am somewhat pleasantly surprised that these results are so different. It shows once again how strong this ARM CPU is, but it also shows its weaknesses in some areas.

Yes, the results are interesting. I am impressed with the ARM numbers. Derfs' Ryzen numbers are also significantly better than my laptop's.
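For a quick side-by-side, the Dhrystone scores posted so far can be ranked with a short sketch like this one. The "Dhrystones per Second" values are copied from the posts above; the divisor 1757 is the conventional VAX 11/780 baseline for DMIPS, and truncating to an integer is an assumption that happens to match the DMIPS lines the benchmark printed.

```python
VAX_11_780_DHRYSTONES_PER_SEC = 1757  # conventional DMIPS baseline

# "Dhrystones per Second" as posted in this thread.
results = {
    "A1-X1000 (PA6T @ 1800 MHz)": 1289483.0,
    "X5000/20 (P5020 @ 1995 MHz)": 1923210.4,
    "QEMU on Ryzen 5600": 1491544.1,
    "SAM460ex (460EX @ 1155 MHz)": 1002608.4,
    "WinUAE on Ryzen 5600": 1512390.2,
    "SAM440 Flex (440EP @ 799 MHz)": 695362.4,
    "QEMU on M1 Max": 1778409.9,
}


def dmips(dhrystones_per_second):
    """Convert Dhrystones/s to DMIPS, truncating like the posted output appears to."""
    return int(dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SEC)


# Print the systems from fastest to slowest.
for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:32s} {dmips(score):5d} DMIPS")
```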

@all
Has anyone tried Qemu on a Raspberry Pi 4 or a PineBook Pro? I'd be interested to see what a weaker ARM CPU can do (one that doesn't have Apple's M1 extensions to aid with x86 emulation).

Hans


joerg

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/25 5:18 #19

Home away from home

@Maijestro
Quote:

---> VIDEOBUS <---
READ: 8767 MB/S.
WRITE: 10102 MB/sec.

That impressed me the most, how does that come about,

Because you are testing mainboard RAM accesses, the part of it used for the SM501 emulation, not gfx card RAM accesses.
The RageMem L1/L2 cache results compared to the RAM ones are of course wrong as well: your host CPU has much larger L1 and L2 caches than the emulated 32/512 KB.
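One rough way to see this in the posted numbers is to compare bandwidth ratios between the cache levels. The sketch below uses the RageMem READ64 figures copied from posts in this thread; the choice of READ64 and of these two ratios is just an illustration, not anything RageMem itself reports.

```python
# RageMem READ64 figures in MB/s, copied from posts in this thread: (L1, L2, RAM).
read64 = {
    "A1-X1000 (hardware)": (13610, 5007, 4006),
    "X5000/20 (hardware)": (15041, 7708, 1341),
    "SAM460ex (hardware)": (8748, 1032, 277),
    "QEMU on Ryzen 5600": (11550, 11284, 7477),
    "QEMU on M1 Max": (8829, 8822, 7506),
}


def cache_ratios(l1, l2, ram):
    """Return the (L1/L2, L2/RAM) bandwidth ratios."""
    return l1 / l2, l2 / ram


for name, figures in read64.items():
    l1_l2, l2_ram = cache_ratios(*figures)
    print(f"{name:22s} L1/L2 = {l1_l2:5.2f}   L2/RAM = {l2_ram:5.2f}")
```

On the real machines the L1 figure is clearly higher than the L2 one, while under QEMU the two are nearly identical, which would be consistent with both test buffers sitting in the same level of the host's cache.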

Unlike QEmu, UAE seems to support direct access to the gfx card; for example, compare derfs' QEmu results (11034/5330 MB/s) with the UAE ones (51/49 MB/s).

Quote:

Now can I please get a 32-bit driver for AmigaOS 4.1 so we can move on to the GPU benchmark? Don't take it seriously, it's just a joke...

To get usable gfx speed you need an AmigaOS gfx driver for QEmu, just like the uaegfx driver used in UAE.
The best way to implement a QEmu gfx driver would probably be one using Cairo and/or OpenGL; that way it would work with any host gfx card.
The currently used SM501/2 emulation is the worst possible way because of its strange 4-byte 24-bit modes, which are supported by neither AmigaOS nor host gfx cards. Even emulating a Permedia 2 or a Voodoo 3/5 would be much better than that.

Edited by joerg on 2023/5/25 16:26:25
Edited by joerg on 2023/5/25 16:31:40

walkero

Re: QEMU Emulation vs Hardware CPU Benchmarks

Posted on: 2023/5/25 13:17 #20

Site Builder

@Hans
I tried QEMU 8 on a Raspberry Pi 400 some time ago, and my impression was that it was slower than my classic PPC A1200. But I will try to run it again and get results based on your script.

