Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Quite a regular
I did some more benchmarking to understand the problem better instead of basing solutions on theories, and posted the results on the qemu-ppc mailing list. It looks like dcbz isn't the biggest issue but AltiVec is, so for now you may get better results with -cpu g3 or some other G3 variant that doesn't use AltiVec. For RAM access AltiVec sometimes helps a little, but for VRAM access it does not. For VRAM access dcbz isn't the biggest problem either (though for RAM access its effect is measurable). I still don't know which AltiVec ops to check. I could get a disassembly and look at which opcodes are used, but I don't really want to do that. A smaller test case with just one loop, or even one written in assembly, might be easier to check and optimise. From a quick look these routines don't use many AltiVec instructions: most are various loads, and I've seen vperm and nothing else, so just checking how those are translated by QEMU might be enough, but I'll let somebody else try that.

Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Quite a regular
@joerg
Quote:
Firmware ENV variables:
Works on all systems, not only U-Boot ones even if the functions have UBoot in their names, and even on classic Amigas and Pegasos2 without NVRAM support (they use a special version of it and a kickstart module text file instead, but of course unlike the other implementations that's read-only):

As said before, env variables aren't the most convenient way for QEMU. Adding a text config to kickstart.zip is much easier. Env variables may be preferred on a real machine, but under QEMU they are not easily available or easy to set.

Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Home away from home
@balaton

Quote:
On real hardware sure but I meant for QEMU with vfio-pci. I don't even know if it would work currently so maybe it's not top priority over other possible improvements. But adding a config for it helps testing it in any case.


VFIO gives direct access to real hardware. Provided that the address mappings are set up correctly (i.e., the guest OS' physical addresses match the actual physical addresses), then GART would enable the GPU's DMA engines to read/write directly to RAM. That should make a difference.

For testing, GART was accidentally enabled in RadeonRX.chip v2.14 (at least for the Sam460). I can't remember if that one was released, or if we caught that during beta testing. Either way, anyone with access to that version would be able to test. Eventually v54.8 will make it to a public release.


Quote:
They say optimists are oblivious pessimists. But it does look like from my perspective that your work is just going into a black hole...


Yes, Matthew (i.e., @amigakit) is very preoccupied with the A600GS and A1200GS, which slows down everything else. He's the bottleneck.

Bear in mind:
1. The VirtioGPU driver is a brand new product, which therefore requires extra work from @amigakit to release, and he's preoccupied with another project...
2. The Pegasos2 kernel patches are in ExecSG, and Hyperion is in charge of its release schedule. Hyperion is in a total mess right now, although work is happening behind the scenes.
3. The RadeonHD/RX drivers are existing products, and releasing a new update takes a lot less work

Hans

Join Kea Campus' Amiga Corner and support Amiga content creation
https://keasigmadelta.com/ - see more of my work
Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Home away from home
@joerg

Quote:
Using a kickstart module text file for config:
If you have access to the AmigaOS 4.x sources check for example my diskboot.kmod sources (Hyperion has no permission to use it anymore, but you can use the part of my sources parsing the config for your drivers).
IIRC it's something like...

Thanks, I'll take a look.

I wish that the U-Boot/firmware ENV variables were workable with QEMU, because that would be much simpler.

Quote:
@geennams's nearly 2 GB/s nvme.device benchmark result on the X5000 with PCIe x4 v2.0 is only a theoretical result, transferring 128 MB of data in a single DMA transfer is next to never done with real world DOS/filesystem operations.
Using 16 KB it's only 222 MB/s, and with the max. transfer size NGFS uses (128 KB) it's 450 MB/s.
I don't know anything about 3D GPUs, but I guess most GPU texture, etc. DMA transfers are quite small as well and not several MB at once either.

Hmmm. Looking at the lower-end results in his table (link), they're on the slow side for what I'd expect. PCIe overhead isn't so large that you need to transfer 10s to 100s of MB to get up to the GB/s transfer speeds. He probably has other overhead slowing things down.

We hit such overhead when using the CPU's DMA engines for WritePixelArray/ReadPixelArray. The overhead of setting up blocks of RAM for DMA, and then unlocking them again, added up. We still managed to get up to 1 GB/s, depending on the CPU & GPU combo (e.g., this result).
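
As a rough back-of-the-envelope check of the figures quoted above, the time per transfer implied by each throughput number can be worked out directly. This is only a sketch under stated assumptions: 1 KB = 1024 bytes, 1 MB = 10^6 bytes, and the "nearly 2 GB/s" figure is used as a hypothetical peak rate. The function names are made up for illustration and are not part of any benchmark tool.

```python
# Back-of-the-envelope sketch; the units and the peak rate are assumptions.
KB = 1024            # assumed: 1 KB = 1024 bytes
PEAK_MB_S = 2000.0   # assumed peak rate, taken from the "nearly 2 GB/s" figure


def transfer_time_us(size_bytes, throughput_mb_s):
    """Time per transfer in microseconds (assuming 1 MB = 10**6 bytes)."""
    # bytes / (MB/s * 1e6 bytes/MB) seconds, times 1e6 us/s, simplifies to this.
    return size_bytes / throughput_mb_s


def implied_overhead_us(size_bytes, measured_mb_s, peak_mb_s=PEAK_MB_S):
    """Time per transfer that is not explained by moving data at the peak rate."""
    return (transfer_time_us(size_bytes, measured_mb_s)
            - transfer_time_us(size_bytes, peak_mb_s))


# (transfer size, measured MB/s) pairs from the quoted post
quoted = [(16 * KB, 222.0), (128 * KB, 450.0)]

for size, rate in quoted:
    total = transfer_time_us(size, rate)
    extra = implied_overhead_us(size, rate)
    print(f"{size // KB:4d} KB: {total:6.1f} us per transfer, "
          f"{extra:6.1f} us beyond the peak-rate time")
```

Under these assumptions the time not accounted for by the peak rate is larger for the 128 KB transfers than for the 16 KB ones, so it does not look like a single fixed per-transfer cost. That would fit with some other overhead being involved, but it is only arithmetic on three quoted numbers, not a measurement.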

Hans

Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Not too shy to talk
@balaton
Quote:
So for now you may get better results with -cpu g3 or some other G3 variant that don't use AltiVec

I don't know if I tested it correctly. I used the system as usual with applications and did not notice much difference.
Dry tests are -> https://hdrlab.org.nz/benchmark/gfxbench2d/OS/AmigaOS/Result/2941
I have no idea which values should be significantly better with -cpu g3.

Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Home away from home
@smarkusg

Look at the MemCopy tests. Those are the only ones that are affected by G3 vs G4. Normally, a G4's memcopy results would be roughly 2x those of the G3, because the AltiVec instructions can move data in 128-bit blocks instead of max 64-bit like the FPU. That makes a big difference with PCIe.
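
To illustrate where the "roughly 2x" comes from, here is a minimal sketch that just counts how many load/store pairs a copy needs at each width. It assumes every op moves a full 8 or 16 bytes and ignores caches, alignment and bus behaviour; the function name and the 1 MiB copy size are hypothetical.

```python
import math


def ops_needed(num_bytes, bytes_per_op):
    """Number of load/store pairs needed to copy num_bytes at a given width."""
    return math.ceil(num_bytes / bytes_per_op)


FPU_WIDTH = 8       # 64-bit FPU load/store (G3 copy path)
ALTIVEC_WIDTH = 16  # 128-bit AltiVec load/store (G4 copy path)

size = 1024 * 1024  # hypothetical 1 MiB copy
g3_ops = ops_needed(size, FPU_WIDTH)
g4_ops = ops_needed(size, ALTIVEC_WIDTH)

print(f"64-bit ops:  {g3_ops}")
print(f"128-bit ops: {g4_ops}")
print(f"ratio:       {g3_ops / g4_ops:.1f}x")
```

Half as many operations for the same amount of data is the best case on real hardware, which is why the ratio is only "roughly" 2x in practice.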

On QEMU, it depends entirely on how the AltiVec instructions are translated to the host machine's CPU (and how good the machine's PCIe controller is).

Hans

Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Quite a regular
You can follow the discussion on the qemu mailing list that I linked above. I did some profiling and the results are consistent with that. It looks like dcbz has a very small impact. AltiVec loads and stores should be done with 128-bit ops when available, and the way the vperm AltiVec op is handled is also not optimal, but these still have only a few percent impact. The current belief is that the I/O operation causes a recompile in the loop, which is the main cause of the slowdown, but I don't yet know how to fix that and don't fully understand how QEMU handles it. So the result that not using AltiVec gives only about a 1% speed increase is plausible: it does not remove the biggest bottleneck, and the impact of AltiVec and dcbz is negligible compared to that.
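
The reasoning here is essentially Amdahl's law: removing a cost that only accounts for a small share of the run time cannot speed things up by more than that share. A minimal sketch, where the fractions are hypothetical round numbers chosen for illustration and not values taken from the profiling:

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: overall speedup when `fraction` of the run time
    becomes `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Hypothetical case: AltiVec/dcbz handling is a few percent of the run time
# and is removed completely (infinite local speedup).
for fraction in (0.01, 0.03, 0.05):
    gain = overall_speedup(fraction, float("inf")) - 1.0
    print(f"removing {fraction:.0%} of run time -> {gain:.1%} faster overall")

# Hypothetical case: the dominant bottleneck is 80% of the run time
# and is made 4x faster.
print(f"4x on an 80% bottleneck -> {overall_speedup(0.8, 4.0):.2f}x overall")
```

So a gain of about 1% from dropping AltiVec is what you would expect if it only accounted for about 1% of the time in the first place.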

Re: Project - hardware to run AOS4 for 35 euro on QEMU 10 + GPU  passthrough
Not too shy to talk
I'll get back to the topic of the thread for a bit...


*****!!! - Many thanks to EntwicklerX for the unpaid access to the full games they created for AOS4 for testing and testing on the QEMU GPU passthrough. - !!!*****


I'm currently testing M.A.C.E - it's hard to tear myself away from this game, so for starters
Game running on full screen 1680x1050 setting ‘high’
fps:
main mode - 274-190 frames
gameplay - 180-72 frames depending on the number of objects.

screenshot -> https://ibb.co/7xwTwRcG
screenshot -> https://ibb.co/gZwJJwNP
screenshot -> https://ibb.co/jk84Q98F


Super Star Blast - works very well. Just as terribly addictive as M.A.C.E.
Game running on full screen 1680x1050

Vsync = yes: holds 59-60 fps in gameplay
Vsync = no: jumps between 67-82 fps in gameplay
The slight sound problems do not exist in reality; I only noticed them when ripping the video. It is a problem with my phone: I accidentally set the recording to 4K and my phone is a bit too slow for that.

Video -> https://youtu.be/Kd_a9kpbXtQ


to be continued.....

legend:
- the games must be purchased, which I strongly encourage you to do.
- the games were donated to me for now on EntwicklerX's own initiative and will not be shared with others.
- gameplay video, if not included, will be added later.
- to avoid creating a new topic, all tests will be added here and later moved to the first topic.
- if the qemu emulation changes, the test results will be corrected with a description.


Edited by smarkusg on 2025/4/29 20:31:09
Edited by smarkusg on 2025/4/30 16:55:16