Not possible: the gfx drivers are started before DOS, and most Kickstart modules can't do any stdio. That's the reason debug output of the kernel, and of nearly all other Kickstart modules, is sent to the serial port instead of being written to a debug log file. The easiest way is to use a U-Boot variable, and for QEMU it would be best if the defaults you are using for the AmigaOne emulation were stored in an editable config file.
The problem with an env variable or a default nvram.bin is that users will need to remember to include it at every start, which they will forget and then complain. It must either be the default or be done once and then persist. I can't easily add a text file to set the default nvram, as there's no QEMU option for that and no other example in other machines, which should still work alike, so I can't add random hacks. But it's possible to include a text file in Kickstart; the pegasos2 nvram.resource does that and it works, so other modules could do the same. I think that's the simplest solution, but the problem is theoretical at the moment, as we don't even know whether such a config is needed or whether we can just fix it elsewhere.
*) the results on a newer processor than mine from 2012 should be much better
This may be a myth. People tried on different machines already and all got the same results that seem to be limited by something else than CPU speed so I'm not sure it would help until proven.
Your results show that the faster card is actually slower in VRAM access and only slightly faster with larger rectangles and in compositing. IMHO it is not worth almost 4 times the power draw when you get the same result with a simpler card. It may be different once we find out what limits performance, but until then, unless there's something this faster card can do that the slower one can't, it does not seem to be a good choice.
@balaton There are tests that I conduct as part of my project. Yes, with VRAM there can be a problem. See geennaam's results https://ftp.hdrlab.org.nz/benchmark/gf ... 2d/OS/AmigaOS/Result/2773 My slow machine has better results in this test. From what I remember @joerg wrote that the CPU is always used anyway. My older machine makes up for what geennaam's newer machine was slower at. Graphics cards are similar.
What does it depend on? Maybe the processor, BIOS settings (inventions related to gaming boards)... a lot of different unknowns. Power consumption mainly depends on the use of the card anyway. How it works at idle - it will be a small difference.
Power consumption mainly depends on the use of the card anyway. How it works at idle - it will be a small difference.
Assuming the driver knows about power management and does not run the card at full power all the time. With Windows or Linux drivers that probably works. I'm not too sure about AmigaOS or MorphOS drivers. Maybe there's a way to monitor this somehow to make sure. I think Nikitas used radeontop or something like that to see the activity of the card; I don't know if that can also show settings. The opposite can also be true: if the driver can't put the card into a higher power setting, the card could run slower than it is capable of. As you say, lots of unknowns.
I know that someone tried PCI passthrough and complained that his card uses fans at 100%. With the card I'm testing this is not the case. I managed to hang QEMU once (mostly my own configuration mistake) and then the card howled like a lion. It is possible that the AOS4 driver for RadeonHD cards handles this correctly.
This may be a myth. People tried on different machines already and all got the same results that seem to be limited by something else than CPU speed so I'm not sure it would help until proven.
For GPU passthrough speed, sure, but for pure CPU-based routines, probably the faster the single core the better? At least with WinUAE (which, as you know, uses the QEMU core for PPC emulation), the faster the single-core CPU speed, the faster the emulation as a whole.
Quote:
Your results show that the faster card is actually slower in VRAM access and only slightly faster with larger rectangles and in compositing. IMHO it is not worth almost 4 times the power draw when you get the same result with a simpler card. It may be different once we find out what limits performance, but until then, unless there's something this faster card can do that the slower one can't, it does not seem to be a good choice.
The same happens on real hardware btw (at least on the X5000 for sure), not only on QEMU: the newest, fastest card is slower in VRAM access but faster with larger rectangles and in compositing. And even with large rectangles we somehow hit the limit of "something" (on real hardware too): no matter which card you insert, the FPS is mostly the same in the places where we hit this limit. Why this happens is unknown.
But anyway, judging by smarkusg's tests (I mean his tests with the Irrlicht engine Quake3 demo), it looks like it could still get about 4 times faster before meeting the limits we have on real hardware.
@smarkusg Quote:
I know that someone tried PCI passthrough and complained that his card uses fans at 100%.
I can say for sure that something strange happens with the fans on real hardware too. Sometimes they are too loud when they shouldn't be, and sometimes too slow when they should be fast.
As kas1e has already described, it may well be that Linux/QEMU is not to blame for these slowdowns. It is quite possible that they come from the guest system. If we investigate this further we may be able to find the root cause of the problem, similar to the kernel problems on the real Pegasos2 machine that were found with QEMU.
Personally, I also noticed when testing OpenMoHAA that what we have is only a fraction of the performance that should actually be available to us.
Quote:
kas1e wrote:
I hope Hans can add a bit to what I say here, but:
I remember when Hans tested all the bandwidth limitations, he measured the performance inside the RadeonRX.chip driver for x5000/020, and found that something weird is going on with the X5000's PCIe controller: the DMA transfer rate was no faster than CPU transfers. It maxes out at about 300 MiB/s, which is a fraction of the 2 GiB/s theoretical max for 4x PCIe gen2.
Those tests also found that the x5000/040 is slower than the x5000/020 on copy transfers, by about 30% of the 020 speed (which explains why some 3D games are slower on the 040 than on the 020). While this may not seem relevant, it again shows that something goes wrong with the PCIe controller in the initial setup. I don't know whether it's caused by U-Boot, the kernel, or something else, but we should have at least 1.5 GB/s of transfer, not just 300 MB/s.
It looks like we are only a little faster than "PCIe gen 1.0 x1", which gives about 250 MB/s. And even "PCIe gen 1.0 x2" gives about 500 MB/s, which we don't reach.
In the end it leads nowhere for now; we only know that there are issues with bandwidth and we don't know where they come from: U-Boot, the kernel, or whatever. But for some reason we can't reach even a fraction of the PCIe gen2 limit. And that is probably what we hit when we have many textures in one frame (this could explain both the lighting issues and the speed drop in games when there is a lot to show, even without lighting).
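As a rough cross-check of the PCIe figures quoted above, here is a minimal Python sketch of the arithmetic. It uses only the nominal PCIe link parameters (2.5 GT/s per lane for gen1, 5 GT/s for gen2, both with 8b/10b encoding) and ignores packet and protocol overhead, so real throughput is always somewhat lower. The function and variable names are made up for illustration, and the 300 MiB/s value is simply the measurement quoted in the thread.

```python
# Nominal PCIe link parameters per generation:
# (transfer rate in MT/s per lane, payload bits, bits on the wire).
# Gen1 and Gen2 both use 8b/10b encoding.
PCIE_LINK = {
    1: (2500, 8, 10),
    2: (5000, 8, 10),
}


def pcie_peak_mb_s(gen, lanes):
    """Theoretical one-direction payload bandwidth in MB/s (10^6 bytes/s)."""
    rate_mt_s, payload_bits, wire_bits = PCIE_LINK[gen]
    usable_mbit_s = rate_mt_s * payload_bits * lanes / wire_bits
    return usable_mbit_s / 8  # bits -> bytes


# Measured figure quoted in the thread (taken as MiB/s, as written there).
measured_mb_s = 300 * 1024 * 1024 / 1e6

if __name__ == "__main__":
    for gen, lanes in [(1, 1), (1, 2), (2, 4)]:
        peak = pcie_peak_mb_s(gen, lanes)
        print(f"gen{gen} x{lanes}: {peak:.0f} MB/s peak, "
              f"measured is {measured_mb_s / peak:.0%} of that")
```

The point of the comparison is only to show where the measured rate sits between the gen1 x1 and gen1 x2 peaks, and how far it is from the gen2 x4 peak; it says nothing about which component causes the limit.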
What would have really interested me would have been how the power consumption is under Linux PPC on the X5000, I would have liked to test that, but as far as I know there are no RX graphics card drivers for it.
Edited by Maijestro on 2025/4/26 6:45:03
Assuming the driver knows about power management and does not run the card at full power all the time.
Enhancer Software includes a Power Preferences tool with which you can configure the Radeon HD/RX gfx cards to "Low power", "High power" or "Dynamic" mode.
I remember when Hans tested all the bandwidth limitations, he measured the performance inside the RadeonRX.chip driver for x5000/020, and found that something weird is going on with the X5000's PCIe controller: the DMA transfer rate was no faster than CPU transfers. It maxes out at about 300 MiB/s, which is a fraction of the 2 GiB/s theoretical max for 4x PCIe gen2.
Whatever it is, it's neither U-Boot, the kernel nor the motherboard hardware. @geennaam managed to get nearly 2 Gb/s with his nvme.device on the X5000.
This may be a myth. People tried on different machines already and all got the same results that seem to be limited by something else than CPU speed so I'm not sure it would help until proven.
For GPU passthrough speed, sure, but for pure CPU-based routines, probably the faster the single core the better? At least with WinUAE (which, as you know, uses the QEMU core for PPC emulation), the faster the single-core CPU speed, the faster the emulation as a whole.
Single core speed is an important part, but additionally in all AmigaOS related PPC emulation benchmark results I've seen so far, no matter if QEmu or WinUAE, the AMD Ryzen x64 CPUs seem to be faster than Intel x64 CPUs from the same time/generation, even if those Intel CPUs are much faster with native host OSes like Linux/x64 or Windows than the AMD Ryzen counterparts.
Single core speed is an important part, but additionally in all AmigaOS related PPC emulation benchmark results I've seen so far, no matter if QEmu or WinUAE, the AMD Ryzen x64 CPUs seem to be faster than Intel x64 CPUs from the same time/generation, even if those Intel CPUs are much faster with native host OSes like Linux/x64 or Windows than the AMD Ryzen counterparts.
By what benchmark? You said before that benchmarks are useless so how do you know that Ryzen is faster than Intel for this? So far the only somewhat reliable measure seems to be some application benchmark like running Quake timedemo or different apps that use some functions which is a better measure than synthetic benchmarks.
Whatever it is, it's neither U-Boot, the kernel nor the motherboard hardware. @geennaam managed to get nearly 2 Gb/s with his nvme.device on the X5000.
We can exclude hardware and U-Boot from that, but I'm not sure about the kernel. The nvme.device may use its own routines while the graphics stack may use something from the kernel, so I would not exclude that yet.
@Maijestro Does the e5500 support little endian? Newer POWER (BookS) CPUs have LE mode and can run ppc64le version of Linux which probably works better as the problem with newer drivers may be that they were not tested for big endian machines. Much the same problem as with newer web browser engines. AmigaOS is big endian but Linux has both versions and nowadays ppc64le is the only one still somewhat supported.
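To illustrate why untested big endian support can break a driver, here is a minimal Python sketch. It is not taken from any real driver; the register value is made up. It only shows that the same 32-bit value has a different byte layout on big and little endian machines, so code that assumes one order silently reads wrong values on the other.

```python
import struct

# Hypothetical 32-bit register value, for illustration only.
value = 0x12345678

# Byte layout in memory on a big endian CPU vs. a little endian one.
big_endian_bytes = struct.pack(">I", value)
little_endian_bytes = struct.pack("<I", value)

# Code that assumes little endian layout but is handed big endian bytes
# reads a byte-swapped value instead of the original one.
misread = struct.unpack("<I", big_endian_bytes)[0]

if __name__ == "__main__":
    print("big endian bytes:   ", big_endian_bytes.hex())
    print("little endian bytes:", little_endian_bytes.hex())
    print("misread value:      ", hex(misread))
```

A driver written and tested only on little endian hosts can contain many such implicit assumptions, which is one plausible reason the ppc64le Linux variant tends to work better with newer drivers.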
For GPU passthrough speed, sure, but for pure CPU-based routines, probably the faster the single core the better? At least with WinUAE (which, as you know, uses the QEMU core for PPC emulation), the faster the single-core CPU speed, the faster the emulation as a whole.
We are discussing GPU passthrough in this thread, and people said before that it might be faster with a faster CPU, but everyone who tried, with whatever CPUs they had, got similarly slow or even slower results, which does not seem related to CPU speed to me. So while a faster CPU is better for emulation, it may not be the main reason for the GPU slowness we see, and it's not proven what effect it has. Until that's understood I would not say it matters for GPU performance, as that's just an unconfirmed assumption.
Enhancer Software includes a Power Preferences tool with which you can configure the Radeon HD/RX gfx cards to "Low power", "High power" or "Dynamic" mode.
I thought that Power Preferences only works with the Radeon RX and RadeonHD v5 drivers. Out of curiosity I'll check, because in the README for the Irrlicht Engine @kas1e wrote that “High power” mode is required for RX cards. Maybe it is needed for Radeon HD as well.
edit: Power Preferences does not work with Radeon HD