|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
|
Just popping in
|
@balaton
Thank you for your work. I will pull once the patches are merged into master and compile again. I'm not expecting anything close to a miracle, but I'm curious to see whether it slightly improves the results in @Hans's GPU benchmark tool.
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/22 19:34
#42
|
Just popping in
|
@joerg
It was requested by @balaton after examining a "perf" report I sent. He can explain the details on this as I don't have the required QEMU & PowerPC ISA knowledge to answer.
In the meantime, I wonder why the overhead in @balaton's results is so big. Could it be because the TLB mechanism is emulated in software?
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/21 20:03
#44
|
Just popping in
|
@Hans
Yes, I know. And moving the graphics card further and further up the board is not a solution. In the end, I'll weld the GPU to the CPU for a little more bandwidth.
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/21 10:32
#45
|
Just popping in
|
@white
By asking ChatGPT, I now understand why the RX550 is faster in the primary PCIe slot: it is connected directly to the CPU and gets more bandwidth. But as I said, it is still slow anyway. I'll see what else I can do. And perhaps @balaton and @Hans can find some room for improvement in QEMU PPC and RadeonRX.chip.
Quote:
The primary PCIe slot on your ASUS Z790 D-4 motherboard is faster than the secondary slot due to differences in the number of PCIe lanes and their connectivity to the CPU and chipset. Here are the key reasons:
1. **Direct CPU Connection**: The primary PCIe slot (usually labeled as PCIe x16_1) is typically directly connected to the CPU. This allows it to take full advantage of the maximum number of PCIe lanes and the fastest possible data transfer rates.
2. **Lane Allocation**: Even if you are using an x8 GPU card, the primary slot might be configured to support up to x16 lanes, providing more bandwidth compared to the secondary slot, which might be limited to x8 or even x4 lanes depending on the motherboard design.
3. **Chipset Connectivity**: The secondary PCIe slot is often connected to the chipset rather than directly to the CPU. This connection can introduce additional latency and reduce the available bandwidth because the data has to pass through the chipset before reaching the CPU.
4. **Bandwidth Sharing**: On many motherboards, secondary PCIe slots share bandwidth with other devices, such as M.2 SSDs or additional PCIe slots. This can further reduce the available bandwidth for the secondary PCIe slot compared to the primary slot, which usually has a dedicated connection.
To illustrate this with an example:
- **Primary PCIe Slot (PCIe x16_1)**: This slot might operate at PCIe 4.0 x16 or x8 if the GPU only uses 8 lanes, directly connected to the CPU, providing up to 32 GB/s (for x16) or 16 GB/s (for x8) bandwidth.
- **Secondary PCIe Slot (PCIe x16_2)**: This slot might operate at PCIe 4.0 x4 or x8, connected through the chipset, providing up to 8 GB/s (for x4) or 16 GB/s (for x8) bandwidth, but with added latency due to the chipset.
In summary, the primary PCIe slot is faster primarily because it has a direct connection to the CPU and typically offers more PCIe lanes with dedicated bandwidth, while the secondary slot has to share resources and connect through the chipset, leading to reduced performance.
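The rounded figures in the quote (32, 16 and 8 GB/s) can be checked with a little arithmetic. This is a minimal sketch, assuming the published per-lane rates of 8 GT/s (PCIe 3.0) and 16 GT/s (PCIe 4.0) with 128b/130b encoding; the function name is made up for illustration. It gives theoretical one-direction numbers only. Real throughput is lower because of protocol overhead, and the link speed a card actually negotiates can be lower than what the slot offers.

```python
# Approximate one-direction PCIe bandwidth from the per-lane transfer rate
# and the line encoding efficiency.
PCIE_GENERATIONS = {
    # generation: (giga-transfers per second per lane, encoding efficiency)
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def pcie_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Return the theoretical one-direction bandwidth in GB/s."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # Each transfer carries one bit per lane; divide by 8 to get bytes.
    return gt_per_s * efficiency / 8 * lanes


for gen in (3, 4):
    for lanes in (4, 8, 16):
        print(f"PCIe {gen}.0 x{lanes}: {pcie_bandwidth_gb_s(gen, lanes):.1f} GB/s")
```

For PCIe 4.0 this comes out at roughly 7.9, 15.8 and 31.5 GB/s for x4, x8 and x16, which matches the rounded values in the quote.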
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/21 10:06
#46
|
Just popping in
|
There is some (slightly) good news.
- I enabled the integrated Intel GPU.
- I replaced my primary GPU (an NVIDIA) with the Radeon RX 550.
So the RX is now plugged into the primary PCIe slot, and the NVIDIA card is removed completely. The improvement is visible. Not enough to say that you can work on this system; it is still slow. But it is an improvement anyway. Here are the results:
http://hdrlab.org.nz/benchmark/gfxbench2d/OS/AmigaOS/Result/2794
Overall Score: 297.87
While my old test gave:
http://hdrlab.org.nz/benchmark/gfxbench2d/OS/AmigaOS/Result/2784
Overall Score: 220.52
Now I'm running the GPU test with -cpu 750cxe and I'll post the results when it finishes.
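To put the two overall scores in proportion, the relative gain can be worked out directly. A small sketch using only the two numbers reported above; the variable names are illustrative.

```python
# Overall GfxBench2D scores reported in this post.
old_score = 220.52  # RX 550 in the secondary slot (Result/2784)
new_score = 297.87  # RX 550 in the primary slot (Result/2794)

# Relative improvement of the new score over the old one, in percent.
improvement = (new_score - old_score) / old_score * 100
print(f"{improvement:.1f}% higher overall score")
```

That is a gain of about 35%, which is noticeable but nowhere near the orders of magnitude separating this setup from the Windows guest.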
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/20 12:48
#47
|
Just popping in
|
@Hans
Thanks a lot for your explanation about the GPU drivers (Windows/Amiga). I'm enjoying this process because I'm learning things. And this is more valuable than achieving the actual (QEMU/AOS4/RX550) goal.
Regarding the slow VRAM => RAM transfer rate on my system, I'm trying different things, but I have not yet found a clue why this is happening.
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/20 12:34
#48
|
Just popping in
|
@balaton
Quote:
When testing with x86 guests it might use KVM so guest code runs on the CPU which might give different results. You can add -accel tcg to force it to translate x86 code the same way as it does PPC code which will be slower but might be closer to the PPC guest case. Although PPC and x86 ops are different so the translation is also different but at least it goes through more the same process as with PPC guest.
I'm using a Windows 11 VM via virsh (QEMU/KVM), and I don't know what to change in the generated XML configuration to enable TCG...
I ran the Windows version of @Hans's GfxBench2D tool again. This time I was monitoring my RX550 GPU on the host via
sudo radeontop
The radeontop metrics go crazy during the Win11 test, so I believe it uses the GPU (of course I can't know better than you; I'm just reporting what I see). Usage on each metric goes as high as 90%.
On AmigaOS4, while running @Hans's tool, the radeontop metrics barely move. Every half second it uses just a small portion of the RX550's resources, for example 0.83% on the "Graphic Pipeline" metric. The other metrics show about the same level of usage.
Also, I found that when I use bboot v0.7 I can get 2G of RAM, but the Ranger and Sysmon tools report that the CPU clock synchronizes at 999 MHz. When I use pegasos.rom, I get 1.53 GHz.
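On the question of enabling TCG in a libvirt-managed VM: libvirt normally selects the accelerator through the `type` attribute on the root `<domain>` element, with `kvm` for KVM and `qemu` for plain emulation (TCG). The following is a minimal sketch of that change, assuming the XML was exported with `virsh dumpxml`; the helper name and the VM name `win11` are hypothetical. A KVM-specific CPU setting such as host-passthrough may also need changing before the domain will start.

```python
import xml.etree.ElementTree as ET


def force_tcg(domain_xml: str) -> str:
    """Return the domain XML with the root type switched to 'qemu' (TCG)."""
    root = ET.fromstring(domain_xml)
    if root.tag != "domain":
        raise ValueError("expected a libvirt <domain> document")
    # type='kvm' asks for KVM acceleration, type='qemu' for pure emulation.
    root.set("type", "qemu")
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily shortened domain definition for illustration only.
example = "<domain type='kvm'><name>win11</name></domain>"
print(force_tcg(example))
```

The edited XML would then be loaded back with `virsh define`. Expect the Windows guest to become much slower, which is the point of the comparison.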
Edited by nikitas on 2024/6/20 12:55:41 Edited by nikitas on 2024/6/20 12:56:02
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
|
Just popping in
|
@Hans
Hm... wait a minute. I have another VM with a Windows 11 guest to which I have passed through exactly the same GPU card via VFIO, and I downloaded the Windows version of your GfxBench2D there. It is a bit outdated compared to the Amiga version, but here is the comparison:
- The Amiga version (on QEMU AOS4.1) has 54 tests and takes 3 hours.
- The Windows version (on QEMU/KVM Windows 11) has 38 tests but takes... seconds to finish.
I don't know if this is a checkmate, but the question remains for me: do the results show better QEMU VFIO handling on x64, or better handling by the Windows AMD driver? The Windows results:
------------------------------------------------------------
GfxBench2D 2.9
A benchmark tool for graphics cards.
Written by Hans de Ruiter.
Copyright (C) 2011, by Hans de Ruiter, all rights reserved
------------------------------------------------------------
System Information:
OS: Windows 11 Home (build 22631)
Motherboard/Device: Standard PC (Q35 + ICH9, 2009), manufactured by: QEMU
CPU: 13th Gen Intel(R) Core(TM) i5-13400, @ 2.496 GHz
L1 Cache Size: 65536, L2 Cache Size: 16777216, L3 Cache Size: 16777216
Total RAM: 15.9808 GiB
External Bus (FSB) Speed: 0 Hz
Clock granularity is: 1030us
Tests will take at least 1.030 seconds each.
Initialising Direct2D test.
Board name: Radeon RX550/550 Series
Product ID: 0x699f Vendor ID: 0x1002 SubProduct ID: 0xe468 SubVendor ID: 0x1da2
Card driver: C:\Windows\System32\DriverStore\FileRepository\u0402263.inf_amd64_1366da2d694c570c\B400781\aticfx64.dll,C:\Windows\System32\DriverStore\FileRepository\u0402263.inf_amd64_1366da2d694c570c\B400781\aticfx64.dll,C:\Windows\System32\DriverStore\FileRepository\u0402263.inf_amd64_1366da2d694c570c\B400781\aticfx64.dll,C:\Windows\System32\DriverStore\FileRepository\u0402263.inf_amd64_1366da2d694c570c\B400781\amdxc64.dll (31.0.21912.14)
VRAM: 4 GiB
Display mode: 1920x1080@59 (32 bpp)
WritePixelArray: 2664.938 MiB/s (took 1.142000 seconds).
ReadPixelArray: 3189.887 MiB/s (took 1.122000 seconds).
FillRect:
Size Time (s) Ops/s MPixel/s
(16, 16) 1.109 2896694.319 707.201
(32, 32) 1.126 2819097.691 2753.025
(64, 64) 1.261 2393024.584 9347.752
(128, 128) 1.050 1079047.619 16860.119
(256, 256) 1.158 376311.744 23519.484
(512, 512) 1.106 301297.468 75324.367
(1024, 1024) 1.142 161941.331 161941.331
BltBitMap:
Size Time (s) Ops/s MPixel/s
(16, 16) 1.138 2039821.617 498.003
(32, 32) 1.140 2047721.053 1999.728
(64, 64) 1.139 1953935.909 7632.562
(128, 128) 1.066 1062851.782 16607.059
(256, 256) 1.086 267506.446 16719.153
(512, 512) 1.120 67892.857 16973.214
(1024, 1024) 1.133 15455.428 15455.428
OverlappedBltBitMap:
Size Time (s) Ops/s MPixel/s
(16, 16) 1.125 79613.333 19.437
(32, 32) 1.124 78139.680 76.308
(64, 64) 1.100 76865.455 300.256
(128, 128) 1.086 73211.786 1143.934
(256, 256) 1.139 47367.867 2960.492
(512, 512) 1.133 19743.160 4935.790
(1024, 1024) 3.742 5344.735 5344.735
Composite:
Size Time (s) Ops/s MPixel/s
(16, 16) 1.224 2314133.987 564.974
(32, 32) 1.135 2312889.868 2258.682
(64, 64) 1.130 2223106.195 8684.009
(128, 128) 1.200 944166.667 14752.604
(256, 256) 1.179 246405.428 15400.339
(512, 512) 1.176 63383.503 15845.876
(1024, 1024) 1.137 14174.142 14174.142
NOTE: Compositing (or alpha blending) used premultiplied alpha mode.
CompositeSrcMask:
Size Time (s) Ops/s MPixel/s
(16, 16) 1.135 1742954.185 425.526
(32, 32) 1.130 1724620.354 1684.200
(64, 64) 1.153 1638477.884 6400.304
(128, 128) 1.073 959925.443 14998.835
(256, 256) 1.118 247174.419 15448.401
(512, 512) 1.164 54682.990 13670.747
(1024, 1024) 1.148 11861.498 11861.498
NOTE: The source mask's alpha channel was multiplied by the source bitmap's alpha channel.
NOTE: The source bitmap's alpha channel was premultiplied.
Random:
Time (s) Ops/s MPixel/s
4.078 19617.460 6568.958
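For anyone comparing these tables with the AmigaOS results, the MPixel/s column can be reproduced from the Ops/s column. Judging from the numbers above, a "MPixel" here appears to be 2^20 pixels rather than 10^6; that is an inference from the output, not something the tool states. A short sketch with an illustrative function name:

```python
MEBI = 1024 * 1024  # assumed meaning of "M" in the MPixel/s column


def mpixels_per_second(width: int, height: int, ops_per_second: float) -> float:
    """Pixels touched per second, in units of 2**20 pixels."""
    return ops_per_second * width * height / MEBI


# (size, Ops/s, MPixel/s) rows copied from the FillRect and BltBitMap tables.
rows = [
    ((16, 16), 2896694.319, 707.201),
    ((256, 256), 376311.744, 23519.484),
    ((512, 512), 67892.857, 16973.214),
    ((1024, 1024), 161941.331, 161941.331),
]

for (width, height), ops, reported in rows:
    computed = mpixels_per_second(width, height, ops)
    print(f"({width}, {height}): computed {computed:.3f}, reported {reported:.3f}")
```

This makes it easy to see that small rectangles are limited by per-operation overhead (Ops/s stays nearly flat) while large ones are limited by fill rate.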
Also, Geennham's guest CPU syncs at:
Motorola MPC 7447/7457 Apollo, 1.2 @ 1.53 GHz
While my guest CPU shows:
Motorola MPC 7447/7457 Apollo, 1.2 @ 1000 MHz
Is my host's Intel chipset to blame here?
Edited by nikitas on 2024/6/19 7:23:19 Edited by nikitas on 2024/6/19 7:25:23
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/18 20:17
#50
|
Just popping in
|
@balaton
Would QEMU tracing on specific trace events be useful in our case, in order to see which TBs are running with and without VFIO? If yes, which events need to be traced?
I'm trying to study some QEMU tracing backends, but I still lack the knowledge to understand exactly what I have to do. Anyway, I'm eager to learn as much about QEMU internals as I am capable of.
Also, I have currently passed through a real hard drive to my QEMU PegasosII AOS4 VM. I don't know if this has any positive or negative impact on performance.
Regarding the RadeonRX metrics needed by @Hans, I'm consciously avoiding that, because it is a bit tedious to un-VFIO the GPU, remove the amdgpu blacklist and do whatever else might be needed, and then VFIO the GPU again, etc. But if it is going to help, I'll do it.
You also insisted on using pci.1 instead of pci.0. Indeed, I told you that with VFIO, the pci.1 GPU did not work for me... Pegasos loads the Kickstart and then gets stuck forever.
Has anybody managed to pass through the RX GPU via pci.1 (PegasosII machine)?
Edited by nikitas on 2024/6/18 21:41:31 Edited by nikitas on 2024/6/19 4:12:28
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/18 10:37
#51
|
Just popping in
|
@white
I am currently conducting a Special Tracing Operation in order to invade QEMU PPC translation blocks and see what is running differently with and without VFIO. Sadly, I think it will take the same or even longer than the "Special Military Operation".
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/17 22:42
#52
|
Just popping in
|
@balaton @Hans
A minor update. When you disable compositing effects and reduce the screen depth to 16 bits (64k colors), "Enable Interrupts" in "Screen Mode" works (though it is just as slow as without it).
While the Workbench windows struggle to draw, Amistore (written in Hollywood, as I read) is very fast. I log in and navigate through the menu with no visible redrawing or delays. Here, the RadeonRX has the "same" speed as the emulated sm501 card. (I did not measure anything, so I can't be precise; this is only my impression as a user.)
I don't know if this has to do with anything, just reporting it.
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
|
Just popping in
|
@balaton
Yes, those OpenSuse and Gentoo PegasosPPC versions seem to be much older than Debian Jessie.
- The Finnix distro seems incompatible.
- For the Adelie distro, the QEMU command I tried is:
qemu-system-ppc \
-machine pegasos2 \
-cpu G4 \
-bios pegasos2.rom \
-rtc base=localtime \
-drive if=none,id=CD0,file=adelie.iso,format=raw -device ide-cd,drive=CD0,bus=ide.1 \
-drive if=none,id=DH0,file=hd.img,format=raw -device ide-hd,drive=DH0,bus=ide.0 \
-device VGA,romfile="" \
-serial stdio \
-d guest_errors,unimp
Then, in Pegasos2 bios:
boot cd boot/grubcore.img
It yields this:
OF stdout device is: /failsafe
Preparing to boot Linux version 5.15.132-mc6-easy (builder@ppc64) (gcc (Adelie 8.5.0) 8.5.0, GNU ld (GNU Binutils) 2.41) #1 SMP Sun Nov 12 07:51:09 UTC 2023
Detected machine type: 00000500
command line: BOOT_IMAGE=(ieee1275/ide0)/kernel-ppc root=live:LABEL=Adelie-ppc rd.live.dir=/ rd.live.squashimg=ppc.squashfs
memory layout at init:
memory_limit : 00000000 (16 MB aligned)
alloc_bottom : 05e0e000
alloc_top : 20000000
alloc_top_hi : 20000000
rmo_top : 20000000
ram_top : 20000000
instantiating rtas at 0x0fbfd000... done
boot cpu hw idx 0
Fixing up missing ISA range on Pegasos...
Fixing up IDE interrupt on Pegasos...
Fixing up IDE class-code on Pegasos...
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x05e0f000 -> 0x05e0e0a4
Device tree struct 0x05e10000 -> 0x00000000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x03800000 ...
Linux/PPC 5.15.1
But it stays there forever. At the end of the serial output there is this:
Invalid form of CMPI at 0x0020001c, L = 1
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/15 22:36
#54
|
Just popping in
|
@balaton @Skateman
Yes, indeed, I managed to install Debian 8.11.0 on QEMU Pegasos2 using the DVD image (as @kas1e said). The RX550 is passed through and I can see it with lspci, but there is no amdgpu driver, so when I try to start the X server with the "radeon", "ati" or "fbdev" drivers it fails with several different errors. I could post the Xorg log file if that would be useful.
@balaton Regarding the 100% CPU thing, I may have misled you. On the host it is always 100%. On the guest, cpu_watcher shows 100% CPU load, and 60% when I move the focus away from the cpu_watcher window. Tequila shows the idle task at 70%.
The QEMU documentation page lists only Debian 8.11.0 as working with Pegasos2.
I read that the real Pegasos2 machine is also supported by:
- Void Linux PPC
- Gentoo Linux
I don't know if they are worth trying, though, or whether they are still maintained or even include the "amdgpu" driver.
Debian 8.11.0 comes with kernel 3.16, and I don't know if I can upgrade the kernel to 4.20, which, as I read, contains the "amdgpu" driver.
@balaton Regarding the -cpu 750cxe option, it might improve performance a little or it might be a placebo effect; I can't be sure.
I also wonder what it could mean that checking "Enable Interrupts" in AmigaOS4 "ScreenMode" makes performance even slower, when I expected it to get faster.
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/13 22:43
#55
|
Just popping in
|
@balaton
OK, so I suppose we leave the VFIO RX550 + AOS4.1FE experiment here. Things I forgot to mention:
- In AOS4 ScreenMode, if you check "Enable Interrupts", the system gets even slower and eventually freezes.
- In AOS4 ScreenMode, if you check "Allow Fake Modes", a warning appears in the serial output when using RadeonRX.chip.debug, and the option never gets saved (even after the required reboot).
- In AOS4 Ranger, the CPU is reported as synchronized at 999 MHz. On other systems I've seen in this forum or in YouTube videos, the CPU clock syncs at 1.2 GHz or more.
- In AOS4 Ranger --> Exec --> Resources, RadeonRX_RM.resource and RadeonRX_DPM.resource are the only ones whose Type is "Unknown" instead of "Resource", if that means anything. (Ranger screenshot: https://ibb.co/vQZ7S49)
In general the guest system is stable. Programs work properly, and MPEG videos play in Emotion Player with FORCESOFTWARE=FALSE. But everything is very, very slow, and video plays with very frequent "hiccups", both audio and graphical. And of course, as I already wrote, in every single thing I do the CPU is at 100%. If I let it idle, it drops to 61% or 40%.
Regarding DebianPPC, I downloaded and installed the ISO that has exactly the same name as in the QEMU docs and ran the "install/pegasos" file. The installation completed successfully and 3 partitions were created. Anyway, I'll see what I can do; hopefully I'll get it working.
Edited by nikitas on 2024/6/13 22:59:54 Edited by nikitas on 2024/6/13 23:00:35
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/13 19:59
#56
|
Just popping in
|
@balaton Also, I've installed DebianPPC v8.11.0 as per QEMU doc page, in order to see if the VFIOed RX550 adapts better there. The problem is I can't even get it booted. My QEMU command is:
sudo taskset -c 13 qemu-system-ppc \
-machine pegasos2 \
-bios pegasos2.rom \
-rtc base=localtime \
-drive if=none,id=DH0,file=hd.img,format=raw -device ide-hd,drive=DH0,bus=ide.0 \
-device rtl8139,netdev=ETH0 -netdev user,id=ETH0 \
-device bochs-display \
-vga none \
-device VGA,romfile="" \
-serial stdio
Which yields (the well-known...):
UNHANDLED INT 10 FUNCTION 0300 WITHIN EMULATION
UNHANDLED INT 10 FUNCTION 1301 WITHIN EMULATION
UNHANDLED INT 10 FUNCTION 0300 WITHIN EMULATION
UNHANDLED INT 10 FUNCTION 1301 WITHIN EMULATION
entering main read/eval loop...
UNHANDLED INT 10 FUNCTION 0300 WITHIN EMULATION
UNHANDLED INT 10 FUNCTION 1301 WITHIN EMULATION
UNHANDLED INT 10 FUNCTION 0300 WITHIN EMULATION
UNHANDLED INT 10 FUNCTION 1301 WITHIN EMULATION
Then I blindly type:
" /failsafe" io
To get the firmware's "ok" back. Then I tried the following which did not work:
boot hd:0 vmlinuz
It can't follow the symlink? Anyway, then I tried the following:
boot hd:0 vmlinuz-3.16.0-6-powerpc root=/dev/sda2
Which tries to boot but it ends up with a Kernel panic error:
zImage starting: loaded at 0x00400000 (sp: 0x00762fa0)
Allocating 0x8a1ec0 bytes for kernel ...
OF version = 'Pegasos2,1.1'
Trying to claim from 0x400000 to 0x76edfc (0x36edfc) got 400000
gunzipping (0x01800000 <- 0x0040c000:0x00761c62)...done 0x743000 bytes
Linux/PowerPC load: root=/dev/sda2
Finalizing device tree... using OF tree (promptr=01003ad0)
OF stdout device is: /failsafe
Preparing to boot Linux version 3.16.0-6-powerpc (debian-kernel@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 Debian 3.16.56-1+deb8u1 (2018-05-08)
Detected machine type: 00000500
command line: root=/dev/sda2
memory layout at init:
memory_limit : 00000000 (16 MB aligned)
alloc_bottom : 020a6000
alloc_top : 20000000
alloc_top_hi : 20000000
rmo_top : 20000000
ram_top : 20000000
instantiating rtas at 0x0fbfd000... done
Fixing up missing ISA range on Pegasos...
Fixing up IDE interrupt on Pegasos...
Fixing up IDE class-code on Pegasos...
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x020a7000 -> 0x020a7746
Device tree struct 0x020a8000 -> 0x020b3000
Calling quiesce...
returning from prom_init
Linux/PPC 3.16.0
arch: exit
Have fun! [ 0.000000] Using CHRP machine description
[ 0.000000] Total memory = 512MB; using 1024kB for hash table (at cff00000)
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.16.0-6-powerpc (debian-kernel@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 Debian 3.16.56-1+deb8u1 (2018-05-08)
[ 0.000000] chrp type = 6 [Genesi Pegasos]
[ 0.000000] Pegasos l2cr : L2 cache was not active, activating
[ 0.000000] PCI bus 0 controlled by /pci@80000000 at 80000000
[ 0.000000] PCI host bridge /pci@80000000 (primary) ranges:
[ 0.000000] IO 0x00000000fe000000..0x00000000fe00ffff -> 0x0000000000000000
[ 0.000000] MEM 0x0000000080000000..0x00000000bfffffff -> 0x0000000080000000
[ 0.000000] PCI bus 0 controlled by /pci@C0000000 at c0000000
[ 0.000000] PCI host bridge /pci@C0000000 ranges:
[ 0.000000] IO 0x00000000f8000000..0x00000000f800ffff -> 0x0000000000000000
[ 0.000000] MEM 0x00000000c0000000..0x00000000dfffffff -> 0x00000000c0000000
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x00000000-0x1fffffff]
[ 0.000000] Normal empty
[ 0.000000] HighMem empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x00000000-0x1fffffff]
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 130048
[ 0.000000] Kernel command line: root=/dev/sda2
[ 0.000000] PID hash table entries: 2048 (order: 1, 8192 bytes)
[ 0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.000000] Sorting __ex_table...
[ 0.000000] Memory: 509756K/524288K available (5164K kernel code, 424K rwdata, 1428K rodata, 404K init, 1403K bss, 14532K reserved, 0K highmem)
[ 0.000000] Kernel virtual memory layout:
[ 0.000000] * 0xfffcf000..0xfffff000 : fixmap
[ 0.000000] * 0xff800000..0xffc00000 : highmem PTEs
[ 0.000000] * 0xff7e0000..0xff800000 : early ioremap
[ 0.000000] * 0xe1000000..0xff7e0000 : vmalloc & ioremap
[ 0.000000] NR_IRQS:512 nr_irqs:512 16
[ 0.000000] i8259 legacy interrupt controller initialized
[ 0.000581] clocksource: timebase mult[1e000005] shift[24] registered
[ 0.008751] Console: colour dummy device 80x25
[ 0.009941] pid_max: default: 32768 minimum: 301
[ 0.010588] Security Framework initialized
[ 0.014226] AppArmor: AppArmor disabled by boot time parameter
[ 0.014266] Yama: disabled by default; enable with sysctl kernel.yama.*
[ 0.015430] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.015457] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.025745] Initializing cgroup subsys memory
[ 0.026074] Initializing cgroup subsys devices
[ 0.026208] Initializing cgroup subsys freezer
[ 0.026302] Initializing cgroup subsys net_cls
[ 0.026393] Initializing cgroup subsys blkio
[ 0.026467] Initializing cgroup subsys perf_event
[ 0.026529] Initializing cgroup subsys net_prio
[ 0.026979] ftrace: allocating 17989 entries in 53 pages
[ 0.061317] MPC7450 family performance monitor hardware support registered
[ 0.081325] devtmpfs: initialized
[ 0.087127] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 0.094963] NET: Registered protocol family 16
[ 0.111697] PCI: Probing PCI hardware
[ 0.112838] PCI host bridge to bus 0000:00
[ 0.113276] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
[ 0.113363] pci_bus 0000:00: root bus resource [mem 0x80000000-0xbfffffff]
[ 0.113616] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 0.280689] PCI host bridge to bus 0001:01
[ 0.280744] pci_bus 0001:01: root bus resource [io 0xffff0000-0xffffffff] (bus address [0x0000-0xffff])
[ 0.280770] pci_bus 0001:01: root bus resource [mem 0xc0000000-0xdfffffff]
[ 0.280802] pci_bus 0001:01: root bus resource [bus 01-ff]
[ 0.323889] vgaarb: device added: PCI:0000:00:03.0,decodes=io+mem,owns=none,locks=none
[ 0.323940] vgaarb: loaded
[ 0.323980] vgaarb: bridge control possible 0000:00:03.0
[ 0.325428] SCSI subsystem initialized
[ 0.334704] Switched to clocksource timebase
[ 0.374372] NET: Registered protocol family 2
[ 0.379722] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.379809] TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.379986] TCP: Hash tables configured (established 4096 bind 4096)
[ 0.380806] TCP: reno registered
[ 0.380880] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 0.381007] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 0.383804] NET: Registered protocol family 1
[ 0.385802] pci 0000:00:0c.1: Fixing VIA IDE, force legacy mode on
[ 0.398965] Thermal assist unit not available
[ 0.400344] ------------[ cut here ]------------
[ 0.400369] WARNING: at /build/linux-aecbOt/linux-3.16.56/kernel/resource.c:638
[ 0.400426] Modules linked in:
[ 0.400684] CPU: 0 PID: 1 Comm: swapper Not tainted 3.16.0-6-powerpc #1 Debian 3.16.56-1+deb8u1
[ 0.400786] task: df81e800 ti: df83a000 task.ti: df83a000
[ 0.400806] NIP: c0047678 LR: c004765c CTR: 00000000
[ 0.400828] REGS: df83bd50 TRAP: 0700 Not tainted (3.16.0-6-powerpc Debian 3.16.56-1+deb8u1)
[ 0.400842] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 84000028 XER: 00000000
[ 0.400961]
[ 0.400961] GPR00: c004765c df83be00 df81e800 c06df368 c06df368 c05ce55c f1003fff c06df37c
[ 0.400961] GPR08: f1003fff 00000001 c06df368 00000000 84000084 00000000 c0004a8c 00000000
[ 0.400961] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c07435b0 0000006a
[ 0.400961] GPR24: c066f814 c0740000 df995420 c06e0000 c06e0000 c06df058 c06e1150 c06df368
[ 0.401146] NIP [c0047678] __insert_resource+0x4c/0x1b0
[ 0.401165] LR [c004765c] __insert_resource+0x30/0x1b0
[ 0.401210] Call Trace:
[ 0.401286] [df83be00] [c004765c] __insert_resource+0x30/0x1b0 (unreliable)
[ 0.401365] [df83be10] [c00484b8] insert_resource+0x1c/0x3c
[ 0.401378] [df83be20] [c03600f0] platform_device_add+0xb8/0x294
[ 0.401390] [df83be40] [c0360c24] platform_add_devices+0x48/0xac
[ 0.401401] [df83be60] [c0684230] mv643xx_eth_add_pds+0x40/0x15c
[ 0.401411] [df83be80] [c00043c0] do_one_initcall+0xd0/0x240
[ 0.401423] [df83bef0] [c067757c] kernel_init_freeable+0x15c/0x1f8
[ 0.401433] [df83bf30] [c0004ab0] kernel_init+0x24/0x114
[ 0.401444] [df83bf40] [c0015420] ret_from_kernel_thread+0x5c/0x64
[ 0.401494] Instruction dump:
[ 0.401586] 7c7e1b78 7c9f2378 7fc3f378 7fe4fb78 4bfffacd 7c6a1b79 7d49fa78 7f8af040
[ 0.401616] 7d290034 5529d97e 41820060 419e013c <0f090000> 2f890000 409e0154 810a0000
[ 0.401729] ---[ end trace 02ae67b83e3b81b5 ]---
[ 0.401870] platform orion-mdio: failed to claim resource 0
[ 0.403505] Device 'mv643xx_eth.0' does not have a release() function, it is broken and must be fixed.
[ 0.403537] ------------[ cut here ]------------
[ 0.403546] WARNING: at /build/linux-aecbOt/linux-3.16.56/drivers/base/core.c:250
[ 0.403552] Modules linked in:
[ 0.403622] CPU: 0 PID: 1 Comm: swapper Tainted: G W 3.16.0-6-powerpc #1 Debian 3.16.56-1+deb8u1
[ 0.403632] task: df81e800 ti: df83a000 task.ti: df83a000
[ 0.403639] NIP: c0358730 LR: c0358730 CTR: 00000000
[ 0.403648] REGS: df83bd50 TRAP: 0700 Tainted: G W (3.16.0-6-powerpc Debian 3.16.56-1+deb8u1)
[ 0.403653] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42008022 XER: 00000000
[ 0.403674]
[ 0.403674] GPR00: c0358730 df83be00 df81e800 0000005a 00000000 00000000 c0851650 000000a4
[ 0.403674] GPR08: c073143c c0730000 c073143c 000000a4 22008022 00000000 c0004a8c 00000000
[ 0.403674] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c07435b0 0000006a
[ 0.403674] GPR24: c066f814 c0740000 df995420 c06ba190 df995420 df97a040 c06df1e8 c06df1f0
[ 0.403746] NIP [c0358730] device_release+0xa8/0xb8
[ 0.403756] LR [c0358730] device_release+0xa8/0xb8
[ 0.403761] Call Trace:
[ 0.403770] [df83be00] [c0358730] device_release+0xa8/0xb8 (unreliable)
[ 0.403784] [df83be20] [c0273c34] kobject_cleanup+0xa4/0x1d8
[ 0.403796] [df83be40] [c0360c48] platform_add_devices+0x6c/0xac
[ 0.403807] [df83be60] [c0684230] mv643xx_eth_add_pds+0x40/0x15c
[ 0.403817] [df83be80] [c00043c0] do_one_initcall+0xd0/0x240
[ 0.403828] [df83bef0] [c067757c] kernel_init_freeable+0x15c/0x1f8
[ 0.403839] [df83bf30] [c0004ab0] kernel_init+0x24/0x114
[ 0.403848] [df83bf40] [c0015420] ret_from_kernel_thread+0x5c/0x64
[ 0.403854] Instruction dump:
[ 0.403861] 813f0148 2f890000 419e0010 81290020 2f890000 409effb0 809f0024 2f840000
[ 0.403880] 419e0018 3c60c061 38634bc4 481ac1b9 <0fe00000> 4bffff9c 809f0000 4bffffe8
[ 0.403913] ---[ end trace 02ae67b83e3b81b6 ]---
[ 0.407839] audit: initializing netlink subsys (disabled)
[ 0.408870] audit: type=2000 audit(1718315221.379:1): initialized
[ 0.413575] zbud: loaded
[ 0.414598] VFS: Disk quotas dquot_6.5.2
[ 0.414735] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[ 0.415886] msgmni has been set to 997
[ 0.432062] alg: No test for stdrng (krng)
[ 0.432341] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[ 0.432965] io scheduler noop registered
[ 0.433021] io scheduler deadline registered
[ 0.433338] io scheduler cfq registered (default)
[ 0.436891] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.441791] console [ttyS0] disabled
[ 0.465045] serial8250.0: ttyS0 at I/O 0x2f8 (irq = 0, base_baud = 115200) is a 16550A
[ 0.500401] console [ttyS0] enabled
[ 0.501238] pmac_zilog: 0.6 (Benjamin Herrenschmidt <benh@kernel.crashing.org>)
[ 0.501607] Serial: MPC52xx PSC UART driver
[ 0.501926] Generic non-volatile memory driver v1.1
[ 0.502528] Linux agpgart interface v0.103
[ 0.503897] Warning: no ADB interface detected
[ 0.505265] mousedev: PS/2 mouse device common for all mice
[ 0.507134] rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
[ 0.508021] ledtrig-cpu: registered to indicate activity on CPUs
[ 0.508774] TCP: cubic registered
[ 0.509146] NET: Registered protocol family 10
[ 0.515909] mip6: Mobile IPv6
[ 0.516196] NET: Registered protocol family 17
[ 0.516529] mpls_gso: MPLS GSO support
[ 0.517976] registered taskstats version 1
[ 0.526068] rtc-generic rtc-generic: setting system clock to 2024-06-13 21:47:02 UTC (1718315222)
[ 0.533644] List of all partitions:
[ 0.533855] No filesystem could mount root, tried:
[ 0.534188] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.534581] CPU: 0 PID: 1 Comm: swapper Tainted: G W 3.16.0-6-powerpc #1 Debian 3.16.56-1+deb8u1
[ 0.534964] Call Trace:
[ 0.535076] [df83bdc0] [c000966c] show_stack+0xf8/0x1b0 (unreliable)
[ 0.535337] [df83be10] [c0504124] panic+0xdc/0x240
[ 0.535539] [df83be70] [c06779f4] mount_root+0x0/0x68
[ 0.535746] [df83bed0] [c0677bb0] prepare_namespace+0x154/0x19c
[ 0.535982] [df83bef0] [c0677604] kernel_init_freeable+0x1e4/0x1f8
[ 0.536226] [df83bf30] [c0004ab0] kernel_init+0x24/0x114
[ 0.536439] [df83bf40] [c0015420] ret_from_kernel_thread+0x5c/0x64
[ 0.537083] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
When I try without the failsafe trick and blindly type:
boot hd:0 vmlinuz-3.16.0-6-powerpc root=/dev/sda2
(Not the easiest thing to type without seeing what you are writing.) It yields a "Segmentation fault" and the process is terminated:
PegasosII Boot Strap (c) 2002-2003 bplan GmbH
Running on CPU PVR:80020102
Enable L1 ICache... Done.
Clean/Flush Block enabled
Reading W83194 : FAILED.
Setting Front Side Bus to 133MHz... FAILED.
Configuring DDR... Done.
Configuring PCI0... Done.
Configuring PCI1... Done.
Configuring ETH... Done.
Releasing IDE reset ... Done.
Configuring Legacy Devices
Segmentation fault
Any ideas?
Edited by nikitas on 2024/6/13 20:16:34 Edited by nikitas on 2024/6/13 20:17:50
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/13 19:19
#57
|
Just popping in
|
@balaton I see, so, I'm sending you the results of this one:
sudo perf mem record -z --pid 5717
Converted with:
sudo perf mem report --stdio > perf-mem.data.txt
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
Posted on: 6/13 12:58
#58
|
Just popping in
|
Hello, @balaton I ran the perf mem command using the isolated core 13 like this:
sudo perf mem record -sz --call-graph=lbr -c 13
I generated reports with those commands:
sudo perf mem report --stdio > perf-mem.data.txt
And:
sudo perf inject -i perf.data --jit -o perf.data.jitted -f
sudo perf report -i ./perf.data.jitted -f --stdio > per-mem.data.jitted.txt
Is this what you asked for? Sent them via email.
|
|
|
|
Re: Qemu + VFIO GPU RadeonRX 550 + AmigaOS4 extremely slow
|
|
Just popping in
|
@balaton At first, I was using a logical core that corresponded to an Intel E-Core instead of P-Core. I recompiled Qemu with:
../configure --enable-gcrypt --enable-modules --enable-module-upgrades --enable-vhost-user-blk-server --enable-libusb
I did the entire process again and I sent you the perf report via email. Don't know if it makes any more sense now. I read that the 13th Gen Intel® Core™ i5-13400 CPU supports LBR. I will try a Linux PPC distro with Qemu when I have time.
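Since the choice between P-cores and E-cores matters when pinning QEMU with `taskset`, it helps to check which logical CPU numbers belong to which core type. This is a minimal sketch, assuming a Linux host with a hybrid Intel CPU and a kernel that exposes the `cpu_core` and `cpu_atom` entries under `/sys/devices`; those paths are an assumption, and on other hosts the lists simply come back empty. The function names are illustrative.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Expand a kernel CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_core_type(name: str) -> list[int]:
    """Read /sys/devices/<name>/cpus, returning [] if the file is absent."""
    path = Path("/sys/devices") / name / "cpus"
    return parse_cpu_list(path.read_text()) if path.exists() else []


if __name__ == "__main__":
    print("P-core logical CPUs:", read_core_type("cpu_core"))
    print("E-core logical CPUs:", read_core_type("cpu_atom"))
```

Picking a CPU number from the first list for `taskset -c` should keep the single-threaded TCG workload on a performance core.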
|
|
|
|