Re: RadeonHD V.5 driver
Home away from home
@joerg

This PDF is what we need: the Power8 SIMD.

https://www.researchgate.net/publicati ... ith_compiler_optimization

Sadly, there is no SIMD on the P5020/P5040.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: RadeonHD V.5 driver
Just popping in
@arfcarl

Yes, it all looks great and it's where it should be.

The RX is working. The SATA cable was loose, and I did not see U-Boot 🙄... So now I can boot again; I just still don't know what the issue is.

Re: RadeonHD V.5 driver
Just popping in
OK. So, I disabled Compositing and now the system will boot; however, when I try to turn Compositing on, the computer freezes 😭.

EDIT: I disabled Synchronization with vertical refresh. NOW IT WORKS!

And there is power management and my R7 240 is passively cooled NICE.

EDIT: Not so nice. Candi doesn't work (freeze) and CroMagRally the same, freeze.

Re: RadeonHD V.5 driver
Home away from home
@joerg
Quote:
Using an endian-swapping version of GCC might help, like the one which was used for building native x86/x64 Amithlon software.

Only if all software is compiled the same way. Otherwise the drivers would still have to endian-convert all data coming from (or going to) apps and the rest of the system.

@PixelHi

Quote:

Is the lack of GART support for the Sam460 (whatever that is) a limitation of the Sam460, or is it just not as easy to implement as on the X1000/X5000?

In other words, will it ever work on the Sam460 with RadeonHD/RX?

It's a Sam460 limitation. The Sam460 doesn't have cache coherency. I tried doing manual cache flushing everywhere where it's needed, but couldn't get it working. It would always lock up.

It's unlikely that we'll get it working.

EDIT: That said, it may still be possible to get a performance boost. We've hit some kind of performance wall, even with GART support. When we finally figure out what's going on, there's a chance we'll get a boost even without GART.

Hans

http://hdrlab.org.nz/ - Amiga OS 4 projects, programming articles and more.
https://keasigmadelta.com/ - more of my work

Re: RadeonHD V.5 driver
Home away from home
@PixelHi

Quote:
OK. So, I disabled Compositing and now the system will boot; however, when I try to turn Compositing on, the computer freezes 😭.

EDIT: I disabled Synchronization with vertical refresh. NOW IT WORKS!

And there is power management and my R7 240 is passively cooled NICE.

EDIT: Not so nice. Candi doesn't work (freeze) and CroMagRally the same, freeze.

This is with the Radeon R7 240? If so, there's a fix coming. I still have a bit of work to do before the beta testers can do their final testing.

Hans

http://hdrlab.org.nz/ - Amiga OS 4 projects, programming articles and more.
https://keasigmadelta.com/ - more of my work

Re: RadeonHD V.5 driver
Just popping in
@Hans

Yes, it's with the R7 240. I will try the 250 and the 7750 as well, since now at least I can use it 😁. It looks like the RX 550 works better at the moment, but I think I've settled on the R7 240 as more universal. I might try Linux on it.

Good old Amiga days 😏

Hans, do you ever ask yourself why you do it?

Re: RadeonHD V.5 driver
Just can't stay away
@Hans

About Sam460ex cache coherency:
I don't know anything about that, and surely you've tried everything possible, but while searching for the PPC460EX datasheet I found this:
...
L2 Cache/SRAM
The PPC460EX also provides a 256KB L2 cache between the Processor Local Bus and the processors D- and I-caches. This memory unit can be alternatively programmed to function as 256KB of SRAM.
Features include:
*Four banks of 64KB each
...
*Use as an L2 cache improves processor performance and reduces the PLB load
-Cache coherency maintained by a hardware snoop mechanism on the Low Latency (LL) Processor Local Bus (PLB) or by software
...

Re: RadeonHD V.5 driver
Just popping in
@joerg

Quote:
Using an endian-swapping version of GCC might help, like the one which was used for building native x86/x64 Amithlon software.


I wasn't aware of a specific endian-swapping version of GCC. There was an Intel compiler to assist with compiling big-endian-dependent code on x86.

Also, speaking of the Amithlon compiler, where are the programs compiled for it? I thought Aminet would be full of it, but I can't find them anymore. There's more OS4 software on Aminet!

Re: RadeonHD V.5 driver
Just popping in
@LiveForIt

Quote:
I tried using the reverse load and store instructions and was disappointed with the results. If I remember correctly, the results were as poor as with code that masks the bits, then shifts and ORs the result. It might be that I don't have the instructions, or maybe there is some trap/interrupt bug where someone forgot to check which CPUs have the instructions and which don't. I'm not sure how that works.


The instructions should be fine, as only one is needed to read or write directly from memory, or directly through a pointer register on PPC.

I did see an issue with the GCC endian swap macros. The problem is that they are optimised for x86, which has both a register swap and a memory read/write swap, or at least they are optimised for swapping data held in a variable. PPC only has the memory read/write swap. So on PPC the macro wants to swap data that is already in a register, and PPC doesn't support that. The code then moves bytes around and ends up as a mess. It really shouldn't be that bad, and although PPC has a neat all-in-one rotate and mask instruction, it can't seem to swap bytes quickly. It looks like an afterthought. On 68K it's a rol, swap, rol. On x86 it's a bswap.

This is ridiculous really, as PPC can do it natively, though I'm not sure if only some cores support it. Endianness is a memory issue at its core. If the macros were designed to read/write through a pointer instead, where the real action is, then they would work well for PPC, as it could use the instruction to load reversed from a pointer.
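
To illustrate the difference between swapping in a register and swapping at the pointer, here is a minimal C sketch. The function names (`swap32_reg`, `load_le32`, `store_le32`) are made up for this example and are not the GCC macros discussed above. The assumption is that a compiler targeting PPC may match the byte-wise load/store pattern to a byte-reversed load/store such as lwbrx/stwbrx; whether it actually does depends on the compiler version and flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Register-style swap: the value is already loaded, so the bytes
 * have to be shuffled with masks and shifts (or a bswap on x86). */
uint32_t swap32_reg(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

/* Pointer-style load: read a little-endian 32-bit value from memory.
 * The result is the same on big- and little-endian hosts. */
uint32_t load_le32(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return  (uint32_t)b[0]        |
           ((uint32_t)b[1] <<  8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Pointer-style store: write a 32-bit value to memory as little-endian. */
void store_le32(void *p, uint32_t v)
{
    uint8_t *b = (uint8_t *)p;
    b[0] = (uint8_t)(v);
    b[1] = (uint8_t)(v >> 8);
    b[2] = (uint8_t)(v >> 16);
    b[3] = (uint8_t)(v >> 24);
}
```

With the pointer-style helpers the swap happens as part of the memory access, so a caller never ends up holding a wrong-endian value in a register that then needs a separate swap.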


Edited by Hypex on 2023/2/8 12:18:00

Re: RadeonHD V.5 driver
Just popping in
@Hans

Quote:
It's a Sam460 limitation. The Sam460 doesn't have cache coherency. I tried doing manual cache flushing everywhere where it's needed, but couldn't get it working. It would always lock up.


That makes me wonder whether Linux also can't use HW acceleration with the Sam graphics drivers. The A1/XE had the same limitation, so Linux was slower to use. I actually did enable it to test, and it ran well and was smooth on screen. Unfortunately, it only lasted a few moments until the system froze. As I recall, this was said to be related to the ring buffer and issues with the DMA engine.

Re: RadeonHD V.5 driver
Just can't stay away
@Hypex
Quote:
I wasn't aware of a specific endian-swapping version of GCC. There was an Intel compiler to assist with compiling big-endian-dependent code on x86.
https://aminet.net/package/dev/gcc/x86-ami-gcc

Quote:
Also, speaking of the Amithlon compiler, where are the programs compiled for it? I thought Aminet would be full of it, but I can't find them anymore. There's more OS4 software on Aminet!
http://amithlon.aminet.net/tree
Only seems to work if you enable architecture filtering and i386-amithlon in setup, but there isn't much anyway.


Edited by joerg on 2023/2/7 16:37:34
Edited by joerg on 2023/2/7 16:40:18

Re: RadeonHD V.5 driver
Home away from home
@Hypex
Quote:
That makes me wonder whether Linux also can't use HW acceleration with the Sam graphics drivers. The A1/XE had the same limitation, so Linux was slower to use. I actually did enable it to test, and it ran well and was smooth on screen. Unfortunately, it only lasted a few moments until the system froze. As I recall, this was said to be related to the ring buffer and issues with the DMA engine.

The ring buffer is likely the first place where you'd hit problems with lack of cache coherency. However, if you're using an SI card or newer, then the problem could also be that their drivers can't handle big-endian.

IIRC, ACube did have a Linux driver that worked with GART enabled. I don't know for which cards, or how it worked.

I can think of one way to get GART working, and that's to mark all memory used for GART as non-cacheable (or disable the data cache entirely). The L2 cache might need to be disabled too. Needless to say, doing so would come with a serious performance penalty.

Hans

http://hdrlab.org.nz/ - Amiga OS 4 projects, programming articles and more.
https://keasigmadelta.com/ - more of my work

Re: RadeonHD V.5 driver
Amigans Defender
https://linuxppc-dev.ozlabs.narkive.co ... he-support-440gx-460ex-gt

There are Linux patches to enable cache coherency on the L2. Maybe you can take a look there.

i'm really tired...

Re: RadeonHD V.5 driver
Quite a regular
@afxgroup

Interesting read:

Latest post at the bottom of the page mentions the following:
Quote:
The L2 cache on the 440GX is cache coherent (via snooping). On the
440SP/440SPe the L2 cache is partially coherent. The LL (Low Latency) PLB
segment is coherent and the HB (High Bandwidth) PLB segment is unfortunately
not. Here an excerpt from the 440SPe users manual:

"
Cache coherency is limited to the Low Latency (LL) PLB bus and is managed by a
hardware snoop mechanism or software (software that is similar to the
existing CPU L1 cache)
"

So we will need to add something to handle the L2 cache on those platforms
correctly. Not needed on 440GX though.

As for 460EX/GT this is currently not clear yet. I'm working on it with AMCC
right now.


In the meantime, things have become clear for the PPC460EX, because this is what the PPC460EX datasheet says:

Quote:

Use as an L2 cache improves processor performance and reduces the PLB load
– Cache coherency maintained by a hardware snoop mechanism on the Low Latency (LL) Processor Local
Bus (PLB) or by software


No other mention of coherency. So still no cache coherency for the HB PLB where PCIe and DDR are located.

Re: RadeonHD V.5 driver
Amigans Defender
The problem is that the post is 15 years old... Maybe someone can contact the guy to see if anything was clarified in the end.

i'm really tired...

Re: RadeonHD V.5 driver
Just popping in
@geennaam

AFAIK, under AOS4.1 DDR and PCIE are located on the LL PLB segment (and mirrored on the HB) since it uses the settings from U-Boot:

#if defined(CONFIG_440SP) || defined(CONFIG_440SPE) || \
    defined(CONFIG_460EX) || defined(CONFIG_460GT) || \
    defined(CONFIG_460SX)
    /*
     * Enable high bandwidth access
     * This is currently not used, but with this setup
     * it is possible to use it later on in e.g. the Linux
     * EMAC driver for performance gain.
     */
    mtdcr(SDRAM_PLBADDULL, 0x00000000); /* MQ0_BAUL */
    mtdcr(SDRAM_PLBADDUHB, 0x00000008); /* MQ0_BAUH */

Max Tretene, ACube Systems Srl, Soft3

Re: RadeonHD V.5 driver
Quite a regular
@m3x

Yes, the DDR controller has an LL port as well, probably to allow low-speed transfers between LL peripherals and DDR without slowing down the high-speed segment. But according to the datasheet, there's no direct connection between PCIe and the LL segment, so those transfers are not snooped.

I can't find a programming manual for the 460EX, so I cannot comment on those U-Boot registers.

However, I can turn the question around. If everything is indeed configured to be routed over the LL bus, then why is cache coherency not working for DDR? LL transfers are snooped according to the datasheet.

My guess would be that the PLB can be split into an LL and an HB segment, but that this can also be turned off. In that case there is just a common PLB4, and no snooping at all.

A second guess would be that only transfers to/from LL peripheral address ranges are snooped, and not all traffic that is routed over the LL segment.

Re: RadeonHD V.5 driver
Just can't stay away
@geennaam
No idea about GART, but for example SATA, ethernet and USB drivers do work with DMA on SAM440/460 systems, i.e. there is no complete cache coherency failure like it used to be on the A1-SE and partially A1-XE, and Pegasos-1 systems.

Re: RadeonHD V.5 driver
Quite a regular
@joerg

DMA works and is not the issue.
Cache coherency is the issue. My sound card driver didn't work on the Sam460 at first.

Analysis of DMA buffers revealed that cache was not coherent with the actual data in memory.

So I had to flush caches manually to make it work.

For the X5000 it just works without manual cache handling.
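
As a rough illustration of why manual flushing is needed when nothing snoops the DMA traffic, here is a toy C model. It is only a sketch: `ToySystem` and all the function names are invented for this example, the "cache" is a single write-back line, and the 32-byte line size is an assumption. The flush and invalidate helpers stand in for what instructions like dcbf and dcbi do on PowerPC; this is not the actual driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u  /* toy cache-line size, assumed for this model */

/* One write-back cache line in front of a tiny "RAM". */
typedef struct {
    uint8_t ram[LINE_SIZE];   /* what a DMA device sees */
    uint8_t line[LINE_SIZE];  /* what the CPU sees while the line is valid */
    int     valid;
    int     dirty;
} ToySystem;

void toy_init(ToySystem *s)
{
    memset(s, 0, sizeof *s);
}

/* Fill the line from RAM if the CPU does not hold it yet. */
static void toy_fill(ToySystem *s)
{
    if (!s->valid) {
        memcpy(s->line, s->ram, LINE_SIZE);
        s->valid = 1;
        s->dirty = 0;
    }
}

uint8_t cpu_read(ToySystem *s, unsigned off)
{
    assert(off < LINE_SIZE);
    toy_fill(s);
    return s->line[off];
}

void cpu_write(ToySystem *s, unsigned off, uint8_t v)
{
    assert(off < LINE_SIZE);
    toy_fill(s);
    s->line[off] = v;   /* stays in the cache until flushed */
    s->dirty = 1;
}

/* Flush: write dirty data back to RAM, then drop the line. */
void cache_flush(ToySystem *s)
{
    if (s->valid && s->dirty)
        memcpy(s->ram, s->line, LINE_SIZE);
    s->valid = 0;
    s->dirty = 0;
}

/* Invalidate: drop the line without writing it back. */
void cache_invalidate(ToySystem *s)
{
    s->valid = 0;
    s->dirty = 0;
}

/* Device accesses go straight to RAM and are not snooped. */
uint8_t dma_read(const ToySystem *s, unsigned off)
{
    assert(off < LINE_SIZE);
    return s->ram[off];
}

void dma_write(ToySystem *s, unsigned off, uint8_t v)
{
    assert(off < LINE_SIZE);
    s->ram[off] = v;
}
```

In this model a `cpu_write()` followed by `dma_read()` returns the old RAM contents until `cache_flush()` is called, and a `dma_write()` is invisible to `cpu_read()` until `cache_invalidate()` is called. With hardware snooping, both steps would happen automatically.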

Re: RadeonHD V.5 driver
Just can't stay away
@geennaam
Quote:
DMA works and is not the issue.
Cache coherency is the issue.
DMA (without using manual cache flushes/invalidates, which were required for example on the A1-SE to get anything working at all, but which aren't used by DMA drivers on SAM440/460) can only work if there is working cache coherency.

Edit:
Maybe there is a PCIe-only problem? SATA, and probably USB and ethernet as well, are PCI, not PCIe controllers.
