Re: Pegasos2 with RadeonHD/RX via bridge
@joerg
Quote:

I guess what Hans meant with "Looks like the driver's PEX bridge code needs updating." is bridge code in his Radeon HD/RX drivers, not something in the AmigaOS kernel.


Yes, it seems you are right: I just searched the binaries of the debug versions of the RadeonRX and RadeonHD drivers and found these strings in both of them:

Enabling blind prefetch on the PEX 8112 bridge
 Cannot enable blind prefetch 
for pci:%ld.%ld,%ldbecause this device doesn't support it


And the same for 8111.

So it seems the RadeonRX/HD drivers do some configuration magic when they detect an 8112 or 8111 bridge, but judging by the gfxbench results for the 8112, something looks wrong there. How else can we explain such a difference from the Pericom bridge, which ran with its default configuration and was much faster in the copy from RAM to VRAM tests, and 2 times faster in the copy from VRAM to RAM tests?


P.S. Looking at the Pericom datasheet, I found this:

6.3.20 PREFETCHABLE MEMORY BASE REGISTER – OFFSET 24h
6.3.21 PREFETCHABLE MEMORY LIMIT REGISTER – OFFSET 24h
6.3.22 PREFETCHABLE BASE UPPER 32-BIT REGISTER – OFFSET 28h
6.3.23 PREFETCHABLE LIMIT UPPER 32-BIT REGISTER – OFFSET 2Ch
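
For anyone who wants to sanity-check the values a bridge dump tool reports for these registers, here is a minimal sketch in C of how the prefetchable window is decoded. It assumes the standard PCI-to-PCI bridge (type 1) header layout, where the base and limit are the low and high 16-bit halves of the dword at 24h; the Pericom datasheet itself should be checked before relying on this. The struct and function names are hypothetical and not taken from any driver.

```c
#include <stdint.h>

/* Decoded prefetchable window of a PCI-to-PCI bridge (type 1 header). */
struct pf_window {
    uint64_t start;    /* first address forwarded downstream */
    uint64_t end;      /* last address forwarded downstream (inclusive) */
    int      is_64bit; /* 1 if the upper 32-bit registers are implemented */
};

/* base, limit:       16-bit values at offsets 24h and 26h.
 * base_hi, limit_hi: 32-bit values at offsets 28h and 2Ch. */
static struct pf_window decode_pf_window(uint16_t base, uint16_t limit,
                                         uint32_t base_hi, uint32_t limit_hi)
{
    struct pf_window w;

    /* A low nibble of 1h means 64-bit addressing is supported. */
    w.is_64bit = (base & 0x000F) == 0x1;

    /* Bits 15:4 of each register hold address bits 31:20 (1 MiB steps). */
    w.start = (uint64_t)(base & 0xFFF0) << 16;
    w.end   = ((uint64_t)(limit & 0xFFF0) << 16) | 0xFFFFF;

    if (w.is_64bit) {
        w.start |= (uint64_t)base_hi << 32;
        w.end   |= (uint64_t)limit_hi << 32;
    }
    return w;
}
```

If the decoded `start` is greater than `end`, the bridge is not forwarding any prefetchable range at all, which would be worth ruling out before tuning anything else.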


Maybe, if the default settings of the Pericom bridge aren't good enough, we can speed it up as well. Copy from RAM to VRAM is already more or less OK and will not be noticeable in everyday use, but copy from VRAM to RAM definitely needs improving, at least to bring it to some sane level.

Join us to improve dopus5!
AmigaOS4 on youtube
Re: Pegasos2 with RadeonHD/RX via bridge
@kas1e

Yes, the drivers have code to reprogram PEX PCI-to-PCIe buses.

The Pericom bridge also has other registers controlling how data prefetching is done.

Hans

Join the Kea Campus - upgrade your skills; support my work; enjoy the Amiga corner.
https://keasigmadelta.com/ - see more of my work
Re: Pegasos2 with RadeonHD/RX via bridge
@everyone
Does anyone have a clue how to check whether Debian for Pegasos2 supports GART for the old Radeon 9250? What we need to know is whether Linux enables full memory coherence (which is needed for GART). If it does, we can check how it is done and then do the same to make it work on OS4 via the bridge too (which would later help QEMU users as well).

Currently, the Radeon drivers running on the Peg2 via the bridge say:

Quote:

RadeonRX (4): GART: num cpu pages 786432, num gpu pages 786432 (3072 MB
RadeonRX (0): Platform doesn't have full memory coherence, Disabling GART.


The drivers then redirect GART allocations to VRAM.
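
As a quick cross-check of the numbers in that log line, the page count and the "3072 MB" figure agree if the pages are 4 KiB each. The page size is an assumption here (the log does not state it), and the helper name is hypothetical:

```c
#include <stdint.h>

/* Size in MiB of a GART that covers num_pages pages of page_size bytes. */
static uint64_t gart_size_mib(uint64_t num_pages, uint64_t page_size)
{
    return (num_pages * page_size) / (1024ULL * 1024ULL);
}
```

Calling `gart_size_mib(786432, 4096)` gives 3072, matching the driver output.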

So we need to know how to enable full memory coherence on the Pegasos2 so that GART will work. While it will be of no help with ReadPixelArray() or WritePixelArray(), it will still improve things in some other areas.

If anyone can point out how to do it, or at least how to check whether the latest Debian for Pegasos2 supports full memory coherence for the old Radeon 9250, that would be helpful.

Thanks!

Re: Pegasos2 with RadeonHD/RX via bridge
@all
To give a better idea of how the gfxbench results look right now:

Copy To/From RAM/VRAM: [2 images: gfxbench charts]

FillRect: [1 image: gfxbench chart]

Blit: [2 images: gfxbench charts]

Composite: [2 images: gfxbench charts]


As can be seen, with the current state of things (i.e. default settings and the first working version) we can conclude that:

1). The Pericom bridge with the PI7C9X111SL chipset performs about 50% better in almost all operations than the PLX one with the PEX8112-AA66BI F chipset.

This may be down to the default register settings and to the fact that the Pericom chipset reports "Fast back-to-back capable" in PCI_Status (as shown by the DumpBridge tool), which the PLX one does not. It remains to be seen whether both bridges can be configured better (from the RadeonHD/RX driver code) to get better results. As of now, it seems the Radeon driver can at least detect the PLX 8112 bridge, but prefetching is not set up correctly.

2). Copy from VRAM back to RAM is awfully slow on both bridges. It is about 20 times (!) slower than with the Radeon 9250. Yes, the Pericom one is again 2 times better, but it is still too slow.

3). We can also see that all small operations, such as 16x16, 32x32, 64x64 and 128x128, are slower than on the older Radeon 9250, and slower by a very wide margin. This was already the case for RadeonRX, even compared with RadeonHD.

4). On the other hand, all operations from 256x256 upwards in some cases, and in general from 512x512 upwards, are much faster, and the advantage grows a lot with each doubling of size. 1024x1024 sometimes gives a 10x boost in some operations.
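
Regarding the "Fast back-to-back capable" flag mentioned in point 1: here is a small sketch in C of how that flag is read from a raw PCI Status value. It assumes the standard bit layout of the 16-bit Status register at config offset 06h; a bridge also has a Secondary Status register at 1Eh with the same layout. The macro and function names are hypothetical, and how DumpBridge prints the raw value is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selected bits of the 16-bit PCI Status register (config offset 06h). */
#define STATUS_CAP_LIST      (1u << 4)  /* capabilities list present */
#define STATUS_66MHZ         (1u << 5)  /* 66 MHz capable */
#define STATUS_FAST_B2B_CAP  (1u << 7)  /* fast back-to-back capable */

/* True if the device advertises fast back-to-back transactions. */
static bool pci_fast_b2b_capable(uint16_t status)
{
    return (status & STATUS_FAST_B2B_CAP) != 0;
}
```

Comparing this bit between the two bridges, on both the primary and secondary side, would at least confirm whether the difference reported by DumpBridge is real.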

All of this can explain why, while we have no problems with VA video playback or with Spencer giving 30-40 FPS at maximum detail, we still suffer from small micro-pauses when scrolling icons in Workbench windows via the scrollbar (reads from VRAM to RAM probably play a role there).

Another thing is that we currently don't have GART enabled on the Pegasos2 either, because the drivers say the platform doesn't have full memory coherence, and without that GART can't work. If it did work, it would give us another boost. Not in WritePixelArray/ReadPixelArray of course (only bridge configuration may help there), but in all the other operations.

So, how can we check whether full memory coherence is enabled in Linux on the Pegasos2 with a Radeon 9250? If we can find out whether full memory coherence is possible on the Peg2, we can have GART there, and it will help QEMU users too.


Edited by kas1e on 2024/8/4 8:45:09
Re: Pegasos2 with RadeonHD/RX via bridge
@kas1e
It's not only the bridge, for example your copyFromVRAM results with the 9250 AGP card are faster than a X5000 with RadeonRX https://www.hdrlab.org.nz/benchmark/gf ... 2d/OS/AmigaOS/Result/2348
Copy from VRAM 40.33 MiB/s
and a X1000 with a RadeonHD https://www.hdrlab.org.nz/benchmark/gf ... 2d/OS/AmigaOS/Result/1551
Copy from VRAM 41.77 MiB/s

In general it looks like it's faster on the very old R100/R200 Radeon gfx cards than on HD/RX ones, maybe because with those ancient gfx cards other OSes used CPU-based accesses instead of GART as well, and they were better optimized for that than the newer gfx cards.

Re: Pegasos2 with RadeonHD/RX via bridge
@joerg
Yep, it looks like this is a problem of the RadeonHD/RX cards, but at least 40 MiB/s is not 2 MiB/s as with the bridges :) And that is probably not an issue of using 33 MHz PCI, but of the bridge configuration? It is just too slow.

It would also be interesting to know whether on other systems (Windows, Linux) copy from VRAM is as awfully slow as on ours. And the same for 16x16, 32x32, 64x64 and 128x128 copies: is the Radeon 9250 faster than RadeonHD/RX in those operations even on Windows?

Re: Pegasos2 with RadeonHD/RX via bridge
Crazy ChatGIPITTY response : 😁
The Accelerated Graphics Port (AGP) architecture in a Linux system is designed to provide a high-speed point-to-point channel for attaching a video card to a computer's motherboard, primarily to assist in the acceleration of 3D graphics. Here's an overview of the architecture:

### 1. **AGP Overview**
AGP is a dedicated interface between the video card and the motherboard, allowing the graphics card to directly access the system memory. This is crucial for efficient rendering of graphics as it allows textures and other graphic data to be transferred quickly.

### 2. **Key Components of AGP in Linux**

#### 2.1 AGP Bridge
The AGP bridge is a hardware interface that connects the AGP slot on the motherboard to the system’s northbridge. It manages the communication between the CPU, system memory, and the AGP card.

#### 2.2 Graphics Address Remapping Table (GART)
GART is a crucial feature of AGP that allows the video card to directly access the system memory. It handles the mapping of the AGP address space to the physical address space, enabling the video card to use system memory as texture memory.
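
To make the remapping idea concrete, here is a minimal user-space model in C of what a GART lookup does: an address inside the aperture is split into a page index and an offset, and the table supplies the physical page. This is only an illustration of the concept, not driver code; the struct, the function name and the 4 KiB page size are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define GART_PAGE_SHIFT 12u                     /* assumes 4 KiB pages */
#define GART_PAGE_SIZE  (1u << GART_PAGE_SHIFT)

/* Toy model of a remapping table. */
struct gart {
    uint64_t        aperture_base; /* start of the aperture as seen by the GPU */
    size_t          num_entries;   /* number of pages the table covers */
    const uint64_t *entries;       /* entries[i] = physical address of page i */
};

/* Translate a GPU-side address to a physical address.
 * Returns 0 on success, -1 if the address is outside the aperture. */
static int gart_translate(const struct gart *g, uint64_t dev_addr,
                          uint64_t *phys)
{
    if (dev_addr < g->aperture_base)
        return -1;

    uint64_t off = dev_addr - g->aperture_base;
    uint64_t idx = off >> GART_PAGE_SHIFT;
    if (idx >= g->num_entries)
        return -1;

    /* Physical page from the table plus the offset within that page. */
    *phys = g->entries[idx] + (off & (GART_PAGE_SIZE - 1));
    return 0;
}
```

The point of the table is that the pages behind a contiguous aperture can be scattered anywhere in system RAM, which is why the CPU and GPU must agree on the contents of those pages (the coherence question raised earlier in the thread).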

#### 2.3 AGP Driver
In Linux, the AGP driver is responsible for initializing and managing the AGP bridge. The main AGP driver is typically located at `/usr/src/linux/drivers/char/agp/`. It interacts with the hardware to set up the AGP bridge, configure the aperture, and manage the GART.

#### 2.4 Kernel Modules
Several kernel modules are involved in AGP support in Linux:
- `agpgart`: This is the core AGP support module.
- `intel-agp`, `nvidia-agp`, etc.: These are chipset-specific AGP drivers that handle specific hardware configurations.

### 3. **AGP Architecture in Linux**

#### 3.1 Initialization
- **Kernel Configuration**: The AGP support is enabled in the kernel configuration (`CONFIG_AGP=y`).
- **Module Loading**: The `agpgart` module and appropriate chipset-specific module (e.g., `intel-agp`) are loaded.

#### 3.2 GART Setup
- **Aperture Configuration**: The AGP driver configures the aperture size and base address.
- **Mapping**: GART maps the AGP address space to the physical system memory, allowing the video card to access large blocks of memory efficiently.

#### 3.3 Memory Management
- **Page Tables**: The AGP driver maintains page tables to keep track of memory mappings.
- **Caching and Coherence**: The driver ensures memory coherence by managing cache consistency between the CPU and the GPU.

#### 3.4 Communication
- **DMA Transfers**: The AGP interface allows Direct Memory Access (DMA) transfers, enabling the video card to read and write system memory without CPU intervention.
- **Interrupt Handling**: The AGP driver handles interrupts generated by the AGP device, facilitating efficient communication between the CPU and the GPU.

### 4. **AGP Driver Components**

#### 4.1 Core AGP Driver (`agpgart`)
- Initializes the AGP bridge.
- Configures the GART.
- Manages memory mappings and aperture.

#### 4.2 Chipset-Specific Drivers
- Handle chipset-specific initialization and configuration.
- Examples include `intel-agp`, `amd-agp`, `nvidia-agp`.

#### 4.3 DRM (Direct Rendering Manager)
- Works in conjunction with the AGP driver to provide direct rendering capabilities.
- Manages GPU resources and provides an interface for user-space applications to interact with the GPU.

### 5. **AGP in Xorg**
- **AGP Support in Xorg**: The Xorg server utilizes AGP for efficient rendering. The Xorg logs will show AGP card detection and initialization details.
- **DRI (Direct Rendering Infrastructure)**: Enables direct access to the graphics hardware under the X Window System in a safe and efficient manner.

### 6. **Typical Workflow**

1. **Kernel Boot**: During the boot process, the kernel initializes the AGP subsystem.
2. **Module Loading**: The `agpgart` and chipset-specific AGP modules are loaded.
3. **GART Configuration**: The AGP driver configures the GART, setting up the aperture and mappings.
4. **Xorg Initialization**: When Xorg starts, it initializes the AGP card and configures the rendering pipeline.
5. **Rendering**: Applications utilize the AGP interface via the Direct Rendering Manager to render graphics efficiently.

### 7. **Troubleshooting**

- **dmesg and Xorg Logs**: Checking these logs can provide insight into any issues during AGP initialization or usage.
- **Kernel Configuration**: Ensuring that the kernel is correctly configured for AGP support is crucial.
- **Module Compatibility**: Ensuring that the correct AGP and chipset-specific modules are loaded.

### Conclusion

AGP architecture in a Linux system is a complex interaction between hardware and software components designed to provide efficient graphics rendering. Understanding the role of each component and how they interact can help in configuring and troubleshooting AGP-related issues.

Here's a diagram to reflect the architecture of AGP in a Linux system:

```plaintext
+----------------------------+
|        User Space          |
|                            |
| +----------------------+   |
| |      Xorg Server     |   |
| |                      |   |
| | +------------------+ |   |
| | | Direct Rendering | |   |
| | | Infrastructure   | |   |
| | +------------------+ |   |
| +----------------------+   |
|                            |
+------------|---------------+
             |
+------------|--------------------------------+
|         Kernel Space                        |
|                                             |
| +------------------+     +---------------+  |
| |  DRM Subsystem   |<--->|   AGP Driver  |  |
| | (Direct Rendering|     |   (agpgart)   |  |
| |    Manager)      |     +-------|-------+  |
| +--------|---------+             |          |
|          |                       |          |
| +--------v----------------+  +---v--------+ |
| |Chipset-Specific AGP     |  |  GART      | |
| |Drivers (e.g., intel-agp,|  |  (Graphics | |
| |amd-agp, nvidia-agp)     |  |  Address   | |
| +--------|----------------+  |  Remapping | |
|          |                   |  Table)    | |
| +--------v----------------+  +------------+ |
| |   AGP Bridge            |                 |
| +--------|----------------+                 |
|          |                                   |
+----------v-----------------------------------+
             |
+------------|--------------------------------+
|         Hardware                             |
|                                              |
| +--------------------+   +----------------+  |
| |    System Memory   |<--| AGP Graphics   |  |
| |                    |   |  Card (e.g.,   |  |
| |                    |   |  Radeon 9250)  |  |
| +--------------------+   +----------------+  |
|                                              |
+----------------------------------------------+
```

### Diagram Explanation:

1. **User Space**:
   - **Xorg Server**: The Xorg server is responsible for handling graphical rendering in user space.
   - **Direct Rendering Infrastructure (DRI)**: Part of Xorg that allows direct access to the graphics hardware, facilitating efficient rendering.

2. **Kernel Space**:
   - **DRM Subsystem**: The Direct Rendering Manager handles GPU resource management and provides an interface for user-space applications to interact with the GPU.
   - **AGP Driver (agpgart)**: The core driver that initializes and manages the AGP bridge and GART.
   - **Chipset-Specific AGP Drivers**: These drivers handle the initialization and management of AGP for specific chipsets (e.g., `intel-agp`, `amd-agp`, `nvidia-agp`).
   - **GART (Graphics Address Remapping Table)**: Handles the mapping of AGP address space to physical memory, allowing the GPU to access system memory.

3. **Hardware**:
   - **AGP Bridge**: The hardware interface that connects the AGP slot on the motherboard to the system’s northbridge, managing communication between the CPU, system memory, and the AGP card.
   - **System Memory**: The main memory of the computer that the GPU can access via the GART.
   - **AGP Graphics Card**: The video card that utilizes the AGP interface for high-speed communication with the system memory (e.g., Radeon 9250).

This diagram provides a visual representation of the interaction between different components involved in AGP architecture in a Linux system, from user space applications down to the hardware level.


Sure, here's a detailed guide on what to look for in each step:

1. **Check Kernel Configuration**:
   - **Command**: `grep -iE 'agp|gart|radeon|CONFIG_DRM=' /boot/config-$(uname -r)`
   - **What to Look For**:
     - `CONFIG_AGP=y`
     - `CONFIG_AGP_AMD64=y`
     - `CONFIG_DRM=y`
     - `CONFIG_DRM_RADEON=y`
     - `CONFIG_GART_IOMMU=y`
   - These entries indicate that AGP (which GART is part of) and Radeon support are compiled into the kernel. Note that `CONFIG_AGP_AMD64` and `CONFIG_GART_IOMMU` are x86-64 options, so they will not appear in a PowerPC kernel config such as the one for the Pegasos II.

2. **Verify Kernel Modules**:
   - **Command**: `lsmod | grep 'agp\|radeon'`
   - **What to Look For**:
     - `agpgart`
     - `radeon`
     - `drm`
     - `drm_kms_helper`
   - These modules being loaded indicates that AGP support and the Radeon DRM driver are active.

3. **Inspect dmesg Output**:
   - **Command**: `dmesg | grep -i 'gart\|radeon'`
   - **What to Look For**:
     - Lines indicating the initialization of the AGP bridge:
       ```
       [   1.234567] agpgart: Detected AMD GART aperture
       [   1.234567] agpgart: AGP aperture is 256M @ 0xd0000000
       ```
     - Lines showing the Radeon driver initialization:
       ```
       [   2.345678] [drm] radeon kernel modesetting enabled.
       [   2.345678] [drm] GART: num cpu pages 65536, num gpu pages 65536
       ```
     - Look for any errors or warnings related to AGP or Radeon.

4. **Review Xorg Configuration**:
   - **Command**: `grep -i 'gart\|radeon' /var/log/Xorg.0.log`
   - **What to Look For**:
     - Entries indicating the use of AGP:
       ```
       (II) RADEON(0): AGP card detected
       (II) RADEON(0): AGP 4x mode enabled
       ```
     - Entries showing successful initialization of the Radeon driver:
       ```
       (II) RADEON(0): Direct rendering enabled
       (II) RADEON(0): GART: num cpu pages 65536, num gpu pages 65536
       ```

5. **Check Radeon DRM Driver Documentation**:
   - **Location**: `/usr/src/linux/Documentation/gpu`
   - **What to Look For**:
     - Look for documentation files such as `radeon.rst`, `drm-mm.rst`, and others that mention GART support and memory coherence.
     - Specifically, any notes on enabling AGP GART or specific configuration options.

6. **Test Memory Coherence**:
   - This step requires custom code or tools. Typically, you would look for tools that can benchmark memory access patterns to verify coherence.
   - **Example tool**: `memtester`
   - **Command**: `memtester 1024 5` (where `1024` is the amount of memory to test in MB, and `5` is the number of iterations)
   - **What to Look For**: `memtester` only exercises RAM from the CPU side, so a clean run shows that the memory itself is sound; it does not by itself prove coherence between the CPU and the GPU.

7. **Refer to Source Code**:
   - **Location**: The Radeon driver source code in the Linux kernel, usually found in the `drivers/gpu/drm/radeon/` directory.
   - **What to Look For**:
     - Look for function calls and definitions related to GART initialization and memory coherence.
     - Specifically, functions like `radeon_gart_init`, `radeon_gart_bind`, and `radeon_gart_unbind`.
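
The GART line from step 3 has the same shape as the one the OS4 driver prints, so the two can be compared mechanically. Below is a small sketch in C that pulls the page counts out of such a line; the function name is hypothetical, and the format string is taken from the log lines quoted in this thread rather than from any driver source.

```c
#include <stdio.h>
#include <string.h>

/* Extract the page counts from a log line containing
 * "GART: num cpu pages N, num gpu pages M".
 * Returns 1 on success, 0 if the line has no such entry. */
static int parse_gart_line(const char *line,
                           unsigned long *cpu_pages,
                           unsigned long *gpu_pages)
{
    const char *p = strstr(line, "GART: num cpu pages");
    if (!p)
        return 0;

    return sscanf(p, "GART: num cpu pages %lu, num gpu pages %lu",
                  cpu_pages, gpu_pages) == 2;
}
```

Feeding each line of saved `dmesg` output through this function is enough to see whether the Linux driver set up a GART at all, and how large it is.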

### Example Outputs
Here are examples of what the outputs might look like:

**Kernel Configuration**
```plaintext
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_DRM=y
CONFIG_DRM_RADEON=y
CONFIG_GART_IOMMU=y
```

**Kernel Modules**
```plaintext
agpgart                49152  0
radeon               1638400  0
drm_kms_helper        151552  1 radeon
drm                   360448  3 drm_kms_helper,radeon
```

**dmesg Output**
```plaintext
[    1.234567] agpgart: Detected AMD GART aperture
[    1.234567] agpgart: AGP aperture is 256M @ 0xd0000000
[    2.345678] [drm] radeon kernel modesetting enabled.
[    2.345678] [drm] GART: num cpu pages 65536, num gpu pages 65536
```

**Xorg.0.log**
```plaintext
(II) RADEON(0): AGP card detected
(II) RADEON(0): AGP 4x mode enabled
(II) RADEON(0): Direct rendering enabled
(II) RADEON(0): GART: num cpu pages 65536, num gpu pages 65536
```

By following these steps and looking for these specific entries, you can determine if GART is supported and if full memory coherence is enabled for the Radeon 9250 on Debian for Pegasos II.

Here's an overview of what the code might look like at each stage or level of the AGP architecture in a Linux system:

### 1. **User Space: Xorg Server and Direct Rendering Infrastructure**

In user space, the Xorg server and Direct Rendering Infrastructure (DRI) interact with the AGP driver via ioctl calls and user-space libraries like libdrm.

#### Xorg Server Initialization
```c
// Sample code to open the DRM device and query it via libdrm
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

int main(void) {
    // Open the DRM device
    int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
    if (fd < 0) {
        perror("open /dev/dri/card0");
        return 1;
    }

    // Query the DRM driver version
    drmVersionPtr version = drmGetVersion(fd);
    if (version) {
        printf("DRM Version: %s\n", version->name);
        drmFreeVersion(version);
    }

    // Request optional client capabilities (may fail on older drivers)
    drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);
    drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);

    // ... additional setup for Xorg and DRI
    close(fd);
    return 0;
}
```

### 2. **Kernel Space: DRM Subsystem and AGP Driver**

The kernel space involves the Direct Rendering Manager (DRM) subsystem, the AGP driver, and chipset-specific drivers.

#### DRM Subsystem (drm)
```c
// DRM driver code example
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/drm.h>

static int drm_open(struct inode *inode, struct file *filp) {
    // Open DRM device
    return 0;
}

static struct file_operations drm_fops = {
    .owner = THIS_MODULE,
    .open = drm_open,
    .release = drm_release,
    // ... other file operations
};

static struct drm_driver drm_driver = {
    .driver_features = DRIVER_GEM | DRIVER_MODESET,
    .fops = &drm_fops,
    .name = "my_drm_driver",
    .desc = "My DRM Driver",
    // ... other driver fields
};

static int __init drm_init(void) {
    // Initialize DRM subsystem
    return drm_dev_register(&drm_driver, 0);
}

static void __exit drm_exit(void) {
    drm_dev_unregister(&drm_driver);
}

module_init(drm_init);
module_exit(drm_exit);

MODULE_LICENSE("GPL");
```

#### AGP Driver (agpgart)
```c
// AGP driver code example
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/agp_backend.h>

static int agp_open(struct inode *inode, struct file *filp) {
    // Open AGP device
    return 0;
}

static struct file_operations agp_fops = {
    .owner = THIS_MODULE,
    .open = agp_open,
    .release = agp_release,
    // ... other file operations
};

static struct agp_bridge_data my_agp_bridge = {
    .dev = &my_pci_dev,
    .aperture_size = 256 * 1024 * 1024, // 256 MB
    // ... other bridge fields
};

static int __init agp_init(void) {
    // Initialize AGP subsystem
    return agp_add_bridge(&my_agp_bridge);
}

static void __exit agp_exit(void) {
    agp_remove_bridge(&my_agp_bridge);
}

module_init(agp_init);
module_exit(agp_exit);

MODULE_LICENSE("GPL");
```

### 3. **Hardware Interaction**

In the kernel space, the interaction with hardware is handled by chipset-specific drivers and the core AGP driver. The GART is set up and managed to allow the GPU to access system memory.

#### Chipset-Specific AGP Driver (intel-agp, amd-agp, etc.)
```c
// Chipset-specific AGP driver code example
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/agp_backend.h>

static int intel_agp_init(struct pci_dev *pdev, const struct pci_device_id *ent) {
    struct agp_bridge_data *bridge;

    // Allocate and initialize AGP bridge
    bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
    if (!bridge)
        return -ENOMEM;

    bridge->dev = &pdev->dev;
    bridge->aperture_size = 128 * 1024 * 1024; // 128 MB

    // Add the bridge to the AGP subsystem
    return agp_add_bridge(bridge);
}

static void intel_agp_remove(struct pci_dev *pdev) {
    struct agp_bridge_data *bridge = pci_get_drvdata(pdev);

    // Remove the AGP bridge
    agp_remove_bridge(bridge);
    kfree(bridge);
}

static struct pci_device_id intel_agp_pci_table[] = {
    { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82845G) },
    { 0 }
};

static struct pci_driver intel_agp_pci_driver = {
    .name = "intel_agp",
    .id_table = intel_agp_pci_table,
    .probe = intel_agp_init,
    .remove = intel_agp_remove,
};

module_pci_driver(intel_agp_pci_driver);

MODULE_LICENSE("GPL");
```

#### GART Management
```c
// Example GART setup in the AGP driver
static int setup_gart(struct agp_bridge_data *bridge) {
    // Allocate GART table
    bridge->gart_table = dma_alloc_coherent(bridge->dev, bridge->aperture_size,
                                            &bridge->gart_bus_addr, GFP_KERNEL);
    if (!bridge->gart_table)
        return -ENOMEM;

    // Initialize GART entries
    for (int i = 0; i < bridge->aperture_size / PAGE_SIZE; i++) {
        bridge->gart_table[i] = cpu_to_le32(0xDEADBEEF);
    }

    return 0;
}

static int my_agp_probe(struct pci_dev *pdev, const struct pci_device_id *id) {
    struct agp_bridge_data *bridge;

    // Initialize AGP bridge data
    bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
    if (!bridge)
        return -ENOMEM;

    // Set up GART
    if (setup_gart(bridge)) {
        kfree(bridge);
        return -ENOMEM;
    }

    // Add the bridge to the AGP subsystem
    return agp_add_bridge(bridge);
}

static struct pci_device_id my_agp_pci_table[] = {
    { PCI_DEVICE(PCI_VENDOR_ID_MYCHIPSET, PCI_DEVICE_ID_MYAGP) },
    { 0 }
};

static struct pci_driver my_agp_pci_driver = {
    .name = "my_agp",
    .id_table = my_agp_pci_table,
    .probe = my_agp_probe,
    .remove = my_agp_remove,
};

module_pci_driver(my_agp_pci_driver);

MODULE_LICENSE("GPL");
```

This example code provides a simplified view of the various stages and levels involved in the AGP architecture in a Linux system, from user space initialization to kernel space management and hardware interaction.

For engineers, providing additional context, clear explanations, and practical examples can help bridge the gap between high-level concepts and low-level implementation details. Let's refine the explanation to ensure it is comprehensible and actionable for engineers.

### Overview of AGP in Linux

#### 1. What is AGP?
AGP (Accelerated Graphics Port) is an interface specification designed for the fast transfer of 3D graphics data between the video card and the main memory.

#### 2. Key Components:
- **AGP Bridge**: Hardware that connects the AGP slot to the system’s memory and CPU.
- **GART (Graphics Address Remapping Table)**: Allows the GPU to access system memory.
- **AGP Driver**: Software that manages AGP bridge and GART.

### Steps to Implement AGP Support

#### 1. **User Space: Xorg Server and Direct Rendering Infrastructure**

**Example Code**:
```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

int main(void) {
    int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
    if (fd < 0) {
        perror("open /dev/dri/card0");
        return 1;
    }

    drmVersionPtr version = drmGetVersion(fd);
    if (version) {
        printf("DRM Version: %s\n", version->name);
        drmFreeVersion(version);
    }

    drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);
    drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);

    close(fd);
    return 0;
}
```

**Explanation**:
This code opens the DRM (Direct Rendering Manager) device and initializes it. DRM is responsible for managing graphics resources and enabling direct rendering.

#### 2. **Kernel Space: DRM Subsystem and AGP Driver**

**DRM Subsystem Example**:
```c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/drm.h>

static int drm_open(struct inode *inode, struct file *filp) {
    return 0;
}

static struct file_operations drm_fops = {
    .owner = THIS_MODULE,
    .open = drm_open,
    .release = drm_release,
};

static struct drm_driver drm_driver = {
    .driver_features = DRIVER_GEM | DRIVER_MODESET,
    .fops = &drm_fops,
    .name = "my_drm_driver",
    .desc = "My DRM Driver",
};

static int __init drm_init(void) {
    return drm_dev_register(&drm_driver, 0);
}

static void __exit drm_exit(void) {
    drm_dev_unregister(&drm_driver);
}

module_init(drm_init);
module_exit(drm_exit);

MODULE_LICENSE("GPL");
```

**Explanation**:
This module defines a simple DRM driver with basic open and release operations. DRM handles GPU resources and enables direct rendering.

**AGP Driver Example**:
```c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/agp_backend.h>

static int agp_open(struct inode *inode, struct file *filp) {
    return 0;
}

static struct file_operations agp_fops = {
    .owner = THIS_MODULE,
    .open = agp_open,
    .release = agp_release,
};

static struct agp_bridge_data my_agp_bridge = {
    .dev = &my_pci_dev,
    .aperture_size = 256 * 1024 * 1024,
};

static int __init agp_init(void) {
    return agp_add_bridge(&my_agp_bridge);
}

static void __exit agp_exit(void) {
    agp_remove_bridge(&my_agp_bridge);
}

module_init(agp_init);
module_exit(agp_exit);

MODULE_LICENSE("GPL");
```

**Explanation**:
This module sets up an AGP bridge, initializing it with a specified aperture size and adding it to the AGP subsystem.

#### 3. **Hardware Register-Level Interaction**

**Setting AGP Aperture**:
```c
#include <linux/pci.h>
#include <linux/io.h>

#define AGP_APERTURE_SIZE_OFFSET 0x10
#define AGP_APERTURE_BASE_OFFSET 0x14

static void configure_agp_aperture(struct pci_dev *pdev) {
    u32 aperture_size = 128 * 1024 * 1024;
    u32 aperture_base = 0xd0000000;

    pci_write_config_dword(pdev, AGP_APERTURE_SIZE_OFFSET, aperture_size);
    pci_write_config_dword(pdev, AGP_APERTURE_BASE_OFFSET, aperture_base);
}
```

**Explanation**:
This code configures the AGP aperture size and base address by writing to the appropriate PCI configuration space registers.

**Setting up GART**:
```c
#include <linux/dma-mapping.h>
#include <linux/pci.h>

#define GART_BASE_ADDRESS_REGISTER 0x58

static int setup_gart(struct pci_dev *pdev) {
    dma_addr_t gart_bus_addr;
    u32 *gart_table;
    size_t gart_size = 128 * 1024;

    gart_table = dma_alloc_coherent(&pdev->dev, gart_size, &gart_bus_addr, GFP_KERNEL);
    if (!gart_table)
        return -ENOMEM;

    for (int i = 0; i < gart_size / sizeof(u32); i++) {
        gart_table[i] = cpu_to_le32(0xDEADBEEF);
    }

    pci_write_config_dword(pdev, GART_BASE_ADDRESS_REGISTER, gart_bus_addr);

    return 0;
}
```

**Explanation**:
This function allocates memory for the GART table, initializes it, and sets the base address in the GPU's GART register.

**Enabling AGP Mode**:
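```c
#include <linux/pci.h>

static int enable_agp_mode(struct pci_dev *pdev) {
    u32 agp_status;
    u32 agp_command;
    int cap;

    /* The AGP registers sit inside the AGP capability, so the offsets
     * used below are relative to the capability pointer. */
    cap = pci_find_capability(pdev, PCI_CAP_ID_AGP);
    if (!cap)
        return -ENODEV;

    pci_read_config_dword(pdev, cap + PCI_AGP_STATUS, &agp_status);

    /* Select the 1x rate if supported and set the AGP enable bit.
     * A real driver negotiates the rate with the bridge as well. */
    agp_command = (agp_status & PCI_AGP_STATUS_RATE1) | PCI_AGP_COMMAND_AGP;
    pci_write_config_dword(pdev, cap + PCI_AGP_COMMAND, agp_command);

    return 0;
}
```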

**Explanation**:
- This function enables AGP mode by reading the status register, configuring the command register, and writing the appropriate value to enable AGP.

**Interrupt Handling**:
```c
#include <linux/interrupt.h>
#include <linux/pci.h>

#define AGP_INTERRUPT_STATUS_REGISTER 0xC0
#define AGP_INTERRUPT_ENABLE_REGISTER 0xC4

static irqreturn_t agp_interrupt_handler(int irq, void *dev_id) {
    struct pci_dev *pdev = dev_id;
    u32 status;

    pci_read_config_dword(pdev, AGP_INTERRUPT_STATUS_REGISTER, &status);

    if (status & 0x1) {
        pci_write_config_dword(pdev, AGP_INTERRUPT_STATUS_REGISTER, status);
        return IRQ_HANDLED;
    }

    return IRQ_NONE;
}

static int enable_agp_interrupts(struct pci_dev *pdev) {
    int irq = pdev->irq;
    int ret;

    ret = request_irq(irq, agp_interrupt_handler, IRQF_SHARED, "agp_irq", pdev);
    if (ret)
        return ret;

    pci_write_config_dword(pdev, AGP_INTERRUPT_ENABLE_REGISTER, 0x1);

    return 0;
}
```

**Explanation**:
- This code sets up an interrupt handler for the AGP device, enabling interrupts and handling them by clearing the interrupt flag.

### Conclusion
These detailed explanations and code examples provide a comprehensive overview of AGP architecture in Linux, from user space initialization to kernel space management and hardware register-level interactions. This should be sufficient for engineers to understand and implement AGP support in a Linux system.

To comprehensively compile and implement AGP support with GART for the Radeon 9250 on OS4, you need to consider several aspects beyond just the code itself. Here’s a detailed list of what you need:

### Development Environment Setup

1. **Toolchain**
   - Ensure you have a compatible C/C++ compiler (e.g., GCC or Clang) installed for OS4.
   - Set up necessary build tools (e.g., make, cmake).

2. **Development Libraries**
   - Install or have access to the required development libraries, such as those for handling PCI configuration space and memory allocation (similar to `libpci` and `libdrm` in Linux).

3. **Kernel Source and Headers**
   - Obtain the kernel source code for OS4 to ensure you can compile kernel modules.
   - Ensure you have the necessary headers for kernel development.

### Detailed Implementation Steps

#### 1. **Configure AGP Bridge and GART**

**AGP Bridge Configuration:**
- Implement functions to configure the AGP bridge’s aperture size and base address.

```c
#include <os4_pci.h> // Hypothetical header for PCI functions in OS4

#define AGP_APERTURE_SIZE_OFFSET 0x10
#define AGP_APERTURE_BASE_OFFSET 0x14

void os4_configure_agp_aperture(struct pci_dev *pdev) {
    uint32_t aperture_size = 128 * 1024 * 1024; // 128 MB
    uint32_t aperture_base = 0xd0000000;        // Base address
    
    os4_pci_write_config_dword(pdev, AGP_APERTURE_SIZE_OFFSET, aperture_size);
    os4_pci_write_config_dword(pdev, AGP_APERTURE_BASE_OFFSET, aperture_base);
}
```

**GART Setup:**
- Implement GART setup and memory allocation functions.

```c
#include <os4_dma.h> // Hypothetical header for DMA functions in OS4

#define GART_BASE_ADDRESS_REGISTER 0x58

int os4_setup_gart(struct pci_dev *pdev) {
    dma_addr_t gart_bus_addr;
    uint32_t *gart_table;
    size_t gart_size = 128 * 1024; // Example size for the GART table

    // Allocate memory for the GART table
    gart_table = os4_dma_alloc_coherent(gart_size, &gart_bus_addr);
    if (!gart_table)
        return -ENOMEM;

    // Initialize the GART table entries
    for (size_t i = 0; i < gart_size / sizeof(uint32_t); i++) {
        gart_table[i] = cpu_to_le32(0xDEADBEEF); // Placeholder address
    }

    // Write the GART base address to the GPU register
    os4_pci_write_config_dword(pdev, GART_BASE_ADDRESS_REGISTER, gart_bus_addr);

    return 0;
}
```
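
The 128 KB table size used above is consistent with the 128 MB aperture configured earlier, if one assumes 4 KiB pages and one 32-bit entry per page. A minimal sketch of that arithmetic follows. The helper name `gart_table_bytes` and the 4 KiB page size are assumptions made for illustration; they are not taken from Radeon or OS4 documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for a flat GART table that maps
 * every page of the aperture with one fixed-size entry. */
size_t gart_table_bytes(uint64_t aperture_bytes, uint32_t page_bytes, uint32_t entry_bytes)
{
    if (page_bytes == 0)
        return 0;                                   /* avoid division by zero */

    uint64_t pages = aperture_bytes / page_bytes;   /* one entry per page */
    return (size_t)(pages * entry_bytes);
}
```

With a 128 MiB aperture, 4096-byte pages and 4-byte entries this gives 32768 entries, i.e. 131072 bytes (128 KiB). A different page or entry size on the real hardware would change the result accordingly.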

#### 2. **Radeon Driver Implementation**

**Radeon Driver Initialization:**
- Implement functions for initializing the Radeon driver and setting up GART.

```c
#include <os4_drm.h> // Hypothetical header for DRM functions in OS4

int os4_radeon_init(struct pci_dev *pdev) {
    // Initialize AGP aperture and GART
    os4_configure_agp_aperture(pdev);
    if (os4_setup_gart(pdev))
        return -ENOMEM;

    // Additional initialization for Radeon driver
    // ...

    return 0;
}
```

#### 3. **Enable Full Memory Coherence**

- Ensure memory coherence by implementing cache management and memory barrier functions as required by the hardware and OS4.

```c
#include <os4_memory.h> // Hypothetical header for memory management in OS4

void os4_memory_coherence_setup() {
    // Implement memory barriers and cache coherence mechanisms
    // For example:
    os4_memory_barrier();
    os4_cache_flush();
}
```

#### 4. **Interrupt Handling**

**Setup Interrupt Handler:**
- Implement an interrupt handler for AGP and configure the interrupt registers.

```c
#include <os4_interrupt.h> // Hypothetical header for interrupt handling in OS4

#define AGP_INTERRUPT_STATUS_REGISTER 0xC0
#define AGP_INTERRUPT_ENABLE_REGISTER 0xC4

irqreturn_t os4_agp_interrupt_handler(int irq, void *dev_id) {
    struct pci_dev *pdev = dev_id;
    uint32_t status;

    // Read the interrupt status register
    os4_pci_read_config_dword(pdev, AGP_INTERRUPT_STATUS_REGISTER, &status);

    if (status & 0x1) {
        // Handle the interrupt (e.g., clear the interrupt flag)
        os4_pci_write_config_dword(pdev, AGP_INTERRUPT_STATUS_REGISTER, status);
        return IRQ_HANDLED;
    }

    return IRQ_NONE;
}

int os4_enable_agp_interrupts(struct pci_dev *pdev) {
    int irq = pdev->irq;
    int ret;

    // Request an IRQ and register the interrupt handler
    ret = os4_request_irq(irq, os4_agp_interrupt_handler, IRQF_SHARED, "agp_irq", pdev);
    if (ret)
        return ret;

    // Enable interrupts in the AGP device
    os4_pci_write_config_dword(pdev, AGP_INTERRUPT_ENABLE_REGISTER, 0x1);

    return 0;
}
```

### Testing and Validation

1. **Compile and Load Modules:**
   - Compile the AGP and Radeon drivers as kernel modules for OS4.
   - Load the modules and check for successful initialization.

2. **Check Logs:**
   - Use `dmesg` or OS4-specific logging tools to verify the AGP and Radeon initialization logs.

3. **Run Diagnostic Tools:**
   - Create or use existing diagnostic tools to verify GART setup, memory coherence, and overall AGP functionality.

4. **Performance Testing:**
   - Test with graphics-intensive applications to ensure proper functioning and performance gains.

5. **QEMU Testing:**
   - If applicable, test the implementation in QEMU to ensure compatibility and broader usage.

### Documentation and Support

- **Document the Implementation:**
  - Ensure all steps, functions, and configurations are well-documented to help other developers understand and maintain the code.

- **Community Support:**
  - Engage with the OS4 and broader developer communities to get feedback, report bugs, and receive support for the implementation.

By following these detailed steps, you can implement and compile comprehensive AGP support with GART for the Radeon 9250 on OS4, ensuring compatibility and performance improvements similar to those on Debian for Pegasos II.

Only for the brave 😁

It's Markdown, .md (Typora on Windows)


Edited by walkero on 2024/8/6 14:53:09

Re: Pegasos2 with RadeonHD/RX via bridge
@kas1e
Hans' GfxBench2D tool isn't only available for AmigaOS but for Windows as well: https://lists.hdrlab.org.nz/benchmark/gfxbench2d/OS/Windows

The Windows version has no copyTo/FromVRAM tests, as that's not possible on this OS. However, Hans may still have some stats on how the same Radeon HD/RX cards on Windows compare to AmigaOS.

Re: Pegasos2 with RadeonHD/RX via bridge
@joerg

I ran the Windows version of GfxBench2D on my Windows notebook, which has Intel(R) HD Graphics 620 and a discrete Radeon R5 M330. Results:

Intel(R) HD Graphics 620:

WritePixelArray: 3327.288 MiB/s (took 0.028000 seconds).
ReadPixelArray: 3023.148 MiB/s (took 1.094000 seconds).

FillRect:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.103     1071065.277         261.491
(32x32)                1.137      772589.270         754.482
(64x64)                1.110      377338.739        1473.979
(128x128)              1.110      105314.414        1645.538
(256x256)              1.027       97544.304        6096.519
(512x512)              1.054       56323.529       14080.882
(1024x1024)            1.038       68847.784       68847.784

BltBitMap:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.044      896791.188         218.943
(32x32)                1.113      752542.677         734.905
(64x64)                1.134      441296.296        1723.814
(128x128)              1.121      136485.281        2132.583
(256x256)              1.046       38172.084        2385.755
(512x512)              1.151       10196.351        2549.088
(1024x1024)            4.505        2219.756        2219.756

OverlappedBltBitMap:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.118       36493.739           8.910
(32x32)                1.114       35526.032          34.693
(64x64)                1.173       32869.565         128.397
(128x128)              1.129       28394.154         443.659
(256x256)              1.400       14285.714         892.857
(512x512)              4.767        4195.511        1048.878
(1024x1024)           18.603        1075.095        1075.095

Composite:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.135     1004385.022         245.211
(32x32)                1.099      822902.639         803.616
(64x64)                1.120      380141.071        1484.926
(128x128)              1.053      111667.616        1744.807
(256x256)              1.110       30354.054        1897.128
(512x512)              1.287        7770.008        1942.502
(1024x1024)            2.240        4464.286        4464.286


I can't run it on the Radeon R5 M330, as GfxBench2D seems to pick the first GPU to test with.

As can be seen, the 16x16, 32x32 and 64x64 results are not that bad: they're not ultra fast, but they are 2-3 times faster than the old Radeon 9250 ones. With our drivers we get pretty bad results for those sizes :( And I don't mean just Pegasos2, but in general, on any platform (especially on the X5000).
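
For anyone comparing these tables across machines: the MPixel/s column appears to be just Ops/s multiplied by the rectangle area, with one MPixel counted as 1024*1024 pixels. That is inferred from the numbers above (it reproduces the 16x16 FillRect row, and at 1024x1024 both columns are identical), not from the GfxBench2D source. A small sketch, with a made-up helper name:

```c
#include <stdint.h>

/* Assumption inferred from the tables: one "MPixel" is 2^20 pixels. */
#define PIXELS_PER_MPIXEL (1024.0 * 1024.0)

/* Hypothetical helper: convert operations per second on a width x height
 * rectangle into the MPixel/s figure shown in the result tables. */
double mpixels_per_second(double ops_per_second, uint32_t width, uint32_t height)
{
    double pixels_per_op = (double)width * (double)height;
    return ops_per_second * pixels_per_op / PIXELS_PER_MPIXEL;
}
```

So for small rectangles the MPixel/s figure is dominated by per-operation overhead (Ops/s), which is why the 16x16 to 64x64 rows are the interesting ones when judging driver overhead.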

Also, WritePixelArray() and ReadPixelArray() (no idea which functions it uses on Windows; does Windows have those?) give the same results, while ReadPixelArray() should mostly mean reading from VRAM to RAM, so shouldn't it be slower too?


Edited by kas1e on 2024/8/4 18:54:55

Re: Pegasos2 with RadeonHD/RX via bridge
@kas1e
Quote:
Also, WritePixelArray() and ReadPixelArray() (no idea which functions it uses on Windows; does Windows have those?) give the same results, while ReadPixelArray() should mostly mean reading from VRAM to RAM, so shouldn't it be slower too?
I don't know how those functions work in the Windows version of GfxBench2D, but on AmigaOS:
- On supported platforms, like Sam4x0, X1000, X5000 and A1222, Read/WritePixelArray() use larger (= faster) DMA transfers.
- On platforms without any DMA support (A1 SE/XE/µA1 and Peg2) those functions are basically the same as copyTo/FromVRAM, and the CPU does the copies. On CPUs with AltiVec support (A1/Peg2 with G4 CPU, X1000) it should be at least twice as fast as on CPUs without AltiVec (classic Amigas, A1/Peg2 with a G3 CPU, Sam4x0, X5000).

Re: Pegasos2 with RadeonHD/RX via bridge
Radeon RX580 on Win11.

Test failed. Your graphics card drivers may be faulty.
WritePixelArray: 9499.648 MiB/s (took 0.322000 seconds).
Test failed. Your graphics card drivers may be faulty.
ReadPixelArray: 9500.283 MiB/s (took 0.228000 seconds).

FillRect:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.124     2130674.377         520.184
(32x32)                1.041     2134486.071        2084.459
(64x64)                1.085     2047926.267        7999.712
(128x128)              1.110     1811665.766       28307.278
(256x256)              1.084      379595.018       23724.689
(512x512)              1.109      308310.189       77077.547
(1024x1024)            1.096      184306.569      184306.569

BltBitMap:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.072     2081001.866         508.057
(32x32)                1.110     2061550.450        2013.233
(64x64)                1.128     1969858.156        7694.758
(128x128)              1.062     1743564.972       27243.203
(256x256)              1.105      628393.665       39274.604
(512x512)              1.114      158302.513       39575.628
(1024x1024)            1.109       39753.832       39753.832

OverlappedBltBitMap:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.140      104230.702          25.447
(32x32)                1.088      109799.632         107.226
(64x64)                1.091      108912.007         425.438
(128x128)              1.104       99146.739        1549.168
(256x256)              1.106       79723.327        4982.708
(512x512)              1.110       39639.640        9909.910
(1024x1024)            1.721       11621.150       11621.150

Composite:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.109     2311533.814         564.339
(32x32)                1.109     2290130.748        2236.456
(64x64)                1.025     2167804.878        8467.988
(128x128)              1.117     1915464.637       29929.135
(256x256)              1.180      553838.136       34614.883
(512x512)              1.040      148370.192       37092.548
(1024x1024)            1.124       34803.381       34803.381
NOTE: Compositing (or alpha blending) used premultiplied alpha mode.

CompositeSrcMask:
Size                Time (s)           Ops/s        MPixel/s
(16x16)                1.024     1356201.172         331.104
(32x32)                1.022     1358855.186        1327.007
(64x64)                1.204     1318224.252        5149.313
(128x128)              1.035     1192699.517       18635.930
(256x256)              1.142      542591.068       33911.942
(512x512)              1.063      141237.065       35309.266
(1024x1024)            1.119       33205.541       33205.541
NOTE: The source mask's alpha channel was multiplied by the source bitmap's alpha channel.
NOTE: The source bitmap's alpha channel was premultiplied.

Random:
Time (s)               Ops/s        MPixel/s
       1.642       48721.072       16296.230

------------------------------------------------------------
Some tests failed, so the result cannot be submitted to the results server

Re: Pegasos2 with RadeonHD/RX via bridge
@kas1e
Quote:
It would also be interesting to know: on other systems (Windows, Linux), is copying from VRAM as awfully slow as on our systems?
Yes, check for example Georg's i5-4590, GeForce RTX 2060, Linux X11 results in
https://www.amigans.net/modules/newbb/ ... id=149012#forumpost149012
and (converted from MB to MiB by Hans)
https://www.amigans.net/modules/newbb/ ... id=149096#forumpost149096
Vesa driver with shadowfb disabled:
     read  (    20.4 MiB/sec): ShmGetImage 500x500 square
     write (   306.1 MiB/sec): ShmPutImage 500x500 square

Of course that's not as slow as your Radeon RX with bridge results on Pegasos2, but it's slower than some X1000 and X5000 systems with a Radeon HD or RX gfx card, and the read (copyFromVRAM) is even slower than on your AGP Radeon 9250.

Re: Pegasos2 with RadeonHD/RX via bridge
@all
As interesting as the GfxBench2D results are, does anyone know if GART is enabled in the Pegasos 2 Linux kernel? The Marvell chipset documentation does mention memory coherency, but it would be great to know if it actually works.

If it works, then the next question would be how to set it up properly...

Hans

Re: Pegasos2 with RadeonHD/RX via bridge
@all
Small steps to improve things are under way:

After some tests, we found that on Pegasos 2 the bus number of the slot the RadeonHD/RX card is plugged into doesn't match the bus number that's programmed into the bridge. That is fixed now in the Peg2 kernel, so the prefetch-enabling code in the RadeonHD/RX drivers (they have some for the PLX 8111 and 8112 chips) can detect the chips on Pegasos2 too.

Now we've found that the code for enabling prefetching apparently never worked properly: the memory regions needed for prefetching are wrongly programmed. I tested the PLX bridge on the X5000 as well, and this is what the dumpbridge tool says:

Quote:

PCI_IO_BASE (full): 0x1000
PCI_IO_LIMIT (full): 0x1fff
PCI_MEMORY_BASE (full): 0xe0100000
PCI_MEMORY_LIMIT (full): 0xffffffff
PCI_PREF(ETCH)_MEMORY_BASE (full): 0x10000000
PCI_PREF(ETCH)_MEMORY_LIMIT (full): 0xfffff


It's the same on Pegasos2 now. I can't test on the X1000, as it didn't boot when I tried to use the PLX-based bridge in it, and on the Sam460 I have a single PCI slot that is occupied by a SATA card, so I can't check whether the prefetch memory regions are wrongly programmed on all platforms. But on the X5000 and Peg2 they are for sure.
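
To show why that dump looks wrong, here is a minimal sketch of how a standard PCI-to-PCI bridge header encodes the prefetchable window. The register offsets (0x24 and 0x26) and the bit layout come from the generic type 1 header; the function names are made up, and I'm assuming that the "(full)" values printed by dumpbridge are the expanded forms of these 16-bit registers (so raw values 0x1000 and 0x0000). The upper 32-bit registers at 0x28/0x2C are ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard offsets in a PCI-to-PCI bridge (type 1) configuration header. */
#define PCI_PREF_MEMORY_BASE  0x24  /* 16-bit register */
#define PCI_PREF_MEMORY_LIMIT 0x26  /* 16-bit register */

/* Bits 15:4 hold address bits 31:20; bits 3:0 are type flags. */
#define PREF_RANGE_MASK 0xfff0u

/* Expand the raw base register to the first address of the window. */
uint32_t pref_base_full(uint16_t reg)
{
    return (uint32_t)(reg & PREF_RANGE_MASK) << 16;
}

/* Expand the raw limit register to the last address of the window.
 * The low 20 bits are implied to be all ones. */
uint32_t pref_limit_full(uint16_t reg)
{
    return ((uint32_t)(reg & PREF_RANGE_MASK) << 16) | 0x000fffffu;
}

/* A window whose base lies above its limit forwards nothing. */
bool pref_window_open(uint16_t base_reg, uint16_t limit_reg)
{
    return pref_base_full(base_reg) <= pref_limit_full(limit_reg);
}
```

With the values from the dump, the base expands to 0x10000000 and the limit to 0x000fffff, so the base is above the limit and `pref_window_open(0x1000, 0x0000)` is false. If that reading is right, the prefetchable window is effectively closed, which would match prefetching never having worked.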

So the next step is to fix the memory ranges and see whether correct prefetching improves the situation on the PLX-based bridge, and then to try to improve things for Pericom's bridge as well, as that one looks much better in its default state compared with the PLX one.

And the question still remains: is there any code proven to work on Peg2 hardware, no matter on which OS, that does full memory coherence, so we can see how it's done? Any help appreciated!
