Re: x1000 onboard network opensource driver in progress
What if you change your interrupt code to always return 0?

And/or change interrupt priority (either to very high or very low)?

Edit: I would also add a static counter variable to the interrupt handler and debug-output it to serial every 1000 counts or so (unconditionally, i.e. even if the "interrupt is not for me"), to see what happens at lockup: maybe there's an interrupt flood.

I would generally also look more at what happens during lockup (the OS is still likely doing/executing something) not just how to reproduce it.

Edit2: One idea behind the tests mentioned is to check whether some other driver has installed an interrupt handler for the same irq that (sometimes) erroneously thinks and says "yes, this irq is for me" when it isn't, returns != 0, and thereby causes the other interrupt handlers for the same irq not to be called anymore. Those other interrupt handlers then never get a chance to clear the irq-pending state, and from then on the interrupt keeps happening endlessly -> lockup.
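
To make the theory above concrete, here is a minimal host-side sketch in plain C (not AmigaOS code). It assumes a dispatcher that stops walking the handler chain at the first handler returning nonzero; whether Exec really behaves that way for a given interrupt is exactly what is in question here. The names Device, handle, dispatch_once and line_quiets are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one device sitting on a shared IRQ line. */
struct Device {
    bool pending;        /* device is currently asserting the IRQ line */
    bool claims_always;  /* buggy handler: says "mine" even when idle  */
};

/* Handler convention from the post: nonzero means "this IRQ was for me". */
static int handle(struct Device *d)
{
    if (d->pending) {
        d->pending = false;          /* acknowledge our own device */
        return 1;
    }
    return d->claims_always ? 1 : 0; /* idle: only a buggy handler claims it */
}

/* Walk the chain once, stopping at the first handler that claims the IRQ
 * (assumed dispatcher behaviour). Returns true if the line is still
 * asserted afterwards, i.e. the interrupt would fire again immediately. */
static bool dispatch_once(struct Device *chain, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (handle(&chain[i]))
            break;
    for (size_t i = 0; i < n; i++)
        if (chain[i].pending)
            return true;
    return false;
}

/* True if the line goes quiet within `limit` dispatches. */
static bool line_quiets(struct Device *chain, size_t n, unsigned limit)
{
    for (unsigned i = 0; i < limit; i++)
        if (!dispatch_once(chain, n))
            return true;
    return false;
}
```

With a well-behaved first handler, the pending device further down the chain is serviced on the first pass. With claims_always set on the first handler, the pending device is never reached and the line never goes quiet, which is the endless-interrupt scenario described above.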


Edited by Georg on 2026/3/13 11:57:20
Re: x1000 onboard network opensource driver in progress
@TSK

Boot from CD-ROM, and check that you don't have outdated tools in s:startup-sequence or user-startup.
Often when the OS is frozen it's possible to get into the Amiga over a serial cable, but you need to set up a terminal on AUX: using NewShell.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.
Re: x1000 onboard network opensource driver in progress
@Georg
Quote:

What if you change your interrupt code to always return 0?

You probably don't mean just return 0, since we still need to write to RXCH_RESET (if we don't, it will surely lock up). You mean stripping the interrupt handler down to the bare minimum, write + return 0, and seeing if that changes anything; is that what you mean?

Quote:

And/or change interrupt priority (either to very high or very low)?


Yep, will check 128 / -128

Quote:

I would generally also look more at what happens during lockup (the OS is still likely doing/executing something) not just how to reproduce it.


You think that CPU still alive even if video dead, mouse/keyboard dead, and serial dead ?

Quote:

One idea behind the tests mentioned is to check whether some other driver has installed an interrupt handler for the same irq that (sometimes) erroneously thinks and says "yes, this irq is for me" when it isn't, returns != 0, and thereby causes the other interrupt handlers for the same irq not to be called anymore. Those other interrupt handlers then never get a chance to clear the irq-pending state, and from then on the interrupt keeps happening endlessly -> lockup.


That's worth trying too, of course. Thanks, I will check all of this out.

Another idea came to me while reading yours: memory can survive the "reset" button after a lockup, so from the IRQ handler we could just log a counter, which line was reached, maybe a timestamp. When the lockup happens: reset, boot with no s-s, read that memory back and see the last N interrupt events before the lockup. Could that work? I can do a simple test: write some junk to a fixed address, press reset, and immediately after boot check whether the value is still there. If yes, the idea can work. What do you think?
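
A minimal sketch of such a reset-surviving event log, written as portable C so it can be tried on any host first. Everything in it (Plog, PlogEntry, plog_init, plog_put, plog_read, the magic value and the entry count) is hypothetical. On the real machine the struct would have to live at a fixed address that is known to survive reset, and the writes would need to be pushed out of the CPU cache, neither of which is shown here.

```c
#include <stdint.h>
#include <string.h>

#define PLOG_MAGIC   0x504C4F47u  /* arbitrary marker ("PLOG") */
#define PLOG_ENTRIES 64u          /* power of two so the index can be masked */

struct PlogEntry {
    uint32_t seq;    /* running event counter                 */
    uint32_t line;   /* source line reached, e.g. __LINE__    */
    uint32_t stamp;  /* caller-supplied timestamp             */
};

struct Plog {
    uint32_t magic;                    /* tells a valid log from random RAM */
    uint32_t head;                     /* total entries ever written        */
    struct PlogEntry e[PLOG_ENTRIES];  /* ring of the most recent events    */
};

/* Called once at driver start: wipe the log and mark it valid. */
static void plog_init(struct Plog *p)
{
    memset(p, 0, sizeof *p);
    p->magic = PLOG_MAGIC;
}

/* Called from the interrupt handler: no allocation, no OS calls. */
static void plog_put(struct Plog *p, uint32_t line, uint32_t stamp)
{
    struct PlogEntry *e = &p->e[p->head & (PLOG_ENTRIES - 1u)];
    e->seq   = p->head;
    e->line  = line;
    e->stamp = stamp;
    p->head++;
}

/* Called after the reboot: copies up to `max` of the newest entries,
 * oldest first. Returns 0 if the magic marker did not survive. */
static uint32_t plog_read(const struct Plog *p, struct PlogEntry *out, uint32_t max)
{
    if (p->magic != PLOG_MAGIC)
        return 0;
    uint32_t n = p->head < PLOG_ENTRIES ? p->head : PLOG_ENTRIES;
    if (n > max)
        n = max;
    uint32_t first = p->head - n;
    for (uint32_t i = 0; i < n; i++)
        out[i] = p->e[(first + i) & (PLOG_ENTRIES - 1u)];
    return n;
}
```

The magic word does the same job as the "write some junk, reset, check it is still there" test: if it does not read back after the reboot, the rest of the buffer cannot be trusted.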


Also, thinking more about it: if nothing else helps, the X1000 does have a COP (debug) header. The TRM says "The COP (debugger) header is provided for factory test purposes and its use is not recommended.", but we know what those "factory test purposes" mean: CPU tests :) I currently don't know which PPC JTAG debuggers support the PA6T at all; Varisys of course knows, but I don't :) I could also go with OpenOCD + an FTDI-based JTAG, but I don't know if OpenOCD supports the PA6T-1682M. That's for later, though; first I will try to do my best from the software side.

Join us to improve dopus5!
AmigaOS4 on youtube
Re: x1000 onboard network opensource driver in progress
@kas1e
Quote:
kas1e wrote:
@Georg
You probably don't mean just return 0, since we still need to write to RXCH_RESET (if we don't, it will surely lock up). You mean stripping the interrupt handler down to the bare minimum, write + return 0, and seeing if that changes anything; is that what you mean?

No, don't strip/change code in the interrupt handler. Only change the return value to always be 0. Unless things are different in AOS4 I think it used to be so that an interrupt handler returns TRUE or 1 if the interrupt handler found out that "yes, the interrupt was for me" and FALSE or 0 if the interrupt handler finds out that it "was not for me".

That return value should really just be an optimization for the OS (to call fewer interrupt handlers if it thinks, or is being told, that the interrupt was already handled by the handler it is currently calling). But I think it should be safe to always return 0 and so cause the other interrupt handlers of the IRQ to be called anyway. In theory, the interrupt handlers for an IRQ that can be shared need to handle the situation where the handler is called even though the IRQ was triggered by a device other than its own.

What does your interrupt handler return at the moment?

Quote:

Yep, will check 128 / -128


127 / -128

Quote:

You think that CPU still alive even if video dead, mouse/keyboard dead, and serial dead ?


My guess is that it likely is. Would try to run a little program in the background which installs maybe a vertical blank interrupt and periodically outputs something to serial. Don't know if vertb is best for this. It should be some interrupt that has higher (hw) priority than those external device (network/audio/...) interrupts.

Re: x1000 onboard network opensource driver in progress
@Georg
Quote:

What does your interrupt handler return at the moment?

Was 1, tried 0 : still lockup , so reverted to 1 again.

Quote:

127 / -128

Originally i set 10, now tried with both 127 and -128 : lockups too on both values :(

Quote:

Would try to run a little program in the background which installs maybe a vertical blank interrupt and periodically outputs something to serial.


So doing simple:

static ULONG WatchdogHandler(struct ExceptionContext *ctx, struct ExecBase *SysBase, APTR data)
{
    (void)ctx;
    (void)SysBase;

    struct WatchdogData *wd = (struct WatchdogData *)data;

    wd->tick++;

    /* Print approximately once per second.
     * VERTB fires at ~50 Hz on most display modes. */
    if (wd->tick - wd->last_tick >= 50)
    {
        wd->last_tick = wd->tick;
        wd->seconds++;

        wd->IExec->DebugPrintF("[WD] alive: %lu s (tick=%lu)\n",
                               wd->seconds, wd->tick);
    }

    return 0;   /* not ours */
}

and

    
struct Interrupt handler;
    
handler.is_Node.ln_Type NT_INTERRUPT;
    
handler.is_Node.ln_Pri  127;   
    
handler.is_Node.ln_Name = (STRPTR)"pa6t_eth watchdog";
    
handler.is_Data         = (APTR)&wd;
    
handler.is_Code         = (VOID (*)())WatchdogHandler;

    
IExec->AddIntServer(INTB_VERTB, &handler);




Then I started it with "watchdog &" and ran a stress test over the network => lockup. All I got was just 2 lines before everything died:

[stress] 8/8 connections open.  Starting receive...
[stress]  Time | Received  | Est.pkts  | Est.wraps | Active
[stress] ------|-----------|-----------|-----------|-------
[WD] alive: 39 s (tick=1950)
[WD] alive: 40 s (tick=2000)


Damn ! :)

Maybe I need to make it print, say, once every 1/4 second, so that it has a chance to print something after the cause of the lockup happens but before everything dies?

Join us to improve dopus5!
AmigaOS4 on youtube
Re: x1000 onboard network opensource driver in progress
@all
I just added a watchdog monitor (also on VERTB) to the driver itself, to print _ALL_ status registers of everything every 1/4 second. By all I mean all the IOB ones, all the MAC ones, all the DMA engine ones, all the DMA-interface-RX ones and all the RX/TX channel ones; on top of that I also dump the state of all DMA channels while running, just in case. Result: all status registers look correct, all of them.

Join us to improve dopus5!
AmigaOS4 on youtube
Re: x1000 onboard network opensource driver in progress
Quote:

still lockup


Those changes were mostly a theory in case the irq is shared with other device drivers. Is there a tool (scout? sysmon?) where you can check if there are other installed interrupt handler for the irq your driver is using, or if your driver is the only one using the specific irq?

If it's not the only one, maybe try what happens if you disable the driver (sound? whatever) that's using same irq.

Maybe another thing you can try is to add a counter and a little serial debug output at beginning of your interrupt and at the end before the return. Don't know how many interrupts are typically happening during net transfer, so don't know if you have to do the output every 1000 ticks, every 10000 ticks or whatever.

Re: x1000 onboard network opensource driver in progress
@kas1e

Instead of VERTB watchdog, maybe try also timer.device softint loop instead, as that's maybe less relying on "external" stuff (gfx card, bus / pci / ??):

68k sample code From Amiga Dev CD 2.1:

#include <exec/memory.h>
#include <exec/interrupts.h>
#include <devices/timer.h>
#include <dos/dos.h>
#include <clib/exec_protos.h>
#include <clib/dos_protos.h>
#include <clib/alib_protos.h>
#include <stdio.h>

#define MICRO_DELAY 1000
#define OFF     0
#define ON      1
#define STOPPED 2

struct TSIData {
    ULONG tsi_Counter;
    ULONG tsi_Flag;
    struct MsgPort *tsi_Port;
};

struct TSIData *tsidata;

void tsoftcode(void);    /* Prototype for our software interrupt code */

void main(void)
{
    struct MsgPort *port;
    struct Interrupt *softint;
    struct timerequest *tr;

    ULONG endcount;

    /* Allocate message port, data & interrupt structures. Don't use CreatePort() */
    /* or CreateMsgPort() since they allocate a signal (don't need that) for a    */
    /* PA_SIGNAL type port. We need PA_SOFTINT.                                   */
    if (tsidata = AllocMem(sizeof(struct TSIData), MEMF_PUBLIC|MEMF_CLEAR))
    {
        if (port = AllocMem(sizeof(struct MsgPort), MEMF_PUBLIC|MEMF_CLEAR))
        {
            NewList(&(port->mp_MsgList));                             /* Initialize message list */
            if (softint = AllocMem(sizeof(struct Interrupt), MEMF_PUBLIC|MEMF_CLEAR))
            {
                softint->is_Code = tsoftcode;    /* The software interrupt routine */
                softint->is_Data = tsidata;
                softint->is_Node.ln_Pri = 0;

                port->mp_Node.ln_Type = NT_MSGPORT;       /* Set up the PA_SOFTINT message port  */
                port->mp_Flags = PA_SOFTINT;              /* (no need to make this port public). */
                port->mp_SigTask = (struct Task *)softint;     /* pointer to interrupt structure */

                /* Allocate timerequest */
                if (tr = (struct timerequest *) CreateExtIO(port, sizeof(struct timerequest)))
                {
                    /* Open timer.device. NULL is success. */
                    if (!(OpenDevice("timer.device", UNIT_MICROHZ, (struct IORequest *)tr, 0)))
                    {
                        tsidata->tsi_Flag = ON;        /* Init data structure to share globally. */
                        tsidata->tsi_Port = port;

                        /* Send off the first timerequest to start. IMPORTANT: Do NOT  */
                        /* BeginIO() to any device other than audio or timer from      */
                        /* within a software or hardware interrupt. The BeginIO() code */
                        /* may allocate memory, wait or perform other functions which  */
                        /* are illegal or dangerous during interrupts.                 */
                        printf("starting softint. CTRL-C to break...\n");

                        tr->tr_node.io_Command = TR_ADDREQUEST;    /* Initial iorequest to start */
                        tr->tr_time.tv_micro = MICRO_DELAY;        /* software interrupt.        */
                        BeginIO((struct IORequest *)tr);

                        Wait(SIGBREAKF_CTRL_C);
                        endcount = tsidata->tsi_Counter;
                        printf("timer softint counted %ld milliseconds.\n", endcount);

                        printf("Stopping timer...\n");
                        tsidata->tsi_Flag = OFF;

                        while (tsidata->tsi_Flag != STOPPED) Delay(10);

                        CloseDevice((struct IORequest *)tr);
                    }
                    else printf("couldn't open timer.device\n");
                    DeleteExtIO(tr);
                }
                else printf("couldn't create timerequest\n");
                FreeMem(softint, sizeof(struct Interrupt));
            }
            FreeMem(port, sizeof(struct MsgPort));
        }
        FreeMem(tsidata, sizeof(struct TSIData));
    }
}


void tsoftcode(void)
{
    struct timerequest *tr;

    /* Remove the message from the port. */
    tr = (struct timerequest *)GetMsg(tsidata->tsi_Port);

    /* Keep on going if main() hasn't set flag to OFF. */
    if ((tr) && (tsidata->tsi_Flag == ON))
    {
        /* increment counter and re-send timerequest--IMPORTANT: This         */
        /* self-perpetuating technique of calling BeginIO() during a software */
        /* interrupt may only be used with the audio and timer device.        */
        tsidata->tsi_Counter++;
        tr->tr_node.io_Command = TR_ADDREQUEST;
        tr->tr_time.tv_micro = MICRO_DELAY;
        BeginIO((struct IORequest *)tr);
    }
    /* Tell main() we're out of here. */
    else tsidata->tsi_Flag = STOPPED;
}

Re: x1000 onboard network opensource driver in progress
@Georg
Quote:
No, don't strip/change code in the interrupt handler. Only change the return value to always be 0. Unless things are different in AOS4 I think it used to be so that an interrupt handler returns TRUE or 1 if the interrupt handler found out that "yes, the interrupt was for me" and FALSE or 0 if the interrupt handler finds out that it "was not for me".
I'm not sure anymore, but I think the return value is ignored for shareable interrupts like the PCI ones and all interrupt handlers in the list are always called.
Reason: More than one PCI device using the same IRQ number may cause an IRQ at the same time.
A PCI device interrupt handler has to clear the PCI device interrupt if it was an interrupt for its own device, but must not clear the CPU exception used for it; that's done by the OS after all interrupt handlers in the list have been called.

Quote:
Is there a tool (scout? sysmon?) where you can check if there are other installed interrupt handler for the irq your driver is using, or if your driver is the only one using the specific irq?
There is a 20 years old port of Scout. AFAIK it's the only tool which can display the interrupt handler lists.
There is a comment from 2009 that it doesn't work on a Sam440ep.
I don't know if it's hardware related, or if it doesn't work at all on current AmigaOS 4.1 versions anymore and would have to be updated.

Quote:
Instead of VERTB watchdog, maybe try also timer.device softint loop instead, as that's maybe less relying on "external" stuff (gfx card, bus / pci / ??):
Unlike on AmigaOS <= 3.x (using a custom chip timer) the AmigaOS 4.x timer.device doesn't depend on anything external, it's based on the PowerPC CPU timers.


Edited by joerg on 2026/3/14 16:04:12
Re: x1000 onboard network opensource driver in progress
Ranger shows each PCI device’s interrupt number. AFAIR the devices involved in NIC usage each have unique interrupt numbers, so I don’t think it’s likely some other random device is getting in the way.

However, with the driver needing several of the PCI devices, is it possible that one of them is triggering an interrupt you’re not expecting, e.g. for a statistics counter?

Re: x1000 onboard network opensource driver in progress
@ncafferkey
Quote:
Ranger shows each PCI device’s interrupt number.
But it can't display the list of interrupt handlers, unlike Scout for example.

Quote:
AFAIR the devices involved in NIC usage each have unique interrupt numbers, so I don’t think it’s likely some other random device is getting in the way.
Even on the AmigaOne and Sam440/460 (emulation) PCI IRQs might be shared, but on the Pegasos2 (emulation) all PCI devices use the same IRQ number and everything (PATA, SATA, XHCI USB, NIC, sound, gfx, etc.) uses the same shared IRQ and interrupt handler list.

Re: x1000 onboard network opensource driver in progress
@Georg, Joerg

I went heavy and just wrote a small tool which lists all the interrupts we have in the system; by all I mean _all_.
For the first 16 (vectors 0-15) I just use ExecBase->IntVects[16], the next 4 (16-19) I skip (as far as I can see from the includes, those are PCI_INTERRUPT_LINE),
and all the others from 20 upwards I probe via AddIntServer. So in the end I just scan from 0 to 10000, and this is what we have on the X1000
(some of the first 16 I just named myself, taking the names from hardware/intbits.h in the SDK; I don't know if they really apply):

=== Interrupt Handler Listing (vectors 0-10000) ===

Extended vectors: list_a=6FF7CBE0 list_b=6FF7CBF0 stride=16 base=6FF7BF60 max=1055

Vector    0 [    TBE]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    1 [ DSKBLK]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    2 [SOFTINT]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    3 [  PORTS]: iv_Code=020186B0 iv_Data=6FF7C000 iv_Node=00000000  (empty)
Vector    4 [  COPER]: iv_Code=020186B0 iv_Data=6FF7C020 iv_Node=00000000  (empty)
Vector    5 [  VERTB]: iv_Code=020186B0 iv_Data=6FF7C010 iv_Node=00000000  2 handler(s)
  pri= 127  code=7FBA6F6C  data=6231DA58  name="pa6t_eth monitor"
  pri=  10  code=023C8748  data=6FFA3420  name="graphics.library"
Vector    6 [   BLIT]: iv_Code=00000000 iv_Data=6FFA3420 iv_Node=6FFA3496  (empty)
Vector    7 [   AUD0]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    8 [   AUD1]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    9 [   AUD2]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   10 [   AUD3]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   11 [    RBF]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   12 [ DSKSYN]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   13 [  EXTER]: iv_Code=020186B0 iv_Data=6FF7C030 iv_Node=00000000  (empty)
Vector   14 [  INTEN]: iv_Code=020186B0 iv_Data=6FF7C050 iv_Node=00000000  (empty)
Vector   15 [    NMI]: iv_Code=020186B0 iv_Data=6FF7C040 iv_Node=00000000  (empty)
Vector   16: PCI INTA (shared PCI interrupt line, not probed)
Vector   17: PCI INTB (shared PCI interrupt line, not probed)
Vector   18: PCI INTC (shared PCI interrupt line, not probed)
Vector   19: PCI INTD (shared PCI interrupt line, not probed)
Vector   64: 1 handler(s)
  pri= 100  code=023F98FC  data=6FFA4000  name="VBInt PCIGraphics.card (0)"
Vector  169: 1 handler(s)
  pri=  10  code=7FBA6EA4  data=6231DA58  name="pa6t_eth.device"
Vector 1041: 1 handler(s)
  pri=   0  code=02006E64  data=00000000  name="(null)"
Vector 1049: 3 handler(s)
  pri=  10  code=025DB130  data=6F872AC0  name="OHCI Interrupt Handler"
  pri=   5  code=0259B044  data=6FFFD040  name="sb600sata IRQ handler"
  pri=   0  code=7FBAD6FC  data=619481A0  name="(null)"
Vector 1050: 2 handler(s)
  pri=  10  code=025DB130  data=6F872C40  name="OHCI Interrupt Handler"
  pri=  10  code=025DB130  data=6F336050  name="OHCI Interrupt Handler"
Vector 1051: 2 handler(s)
  pri=  10  code=025DB130  data=6F872DC0  name="OHCI Interrupt Handler"
  pri=  10  code=025DB130  data=6F3361D0  name="OHCI Interrupt Handler"
Vector 1052: 1 handler(s)
  pri=  10  code=025E6264  data=6F336350  name="EHCI Interrupt Handler"

=== Done: 23 valid vectors, 13 handlers found ===


This also means that nothing else sits on our interrupt 169 (169 because 144 is the DMA engine, + 20 for the TX DMA channels, + 5 for the RX DMA channel we use).

Next, I tried Georg's second suggestion: adding a counter at the beginning and at the end of the handler. With VERTB first: I print 4 times per second, because the RX interrupt currently fires about 8500 times per second, so roughly 2100 interrupts per print.

At the end:

[stress]   146s | 1931561 KB |   1466657 |     22916 | 8
[IRQ] enter=1340602 exit=1340290 miss=312
[IRQ] enter=1342175 exit=1341863 miss=312
[IRQ] enter=1343793 exit=1343481 miss=312
[IRQ] enter=1345368 exit=1345055 miss=313
[IRQ] enter=1346943 exit=1346629 miss=314
[IRQ] enter=1348518 exit=1348203 miss=315
[stress]   147s | 1944510 KB |   1476521 |     23070 | 8
[IRQ] enter=1350091 exit=1349776 miss=315
[IRQ] enter=1351665 exit=1351350 miss=315
[IRQ] enter=1353238 exit=1352923 miss=315
[IRQ] enter=1354813 exit=1354498 miss=315
[IRQ] enter=1356425 exit=1356109 miss=316
[IRQ] enter=1358101 exit=1357785 miss=316
[stress]   148s | 1957912 KB |   1486742 |     23230 | 8
[IRQ] enter=1359676 exit=1359359 miss=317
[IRQ] enter=1361235 exit=1360918 miss=317

***BURN CPU DIE****


I.e. clean right up to the end. enter - exit - miss = 0 till the very last line, so handler never got stuck inside and the cpu just died. No stuck handler, no interrupt flood...
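
For readers following along, here is a minimal sketch of how counters like these can be kept. It is not the driver's actual code. In particular it assumes that "miss" counts the early "not ours" returns, which is one reading consistent with enter - exit - miss = 0 in the log above. IrqStats, irq_handler, irq_in_flight and irq_should_report are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

struct IrqStats {
    uint32_t enter;  /* incremented first thing in the handler          */
    uint32_t exit;   /* incremented just before the normal return       */
    uint32_t miss;   /* incremented on the early "not ours" return      */
};

/* Hypothetical handler skeleton: `ours` stands in for the status check. */
static int irq_handler(struct IrqStats *s, bool ours)
{
    s->enter++;
    if (!ours) {
        s->miss++;
        return 0;
    }
    /* ... the real interrupt work would go here ... */
    s->exit++;
    return 1;
}

/* Invocations that entered the handler but never left it.
 * Anything other than 0 at lockup would point to a stuck handler. */
static uint32_t irq_in_flight(const struct IrqStats *s)
{
    return s->enter - s->exit - s->miss;
}

/* True every `period` entries, to rate-limit the serial output. */
static bool irq_should_report(const struct IrqStats *s, uint32_t period)
{
    return period != 0 && (s->enter % period) == 0;
}
```

Applied to the last line of the log (enter=1361235, exit=1360918, miss=317), irq_in_flight gives 0, which is why a stuck handler can be ruled out.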


Next, the 3rd suggestion: I replaced VERTB with the timer softint. I just tried it as an external app; there is probably not much sense in putting it inside the driver as I did with VERTB?

And result, completely dead cpu:

[stress]     7s |   93358 KB |     70906 |      1107 | 8
[IRQ] enter=4450762 exit=4448752 miss=2010
[SWD] alive 2573
[IRQ] enter=4452409 exit=4450398 miss=2011
[SWD] alive 2574
[IRQ] enter=4453961 exit=4451950 miss=2011


That means both VERTB and the CPU-timer softint died together. So the CPU is really dead, which means it's something very, very heavy...


Next things about which i may think for now:

1). As I understand it, when the CPU dies like that, what happens is a machine check (vector 0x200 on all PPC, including our PA6T). If something causes a machine check, the register state is saved and the CPU jumps to 0x200; if the vector is corrupted -> boom. The idea: we add a little stub right at the beginning of the machine check vector which dumps the registers to some memory area that survives the 'reset' button. Then after the reset we boot with "no s-s" and just read the data from that address. Or we can probably even read it from CFE itself. What do you think?

I tried simply these:

    BOOL ok_bus = IExec->AddIntServer(TRAPNUM_BUS_ERROR, &trapBus);
    BOOL ok_dsi = IExec->AddIntServer(TRAPNUM_DATA_SEGMENT_VIOLATION, &trapDSI);
    BOOL ok_isi = IExec->AddIntServer(TRAPNUM_INST_SEGMENT_VIOLATION, &trapISI);


But no luck of course, as there is more code involved in calling them. So, is it possible to install a stub right at 0x200, bypassing everything? Maybe by doing so from CFE before booting OS4?


2). Add a "polling" variant to the driver: no interrupts involved, no IOB, just minimal DMA (as the MAC will not work without it). It sucks of course, but first, it will work and may be usable somehow (or maybe not), and second, we will know for sure whether the issue is the combined use of IOB/etc. or the simple transferring of data; and a lighter version will be easier to debug.

3). Buy JTAG, and check the state when CPU died..


I also need to add, that in Linux driver there is known "Errata 5971" (see in pasemi_mac.c) :

    if (n > RX_RING_SIZE) {
        /* Errata 5971 workaround: L2 target of headers */
        write_iob_reg(PAS_IOB_COM_PKTHDRCNT, 0);
        n &= (RX_RING_SIZE-1);
    }
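
A side note on the masking in that snippet: `n &= (RX_RING_SIZE-1)` only behaves like `n % RX_RING_SIZE` when the ring size is a power of two. A small host-side sketch of that index arithmetic follows; RING_SIZE, ring_wrap and ring_next are hypothetical names, and the value 64 is chosen only for the example.

```c
#include <stdint.h>

#define RING_SIZE 64u  /* example value only; must be a power of two */

/* Refuse to compile if the mask trick would not be a valid modulo. */
_Static_assert((RING_SIZE & (RING_SIZE - 1u)) == 0,
               "RING_SIZE must be a power of two for index masking");

/* Fold a running counter into a ring slot, same as n % RING_SIZE. */
static uint32_t ring_wrap(uint32_t n)
{
    return n & (RING_SIZE - 1u);
}

/* Slot that follows idx, wrapping from the last slot back to 0. */
static uint32_t ring_next(uint32_t idx)
{
    return ring_wrap(idx + 1u);
}
```

The static assert is the design point here: if someone later changes the ring size to a non-power-of-two, the build fails instead of the index silently skipping slots.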


I of course added this one too, and confirmed by tests that everything is OK afterwards. So while this erratum looks like the kind of bug that could cause such issues, I have definitely fixed it and verified that via all the register states, etc.

There is more info about it, just in case: https://lists.ozlabs.org/pipermail/lin ... /2007-October/043601.html
(and there you can go through prev/next messages, all about linux version of network driver)


Edited by kas1e on 2026/3/14 19:34:16
Edited by kas1e on 2026/3/14 19:41:40
Edited by kas1e on 2026/3/14 19:48:59
Join us to improve dopus5!
AmigaOS4 on youtube
Re: x1000 onboard network opensource driver in progress
@joerg
Quote:
joerg wrote:
@ncafferkey
Quote:
Ranger shows each PCI device’s interrupt number.
But it can't display the list of interrupt handlers, unlike Scout for example.


Yes, a live list is better, but there was a suggestion in a previous post that Scout might not work on current OS4, so Ranger's PCI list may be better than nothing. But in fact, I just remembered that Ranger has a list of interrupt handlers too (Exec->IntHandlers).

Quote:

Quote:
AFAIR the devices involved in NIC usage each have unique interrupt numbers, so I don’t think it’s likely some other random device is getting in the way.
Even on the AmigaOne and Sam440/460 (emulation) PCI IRQs might be shared, but on the Pegasos2 (emulation) all PCI devices use the same IRQ number and everything (PATA, SATA, XHCI USB, NIC, sound, gfx, etc.) uses the same shared IRQ and interrupt handler list.


I was referring specifically to the X1000's built-in NIC. I know NICs in general often share IRQ numbers.

Re: x1000 onboard network opensource driver in progress
@kas1e
Quote:
1). As i understand , when CPU died like that, the things happens is : machine check (vector 0x200 for all the PPC include our PA6T). If something cause this machine check, the registers state saved,
and then cpu jump to that 0x200, and if vector corrupted -> boom. Idea is : we add little stub right at beginig of machine check vector, which dump the regs to some memory area which can survive the 'reset'
button. Then once we boot after reset, with boot with "no s-s" , and just read the data from this address. Or, we even can read it by CFE itself probably. What you think ?

If it's dead without interrupts still going (not even timer interrupts work) then this looks like the way to go. Doesn't AmigaOS install some handler for MCE that dumps state? You can look at 0x200 from a debugger to see if there's any code there. I think you can't set it before AmigaOS boots as it might replace it but you should be able to install your own handler overwriting 0x200 to jump to your handler. When this is called better not rely on that any of the OS still works so your handler should be some simple assembly to dump state to serial or somewhere only using CPU without calling any routines, just writing serial registers.
I don't know if this applies to PA6T but a PowerPC manual says this about MCE:
Quote:
The causes for machine check exceptions are implementation-dependent, but typically these causes are related to conditions such as bus parity errors or attempting to access an invalid physical address. The machine check exception is disabled when MSR[ME] = 0. If a machine check exception condition exists and the ME bit is cleared, the processor goes into the checkstop state.

So it can be disabled so check the ME bit is enabled to get the handler called and see if AmigaOS has a handler already. Also it says cause can be accessing invalid physical address so check all things in driver that involves physical addresses. Maybe you're missing some virtual to physical translation somewhere as DMA uses physical addresses but CPU uses virtual but don't know if AmigaOS uses one to one mapping or needs some address translation for DMA addresses.

Re: x1000 onboard network opensource driver in progress
@balaton

OS4 has different virtual and physical addresses. GetDMAList() can provide the translation.

Re: x1000 onboard network opensource driver in progress
@ncafferkey
Quote:

Yes, a live list is better, but there was a suggestion in a previous post that Scout might not work on current OS4, so Ranger's PCI list may be better than nothing. But in fact, I just remembered that Ranger has a list of interrupt handlers too (Exec->IntHandlers).

Ranger didn't show much in this regard (at least not the first 16 exec vectors, for sure) and it also shows nothing past 100: check it, it really only shows the first 100, excluding those first 16.

My approach, as shown, queries everything, any number of vectors: the first 16, then 4 skipped, and all the others up to any limit, so past 100 too. That's where we find my own 169 as well as the USB and sb600sata interrupts, etc. If anyone needs it, I can share the binary/source.

Quote:

I was referring specifically to the X1000's built-in NIC. I know NICs in general often share IRQ numbers.


As far as I can see, not in the X1000's case.

Quote:

OS4 has different virtual and physical addresses. GetDMAList() can provide the translation.


Yes, but not for the EMAC ones at 0xe0000000: those always stay the same (they are of course mapped by CFE at start, but after that they are always the same for us, look like physical addresses, and we can write to them directly).

@balaton
Thanks, will try this all out.

@All
Now I have removed ALL interrupt-based code, absolutely all of it, keeping only the polling/timer.device-based code (which still has to use DMA, of course), and I got the same lockup!! So it's the DMA data transfer! The only thing still active in the polling version is the DMA writing received packets into our buffers through the IOB. So: no interrupt flood, no IRQ priorities, etc...

That probably helps us a bit now.

Also, I found a pattern (at least, I hope so). A plain "ping" with default 64-byte packets, or even ping -s 1472, does not cause a lockup. What causes the lockup is massive transfer, and so more constant ring wraps. I can't be 100% sure that this is the case, but at least for now it looks like it.

Join us to improve dopus5!
AmigaOS4 on youtube
Re: x1000 onboard network opensource driver in progress
@kas1e

Maybe you need to add (more) cache flushing calls?

Re: x1000 onboard network opensource driver in progress
Do you do this DMA stuff all by yourself or are you using some OS functionality for it? If the first, maybe the OS uses pasemi DMA "stuff" itself and there's clash because of missing arbitration.
Like on AOS Classic if someone tried to use the blitter directly, without OwnBlitter()/DisownBlitter().

(Stumbled on https://wiki.amigaos.net/wiki/DMA_Resource, but X1000 isn't mentioned).

Re: x1000 onboard network opensource driver in progress
@ncafferkey
Quote:

Maybe you need to add (more) cache flushing calls?

Thanks, will check

Quote:

Do you do this DMA stuff all by yourself or are you using some OS functionality for it? If the first, maybe the OS uses pasemi DMA "stuff" itself and there's clash because of missing arbitration.


Yeah, by myself, without using pasemi_dma.resource (although it is present in the system). I may try removing it from the X1000 for tests, if the system still works without it (that is, if it isn't used somewhere in the kernel as a must).


@Balaton

So I was able to at least read the MSR and the real code at 0x200 (in supervisor mode, of course); see:

[mce] ===== MCE Probe v2 (MMU bypass) =====
[mce] MSR 0x0000B030  ME=1 DR=1 IR=1 PR=0
[mce] MCE vector at physical 0x200 (256 bytes):
  200: 7C7343A6 7C3243A6 7C6802A6 7C7143A6
  210: 3C600200 606379BC 78630020 7C6803A6
  220: 4E800020 7FE00008 7FE00008 7FE00008
  230: 7FE00008 7FE00008 7FE00008 7FE00008
  240: 7FE00008 7FE00008 7FE00008 7FE00008
  250: 7FE00008 7FE00008 7FE00008 7FE00008
  260: 7FE00008 7FE00008 7FE00008 7FE00008
  270: 7FE00008 7FE00008 7FE00008 7FE00008
  280: 7FE00008 7FE00008 7FE00008 7FE00008
  290: 7FE00008 7FE00008 7FE00008 7FE00008
  2A0: 7FE00008 7FE00008 7FE00008 7FE00008
  2B0: 7FE00008 7FE00008 7FE00008 7FE00008
  2C0: 7FE00008 7FE00008 7FE00008 7FE00008
  2D0: 7FE00008 7FE00008 7FE00008 7FE00008
  2E0: 7FE00008 7FE00008 7FE00008 7FE00008
  2F0: 7FE00008 7FE00008 7FE00008 7FE00008
[mce] ===== Done =====
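
For anyone who wants to double-check the flags printed above, here is a small sketch in C that decodes the low MSR bits from the raw value. The masks follow the usual PowerPC MSR layout; the helper names `msr_bit` and `msr_format` are made up for this example and are not part of the probe tool.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Masks for the low 16 bits of the PowerPC MSR. */
#define MSR_EE 0x8000u  /* external interrupts enabled */
#define MSR_PR 0x4000u  /* problem (user) state */
#define MSR_FP 0x2000u  /* floating point available */
#define MSR_ME 0x1000u  /* machine check enable */
#define MSR_IR 0x0020u  /* instruction address translation */
#define MSR_DR 0x0010u  /* data address translation */

/* Return 1 if the given flag is set in msr, otherwise 0. */
static int msr_bit(uint32_t msr, uint32_t mask)
{
    return (msr & mask) != 0;
}

/* Format the flags into out; returns the length snprintf reports. */
static int msr_format(uint32_t msr, char *out, size_t out_len)
{
    return snprintf(out, out_len, "EE=%d PR=%d FP=%d ME=%d IR=%d DR=%d",
                    msr_bit(msr, MSR_EE), msr_bit(msr, MSR_PR),
                    msr_bit(msr, MSR_FP), msr_bit(msr, MSR_ME),
                    msr_bit(msr, MSR_IR), msr_bit(msr, MSR_DR));
}
```

Feeding it 0x0000B030 gives ME, IR and DR set and PR clear, which agrees with the probe output: machine checks are enabled and the code runs in supervisor state with translation on.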


And when I disassemble it, we get the real MCE trampoline:

0x200: 7C7343A6  mtspr  SPRG3, r3      ; save r3 (scratch register)
0x204: 7C3243A6  mtspr  SPRG2, r1      ; save stack pointer
0x208: 7C6802A6  mflr   r3             ; r3 = LR
0x20C: 7C7143A6  mtspr  SPRG1, r3      ; save LR
0x210: 3C600200  lis    r3, 0x0200     ; \
0x214: 606379BC  ori    r3, r3, 0x79BC ;  > r3 = 0x020079BC
0x218: 78630020  clrldi r3, r3, 32     ; /  (zero-extend to 64-bit)
0x21C: 7C6803A6  mtlr   r3             ; LR = kernel handler
0x220: 4E800020  blr                   ; jump to 0x020079BC
0x224-0x2FF: 7FE00008  trap            ; safety padding (55 traps)


And I can also write to physical 0x7f000000 (tested with a random FACEBEEF word), and even after 3 (!) presses of the reset button in a row, I can do "d 0x7f000000" in CFE and still see this FACEBEEF there! Yahoo!

That means that now I just need to write a stub, and that's all: we will know what happens (at least with the registers) when we hit the machine-check vector.


Edited by kas1e on 2026/3/15 9:07:01
Edited by kas1e on 2026/3/15 9:32:02
Re: x1000 onboard network opensource driver in progress
@kas1e
If you already have a handler for MCE, it would probably dump state to serial, so maybe just check that the ME bit is enabled in MSR and you might get a dump without having to write your own handler.
I did not mean the register addresses for virt/phys translation, but the addresses of the buffers you tell the card to read/write. I don't know how this works, but there are several places where it could go wrong. I imagine that, if this is similar to rtl8139 or a USB controller, the driver writes descriptors to memory and tells the card to read them and execute the transfers they describe. When you write these descriptors you may need to translate addresses to physical, as the card only sees physical addresses. If you write a wrong address, it may read bad data, overwrite the wrong memory, or crash if the address is totally wrong. Also, after writing the descriptors you may need to flush the CPU cache before telling the card to go and read them; otherwise the card may read wrong data.
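
The ordering described above (translate, write the descriptor, flush, then notify the card) can be sketched in C as follows. Everything here is hypothetical: the descriptor layout, the `dma_ops` hooks and the demo stubs are made up to show the order of the steps, and they are not the pasemi hardware layout or any AmigaOS API. The demo translation simply adds a fixed offset, which a real system would not do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout, for illustration only. */
struct rx_desc {
    uint64_t buf_phys;     /* physical address the card will DMA into */
    uint32_t len;          /* buffer length in bytes */
    uint32_t owned_by_hw;  /* 1 = card may use this descriptor */
};

/* Platform hooks a driver would supply; all names are assumptions. */
struct dma_ops {
    uint64_t (*virt_to_phys)(const void *virt);
    void (*flush_dcache)(const void *start, size_t len);
    void (*ring_doorbell)(void);
};

/* Hand one receive buffer to the card, in the order that matters. */
static void post_rx_buffer(struct rx_desc *d, void *buf, uint32_t len,
                           const struct dma_ops *ops)
{
    d->buf_phys = ops->virt_to_phys(buf); /* 1. the card only sees physical */
    d->len = len;
    d->owned_by_hw = 1;
    ops->flush_dcache(d, sizeof *d);      /* 2. push the descriptor to RAM */
    ops->ring_doorbell();                 /* 3. only now tell the card */
}

/* Demo stubs that record the call order instead of touching hardware. */
static int g_log[8];
static int g_n;

static uint64_t demo_v2p(const void *v)
{
    g_log[g_n++] = 1;
    return (uint64_t)(uintptr_t)v + 0x10000000u; /* fake fixed offset */
}

static void demo_flush(const void *start, size_t len)
{
    (void)start;
    (void)len;
    g_log[g_n++] = 2;
}

static void demo_doorbell(void)
{
    g_log[g_n++] = 3;
}
```

Calling `post_rx_buffer` with the three demo stubs records the steps as 1, 2, 3. A real driver would likely also need to invalidate the cache over the receive buffer itself before reading what the card wrote, and may need memory barriers between the steps.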
If the DMA engine is shared with other users, you might need to go through the dma.resource to make sure different parts are not trying to use the same channel. I think on the A1222 one channel is reserved for network; maybe it's the same on the X1000?
Those are the things I think you should check, but I could be wrong. We could look at it if you published the source. The title says opensource driver and you used Linux code, so it should be GPL anyway. Maybe if you show the source, people could help better.
