x1000 onboard network opensource driver in progress: new version

Re: x1000 onboard network opensource driver in progress

Posted on: 3/13 7:08 #41

Just popping in

What if you change your interrupt code to always return 0?

And/or change interrupt priority (either to very high or very low)?

Edit: I would also add a static counter variable to the interrupt handler and debug-output it to serial every 1000 counts or so (unconditionally, i.e. even if the "interrupt is not for me"), to see what happens at lockup: maybe there is an interrupt flood.

I would generally also look more at what happens during lockup (the OS is still likely doing/executing something) not just how to reproduce it.

Edit2: One idea behind the tests mentioned is to check whether some other driver has installed an interrupt handler for the same IRQ that erroneously (sometimes) decides "yes, this IRQ is for me" when it isn't, returns != 0, and so causes the other interrupt handlers for the same IRQ not to be called anymore. Those other handlers then never get a chance to clear the IRQ-pending state, and from then on the interrupt keeps firing endlessly -> lockup.

Edited by Georg on 2026/3/13 11:57:20

LiveForIt

Re: x1000 onboard network opensource driver in progress

Posted on: 3/13 12:29 #42

Home away from home

@TSK

Boot from CD-ROM, and check that you don't have outdated tools in s:startup-sequence or user-startup.
Often when the OS is frozen it's still possible to get into the Amiga over a serial cable, but you need to set up a terminal on aux: using newshell.

(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.

Re: x1000 onboard network opensource driver in progress

Posted on: 3/13 18:52 #43

Home away from home

@Georg
Quote:

What if you change your interrupt code to always return 0?

You probably don't mean just return 0, since we still need to write to RXCH_RESET (if we don't, it will surely lock up). You mean stripping the interrupt handler down to the bare minimum, write + return 0, and seeing if that changes anything. Is that what you mean?

Quote:

And/or change interrupt priority (either to very high or very low)?

Yep, will check 128 / -128

Quote:

I would generally also look more at what happens during lockup (the OS is still likely doing/executing something) not just how to reproduce it.

You think the CPU is still alive even if video, mouse/keyboard and serial are all dead?

Quote:

One idea behind the tests mentioned is to check whether some other driver has installed an interrupt handler for the same IRQ that erroneously (sometimes) decides "yes, this IRQ is for me" when it isn't, returns != 0, and so causes the other interrupt handlers for the same IRQ not to be called anymore. Those other handlers then never get a chance to clear the IRQ-pending state, and from then on the interrupt keeps firing endlessly -> lockup.

That's worth trying too, of course. Thanks, I will check all of this out.

Another idea came to me while reading yours: memory can survive the "reset" button after a lockup. So from the IRQ handler we could record a counter, which line was reached, maybe a timestamp; when the lockup happens, press reset, boot with no s-s, read that memory back, and see the last N interrupt events before the lockup. Could that work? I can do a simple test: write some junk to a fixed address, press reset, and immediately after boot check whether the value is still there. If yes, the idea can work. What do you think?
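A minimal sketch of such a reset-surviving trace, in portable C rather than AmigaOS code. Everything here is an assumption made for illustration: the names (`trace_log`, `trace_record`, ...), the magic value, and the 16-entry size are hypothetical, and on real hardware the log would have to live at a fixed address that survives reset and be flushed out of the data cache after each write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TRACE_MAGIC   0x54524143u  /* "TRAC": arbitrary marker (assumption) */
#define TRACE_ENTRIES 16u          /* hypothetical size, must be a power of two */

struct trace_event {
    uint32_t counter;   /* running interrupt count */
    uint32_t line;      /* source line reached, e.g. __LINE__ */
    uint32_t stamp;     /* timestamp in whatever unit is available */
};

struct trace_log {
    uint32_t magic;                        /* TRACE_MAGIC once initialised */
    uint32_t head;                         /* total number of events written */
    struct trace_event ev[TRACE_ENTRIES];  /* ring of the last N events */
    uint32_t check;                        /* XOR checksum over head and ev[] */
};

static uint32_t trace_checksum(const struct trace_log *log)
{
    uint32_t sum = log->head;
    for (uint32_t i = 0; i < TRACE_ENTRIES; i++)
        sum ^= log->ev[i].counter ^ log->ev[i].line ^ log->ev[i].stamp;
    return sum;
}

/* Call once before the test run starts. */
void trace_init(struct trace_log *log)
{
    memset(log, 0, sizeof(*log));
    log->magic = TRACE_MAGIC;
    log->check = trace_checksum(log);
}

/* Call from the handler; overwrites the oldest entry once the ring is full. */
void trace_record(struct trace_log *log, uint32_t counter,
                  uint32_t line, uint32_t stamp)
{
    assert(log->magic == TRACE_MAGIC);
    struct trace_event *e = &log->ev[log->head & (TRACE_ENTRIES - 1u)];
    e->counter = counter;
    e->line    = line;
    e->stamp   = stamp;
    log->head++;
    log->check = trace_checksum(log);
}

/* Call after the reboot: non-zero only if the memory still holds a sane log. */
int trace_is_valid(const struct trace_log *log)
{
    return log->magic == TRACE_MAGIC && log->check == trace_checksum(log);
}
```

The magic plus checksum is what separates "the value survived reset" from "RAM happens to contain garbage", which is the same question the simple write-and-check test above answers for a single word.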

Also, thinking about it more: if nothing else helps, the X1000 does have a COP (debug) header. The TRM says "The COP (debugger) header is provided for factory test purposes and its use is not recommended.", but we know what those "factory test purposes" mean: CPU tests :) Right now I don't know which PPC JTAG debuggers support the PA6T at all; Varisys of course knows, but I don't :) I could also go with OpenOCD + an FTDI-based JTAG adapter, but I don't know whether OpenOCD supports the PA6T-1682M. That's for later, though; first I will try everything I can from the software side.

Join us to improve dopus5!
AmigaOS4 on youtube

Re: x1000 onboard network opensource driver in progress

Posted on: 3/13 19:19 #44

Just popping in

@kas1e
Quote:

kas1e wrote:
@Georg
You probably don't mean just return 0, since we still need to write to RXCH_RESET (if we don't, it will surely lock up). You mean stripping the interrupt handler down to the bare minimum, write + return 0, and seeing if that changes anything. Is that what you mean?

No, don't strip/change code in the interrupt handler. Only change the return value to always be 0. Unless things are different in AOS4 I think it used to be so that an interrupt handler returns TRUE or 1 if the interrupt handler found out that "yes, the interrupt was for me" and FALSE or 0 if the interrupt handler finds out that it "was not for me".

That return value should really just be an optimization for the OS (to call fewer interrupt handlers if it thinks, or is told, that the interrupt was already handled by the handler it is currently calling). But I think it should be safe to always return 0 and so cause the other interrupt handlers of the IRQ to be called anyway. In theory, the interrupt handlers for a shareable IRQ need to handle being called even when the IRQ was triggered by a device other than their own.

What does your interrupt handler return at the moment?

Quote:

Yep, will check 128 / -128

127 / -128

Quote:

You think that CPU still alive even if video dead, mouse/keyboard dead, and serial dead ?

My guess is that it likely is. Would try to run a little program in the background which installs maybe a vertical blank interrupt and periodically outputs something to serial. Don't know if vertb is best for this. It should be some interrupt that has higher (hw) priority than those external device (network/audio/...) interrupts.

Re: x1000 onboard network opensource driver in progress

Posted on: 3/13 21:52 #45

Home away from home

@Georg
Quote:

What does your interrupt handler return at the moment?

It was 1. I tried 0: still locks up, so I reverted to 1 again.

Quote:

127 / -128

Originally I set 10. Now I tried both 127 and -128: it locks up with both values too :(

Quote:

Would try to run a little program in the background which installs maybe a vertical blank interrupt and periodically outputs something to serial.

So I did this simple one:

static ULONG WatchdogHandler(struct ExceptionContext *ctx, struct ExecBase *SysBase, APTR data)
{
    (void)ctx;
    (void)SysBase;

    struct WatchdogData *wd = (struct WatchdogData *)data;

    wd->tick++;

    /* Print approximately once per second.
     * VERTB fires at ~50 Hz on most display modes. */
    if (wd->tick - wd->last_tick >= 50)
    {
        wd->last_tick = wd->tick;
        wd->seconds++;

        wd->IExec->DebugPrintF("[WD] alive: %lu s (tick=%lu)\n",
                               wd->seconds, wd->tick);
    }

    return 0;   /* not ours: let the other VERTB servers run */
}

and

    struct Interrupt handler;
    handler.is_Node.ln_Type = NT_INTERRUPT;
    handler.is_Node.ln_Pri  = 127;
    handler.is_Node.ln_Name = (STRPTR)"pa6t_eth watchdog";
    handler.is_Data         = (APTR)&wd;
    handler.is_Code         = (VOID (*)())WatchdogHandler;

    IExec->AddIntServer(INTB_VERTB, &handler);

Then I started it with "watchdog &" and ran the stress test over the network => lockup. All I got was just 2 lines before everything died:

[stress] 8/8 connections open.  Starting receive...
[stress]  Time | Received  | Est.pkts  | Est.wraps | Active
[stress] ------|-----------|-----------|-----------|-------
[WD] alive: 39 s (tick=1950)
[WD] alive: 40 s (tick=2000)

Damn! :)

Maybe I need to make it print, say, once every 1/4 second, so that it has time to print something after the cause of the lockup happens but before everything dies?


Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 7:06 #46

Home away from home

@all
I just added a watchdog monitor (also on VERTB) into the driver itself, printing _ALL_ status registers every 1/4 second. By all I mean all the IOB ones, all the MAC ones, all the DMA engine ones, all the DMA-interface-RX ones and all the RX/TX channel ones, and while running I also dump the state of all DMA channels, just in case. Result: all the state registers look correct, all of them.


Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 8:15 #47

Just popping in

Quote:

still lockup

Those changes were mostly a theory in case the irq is shared with other device drivers. Is there a tool (scout? sysmon?) where you can check if there are other installed interrupt handler for the irq your driver is using, or if your driver is the only one using the specific irq?

If it's not the only one, maybe try what happens if you disable the driver (sound? whatever) that's using same irq.

Maybe another thing you can try is to add a counter and a little serial debug output at beginning of your interrupt and at the end before the return. Don't know how many interrupts are typically happening during net transfer, so don't know if you have to do the output every 1000 ticks, every 10000 ticks or whatever.
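A minimal sketch of that enter/exit counter idea, written as portable C so it can be tried outside the driver. The names (`irq_stats`, `fake_irq_handler`) and the use of `snprintf` are hypothetical stand-ins; in the real handler the counters would be static variables and the report would go to serial debug output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct irq_stats {
    uint32_t enter;        /* incremented first thing in the handler */
    uint32_t exit;         /* incremented just before the return */
    uint32_t report_every; /* emit one line per this many entries */
    uint32_t reports;      /* number of lines emitted so far */
};

/* Counts the entry; returns 1 if a report line is due for this call. */
static int irq_enter(struct irq_stats *s)
{
    assert(s->report_every > 0);
    s->enter++;
    return (s->enter % s->report_every) == 0;
}

static void irq_exit(struct irq_stats *s)
{
    s->exit++;
}

/* Stand-in for the real interrupt handler body. */
int fake_irq_handler(struct irq_stats *s, char *out, size_t out_len)
{
    int due = irq_enter(s);

    if (due) {
        /* Sampled output: one line per report_every interrupts. */
        snprintf(out, out_len, "[IRQ] enter=%lu exit=%lu",
                 (unsigned long)s->enter, (unsigned long)s->exit);
        s->reports++;
    }

    /* ... device-specific work would go here ... */

    irq_exit(s);
    return due;
}
```

If the last line before a lockup shows `enter` running far ahead of `exit`, the handler got stuck inside; if the line rate explodes, that points to an interrupt flood. Choosing `report_every` is the trade-off mentioned above: too small and the serial output itself distorts timing, too large and the last report is stale.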

Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 8:30 #48

Just popping in

@kas1e

Instead of the VERTB watchdog, maybe also try a timer.device softint loop, as that may rely less on "external" stuff (gfx card, bus / pci / ??):

68k sample code From Amiga Dev CD 2.1:


#include <exec/memory.h>
#include <exec/interrupts.h>
#include <devices/timer.h>
#include <dos/dos.h>
#include <clib/exec_protos.h>
#include <clib/dos_protos.h>
#include <clib/alib_protos.h>
#include <stdio.h>

#define MICRO_DELAY 1000
#define OFF     0
#define ON      1
#define STOPPED 2

struct TSIData {
    ULONG tsi_Counter;
    ULONG tsi_Flag;
    struct MsgPort *tsi_Port;
};

struct TSIData *tsidata;

void tsoftcode(void);    /* Prototype for our software interrupt code */

void main(void)
{
    struct MsgPort *port;
    struct Interrupt *softint;
    struct timerequest *tr;

    ULONG endcount;

    /* Allocate message port, data & interrupt structures. Don't use CreatePort() */
    /* or CreateMsgPort() since they allocate a signal (don't need that) for a    */
    /* PA_SIGNAL type port. We need PA_SOFTINT.                                   */
    if (tsidata = AllocMem(sizeof(struct TSIData), MEMF_PUBLIC|MEMF_CLEAR))
    {
        if(port = AllocMem(sizeof(struct MsgPort), MEMF_PUBLIC|MEMF_CLEAR))
        {
            NewList(&(port->mp_MsgList));                             /* Initialize message list */
            if (softint = AllocMem(sizeof(struct Interrupt), MEMF_PUBLIC|MEMF_CLEAR))
            {
                softint->is_Code = tsoftcode;    /* The software interrupt routine */
                softint->is_Data = tsidata;
                softint->is_Node.ln_Pri = 0;

                port->mp_Node.ln_Type = NT_MSGPORT;       /* Set up the PA_SOFTINT message port  */
                port->mp_Flags = PA_SOFTINT;              /* (no need to make this port public). */
                port->mp_SigTask = (struct Task *)softint;     /* pointer to interrupt structure */

                /* Allocate timerequest */
                if (tr = (struct timerequest *) CreateExtIO(port, sizeof(struct timerequest)))
                {
                    /* Open timer.device. NULL is success. */
                    if (!(OpenDevice("timer.device", UNIT_MICROHZ, (struct IORequest *)tr, 0)))
                    {
                        tsidata->tsi_Flag = ON;        /* Init data structure to share globally. */
                        tsidata->tsi_Port = port;

                        /* Send of the first timerequest to start. IMPORTANT: Do NOT   */
                        /* BeginIO() to any device other than audio or timer from      */
                        /* within a software or hardware interrupt. The BeginIO() code */
                        /* may allocate memory, wait or perform other functions which  */
                        /* are illegal or dangerous during interrupts.                 */
                        printf("starting softint. CTRL-C to break...\n");

                        tr->tr_node.io_Command = TR_ADDREQUEST;    /* Initial iorequest to start */
                        tr->tr_time.tv_micro = MICRO_DELAY;        /* software interrupt.        */
                        BeginIO((struct IORequest *)tr);

                        Wait(SIGBREAKF_CTRL_C);
                        endcount = tsidata->tsi_Counter;
                        printf("timer softint counted %ld milliseconds.\n", endcount);

                        printf("Stopping timer...\n");
                        tsidata->tsi_Flag = OFF;

                        while (tsidata->tsi_Flag != STOPPED) Delay(10);

                        CloseDevice((struct IORequest *)tr);
                    }
                    else printf("couldn't open timer.device\n");
                    DeleteExtIO(tr);
                }
                else printf("couldn't create timerequest\n");
                FreeMem(softint, sizeof(struct Interrupt));
            }
            FreeMem(port, sizeof(struct MsgPort));
        }
        FreeMem(tsidata, sizeof(struct TSIData));
    }
}

void tsoftcode(void)
{
    struct timerequest *tr;

    /* Remove the message from the port. */
    tr = (struct timerequest *)GetMsg(tsidata->tsi_Port);

    /* Keep on going if main() hasn't set flag to OFF. */
    if ((tr) && (tsidata->tsi_Flag == ON))
    {
        /* increment counter and re-send timerequest--IMPORTANT: This         */
        /* self-perpetuating technique of calling BeginIO() during a software */
        /* interrupt may only be used with the audio and timer device.        */
        tsidata->tsi_Counter++;
        tr->tr_node.io_Command = TR_ADDREQUEST;
        tr->tr_time.tv_micro = MICRO_DELAY;
        BeginIO((struct IORequest *)tr);
    }
    /* Tell main() we're out of here. */
    else tsidata->tsi_Flag = STOPPED;
}

joerg

Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 15:47 #49

Home away from home

@Georg
Quote:

No, don't strip/change code in the interrupt handler. Only change the return value to always be 0. Unless things are different in AOS4 I think it used to be so that an interrupt handler returns TRUE or 1 if the interrupt handler found out that "yes, the interrupt was for me" and FALSE or 0 if the interrupt handler finds out that it "was not for me".

I'm not sure anymore, but I think the return value is ignored for shareable interrupts like the PCI ones and all interrupt handlers in the list are always called.
Reason: More than one PCI device using the same IRQ number may cause an IRQ at the same time.
A PCI device interrupt handler has to clear the PCI device interrupt if it was an interrupt for its own device, but must not clear the CPU exception used for it; that's done by the OS after all interrupt handlers in the list have been called.
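The two dispatch models discussed in this thread can be compared with a small simulation. This is a sketch in portable C, not AmigaOS code: `dispatch_irq`, the policy names and the two example handlers are all hypothetical, and which policy AmigaOS 4 really applies to shared PCI interrupts is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

/* A server returns non-zero to claim the interrupt ("it was for me"). */
typedef int (*irq_fn)(void *data);

struct irq_server {
    irq_fn fn;
    void  *data;
};

enum dispatch_policy {
    STOP_AT_FIRST_CLAIM, /* return value ends the walk (Georg's model) */
    CALL_ALL             /* return value ignored (joerg's model for shared PCI) */
};

/* Walks the chain in list order; returns how many servers were called. */
size_t dispatch_irq(const struct irq_server *chain, size_t n,
                    enum dispatch_policy policy)
{
    size_t called = 0;

    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int claimed = chain[i].fn(chain[i].data);
        called++;
        if (claimed && policy == STOP_AT_FIRST_CLAIM)
            break;
    }
    return called;
}

struct fake_device {
    int pending; /* device-level interrupt still asserted */
    int calls;   /* how often its handler ran */
};

/* Buggy: claims every interrupt without checking its own device. */
int greedy_handler(void *data)
{
    struct fake_device *d = data;
    d->calls++;
    return 1;
}

/* Correct: claims only if its device is pending, and clears it. */
int polite_handler(void *data)
{
    struct fake_device *d = data;
    d->calls++;
    if (!d->pending)
        return 0;
    d->pending = 0;
    return 1;
}
```

With a greedy handler ahead of the NIC handler, `STOP_AT_FIRST_CLAIM` leaves the NIC's `pending` flag set forever, which is the endless-interrupt scenario; under `CALL_ALL` the NIC handler still runs and clears it.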

Quote:

Is there a tool (scout? sysmon?) where you can check if there are other installed interrupt handler for the irq your driver is using, or if your driver is the only one using the specific irq?

There is a 20 years old port of Scout. AFAIK it's the only tool which can display the interrupt handler lists.
There is a comment from 2009 that it doesn't work on a Sam440ep.
I don't know if it's hardware related, or if it doesn't work at all on current AmigaOS 4.1 versions anymore and would have to be updated.

Quote:

Instead of the VERTB watchdog, maybe also try a timer.device softint loop, as that may rely less on "external" stuff (gfx card, bus / pci / ??):

Unlike on AmigaOS <= 3.x (using a custom chip timer) the AmigaOS 4.x timer.device doesn't depend on anything external, it's based on the PowerPC CPU timers.

Edited by joerg on 2026/3/14 16:04:12

Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 17:02 #50

Just popping in

Ranger shows each PCI device’s interrupt number. AFAIR the devices involved in NIC usage each have unique interrupt numbers, so I don’t think it’s likely some other random device is getting in the way.

However, with the driver needing several of the PCI devices, is it possible that one of them is triggering an interrupt you’re not expecting, e.g. for a statistics counter?

joerg

Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 17:32 #51

Home away from home

@ncafferkey
Quote:

Ranger shows each PCI device’s interrupt number.

But it can't display the list of interrupt handlers, unlike Scout for example.

Quote:

AFAIR the devices involved in NIC usage each have unique interrupt numbers, so I don’t think it’s likely some other random device is getting in the way.

Even on the AmigaOne and Sam440/460 (emulation) PCI IRQs might be shared, but on the Pegasos2 (emulation) all PCI devices use the same IRQ number and everything (PATA, SATA, XHCI USB, NIC, sound, gfx, etc.) uses the same shared IRQ and interrupt handler list.

Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 18:58 #52

Home away from home

@Georg, Joerg

I went heavy and just wrote a small tool which lists all the interrupts we have in the system; by all I mean _all_.
For the first 16 (vectors 0-15) I just use ExecBase->IntVects[16]; the next 4 (16-19) I skip (as far as I can see from the includes those are PCI_INTERRUPT_LINE);
and everything from 20 up I probe via AddIntServer. In the end I just scan from 0 to 10000, and this is what we have on the X1000
(some of the first 16 I named myself, from hardware/intbits.h in the SDK; I don't know whether those names really apply):


=== Interrupt Handler Listing (vectors 0 - 10000) ===

Extended vectors: list_a=6FF7CBE0 list_b=6FF7CBF0 stride=16 base=6FF7BF60 max=1055

Vector    0 [    TBE]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    1 [ DSKBLK]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    2 [SOFTINT]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    3 [  PORTS]: iv_Code=020186B0 iv_Data=6FF7C000 iv_Node=00000000  (empty)
Vector    4 [  COPER]: iv_Code=020186B0 iv_Data=6FF7C020 iv_Node=00000000  (empty)
Vector    5 [  VERTB]: iv_Code=020186B0 iv_Data=6FF7C010 iv_Node=00000000  2 handler(s)
  pri= 127  code=7FBA6F6C  data=6231DA58  name="pa6t_eth monitor"
  pri=  10  code=023C8748  data=6FFA3420  name="graphics.library"
Vector    6 [   BLIT]: iv_Code=00000000 iv_Data=6FFA3420 iv_Node=6FFA3496  (empty)
Vector    7 [   AUD0]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    8 [   AUD1]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector    9 [   AUD2]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   10 [   AUD3]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   11 [    RBF]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   12 [ DSKSYN]: iv_Code=00000000 iv_Data=00000000 iv_Node=00000000  (empty)
Vector   13 [  EXTER]: iv_Code=020186B0 iv_Data=6FF7C030 iv_Node=00000000  (empty)
Vector   14 [  INTEN]: iv_Code=020186B0 iv_Data=6FF7C050 iv_Node=00000000  (empty)
Vector   15 [    NMI]: iv_Code=020186B0 iv_Data=6FF7C040 iv_Node=00000000  (empty)
Vector   16: PCI INTA (shared PCI interrupt line, not probed)
Vector   17: PCI INTB (shared PCI interrupt line, not probed)
Vector   18: PCI INTC (shared PCI interrupt line, not probed)
Vector   19: PCI INTD (shared PCI interrupt line, not probed)
Vector   64: 1 handler(s)
  pri= 100  code=023F98FC  data=6FFA4000  name="VBInt PCIGraphics.card (0)"
Vector  169: 1 handler(s)
  pri=  10  code=7FBA6EA4  data=6231DA58  name="pa6t_eth.device"
Vector 1041: 1 handler(s)
  pri=   0  code=02006E64  data=00000000  name="(null)"
Vector 1049: 3 handler(s)
  pri=  10  code=025DB130  data=6F872AC0  name="OHCI Interrupt Handler"
  pri=   5  code=0259B044  data=6FFFD040  name="sb600sata IRQ handler"
  pri=   0  code=7FBAD6FC  data=619481A0  name="(null)"
Vector 1050: 2 handler(s)
  pri=  10  code=025DB130  data=6F872C40  name="OHCI Interrupt Handler"
  pri=  10  code=025DB130  data=6F336050  name="OHCI Interrupt Handler"
Vector 1051: 2 handler(s)
  pri=  10  code=025DB130  data=6F872DC0  name="OHCI Interrupt Handler"
  pri=  10  code=025DB130  data=6F3361D0  name="OHCI Interrupt Handler"
Vector 1052: 1 handler(s)
  pri=  10  code=025E6264  data=6F336350  name="EHCI Interrupt Handler"

=== Done: 23 valid vectors, 13 handlers found ===

This also means that we have nothing else on our interrupt 169 (169 because 144 is the DMA engine, + 20 for the TX DMA channels, + 5 for the RX DMA channel we use).

Next, I tried Georg's second suggestion about adding a counter at the beginning and at the end. With VERTB at first: I print 4 times per second, and because the RX interrupt currently fires about 8500 times per second, that is ~2100 interrupts per print.

At the end:


[stress]   146s | 1931561 KB |   1466657 |     22916 | 8
[IRQ] enter=1340602 exit=1340290 miss=312
[IRQ] enter=1342175 exit=1341863 miss=312
[IRQ] enter=1343793 exit=1343481 miss=312
[IRQ] enter=1345368 exit=1345055 miss=313
[IRQ] enter=1346943 exit=1346629 miss=314
[IRQ] enter=1348518 exit=1348203 miss=315
[stress]   147s | 1944510 KB |   1476521 |     23070 | 8
[IRQ] enter=1350091 exit=1349776 miss=315
[IRQ] enter=1351665 exit=1351350 miss=315
[IRQ] enter=1353238 exit=1352923 miss=315
[IRQ] enter=1354813 exit=1354498 miss=315
[IRQ] enter=1356425 exit=1356109 miss=316
[IRQ] enter=1358101 exit=1357785 miss=316
[stress]   148s | 1957912 KB |   1486742 |     23230 | 8
[IRQ] enter=1359676 exit=1359359 miss=317
[IRQ] enter=1361235 exit=1360918 miss=317

***BURN CPU DIE****

I.e. clean right up to the end. enter - exit - miss = 0 till the very last line, so handler never got stuck inside and the cpu just died. No stuck handler, no interrupt flood...
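The "enter - exit - miss = 0" check can be automated over a captured serial log. A minimal sketch in portable C, assuming the exact `[IRQ] enter=... exit=... miss=...` line format shown above; the function name `irq_line_balanced` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Checks one captured log line.
 * Returns  1 if enter - exit == miss (every entry is accounted for),
 *          0 if the counters do not balance,
 *         -1 if the line is not an [IRQ] line at all. */
int irq_line_balanced(const char *line)
{
    unsigned long enter, exit_count, miss;

    assert(line != NULL);
    if (sscanf(line, "[IRQ] enter=%lu exit=%lu miss=%lu",
               &enter, &exit_count, &miss) != 3)
        return -1;

    return (enter - exit_count) == miss;
}
```

Feeding every line of the capture through this and stopping at the first `0` would show whether the handler was ever stuck between entry and exit; for the lines posted here it never is.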

Next, the third suggestion: I replaced VERTB with a timer softint. I just tried it as an external app; there is probably not much sense in putting it inside the driver as I did with VERTB?

And the result: a completely dead CPU:


[stress]     7s |   93358 KB |     70906 |      1107 | 8
[IRQ] enter=4450762 exit=4448752 miss=2010
[SWD] alive 2573
[IRQ] enter=4452409 exit=4450398 miss=2011
[SWD] alive 2574
[IRQ] enter=4453961 exit=4451950 miss=2011

That means both VERTB and the timer.device softint died together. So the CPU is really dead, which means it's something very, very heavy.

Next things I can think of for now:

1). As I understand it, when the CPU dies like that, what happens is a machine check (vector 0x200 on all PPC, including our PA6T). If something causes a machine check, the register state is saved and the CPU jumps to 0x200, and if the vector is corrupted -> boom. The idea: we add a little stub right at the beginning of the machine check vector which dumps the registers to some memory area that can survive the reset button. Then we boot after the reset with "no s-s" and just read the data from that address. Or we can probably even read it from CFE itself. What do you think?

I simply tried these:

    BOOL ok_bus = IExec->AddIntServer(TRAPNUM_BUS_ERROR, &trapBus);
    BOOL ok_dsi = IExec->AddIntServer(TRAPNUM_DATA_SEGMENT_VIOLATION, &trapDSI);
    BOOL ok_isi = IExec->AddIntServer(TRAPNUM_INST_SEGMENT_VIOLATION, &trapISI);

But no luck, of course, as more code is involved in calling them. So, is it possible to install a stub right at 0x200, bypassing everything? Maybe even from CFE, before booting OS4?

2). Add a "polling" variant to the driver. No interrupts involved, no IOB, just minimal DMA (as the MAC will not work without it). It sucks, of course, but first, it will work and can be used somehow (or maybe not), and second, we will know for sure whether the issue is the use of IOB/etc. together or the plain transferring of data, and a lighter version will be easier to debug.

3). Buy a JTAG debugger and check the state when the CPU dies.

I also need to add, that in Linux driver there is known "Errata 5971" (see in pasemi_mac.c) :


    if (n > RX_RING_SIZE) {
        /* Errata 5971 workaround: L2 target of headers */
        write_iob_reg(PAS_IOB_COM_PKTHDRCNT, 0);
        n &= (RX_RING_SIZE-1);
    }

I of course added this workaround too, and confirmed by tests that everything is OK after it. So while this erratum looks like the kind of thing that could cause such issues, I am sure it is handled, and I checked that via all the register states, etc.

There is more info about it, just in case: https://lists.ozlabs.org/pipermail/lin ... /2007-October/043601.html
(from there you can go through the prev/next messages, all about the Linux version of the network driver)
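The `n &= (RX_RING_SIZE-1)` line in the snippet above relies on the ring size being a power of two, so that a mask can replace a modulo. A minimal sketch of that technique in portable C; `RING_SIZE` of 64 and the function names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 64u   /* hypothetical value; must be a power of two */

static int is_power_of_two(uint32_t v)
{
    return v != 0 && (v & (v - 1u)) == 0;
}

/* Maps a free-running descriptor counter to a slot in the ring.
 * Equivalent to counter % RING_SIZE, but only for power-of-two sizes. */
uint32_t ring_slot(uint32_t counter)
{
    assert(is_power_of_two(RING_SIZE));
    return counter & (RING_SIZE - 1u);
}

/* Number of complete trips around the ring for that counter. */
uint32_t ring_wraps(uint32_t counter)
{
    return counter / RING_SIZE;
}
```

If the ring size were ever changed to a value that is not a power of two, the mask would silently map several counters to the wrong slot, which is why the sketch asserts on it.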

Edited by kas1e on 2026/3/14 19:34:16
Edited by kas1e on 2026/3/14 19:41:40
Edited by kas1e on 2026/3/14 19:48:59


Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 20:58 #53

Just popping in

@joerg
Quote:

joerg wrote:
@ncafferkey
Quote:
Ranger shows each PCI device’s interrupt number.
But it can't display the list of interrupt handlers, unlike Scout for example.

Yes, a live list is better, but there was a suggestion in a previous post that Scout might not work on current OS4, so Ranger's PCI list may be better than nothing. But in fact, I just remembered that Ranger has a list of interrupt handlers too (Exec->IntHandlers).

Quote:

Quote:
AFAIR the devices involved in NIC usage each have unique interrupt numbers, so I don’t think it’s likely some other random device is getting in the way.
Even on the AmigaOne and Sam440/460 (emulation) PCI IRQs might be shared, but on the Pegasos2 (emulation) all PCI devices use the same IRQ number and everything (PATA, SATA, XHCI USB, NIC, sound, gfx, etc.) uses the same shared IRQ and interrupt handler list.

I was referring specifically to the X1000's built-in NIC. I know NICs in general often share IRQ numbers.

balaton

Re: x1000 onboard network opensource driver in progress

Posted on: 3/14 23:38 #54

Just can't stay away

@kas1e
Quote:

1). As I understand it, when the CPU dies like that, what happens is a machine check (vector 0x200 on all PPC, including our PA6T). If something causes a machine check, the register state is saved and the CPU jumps to 0x200, and if the vector is corrupted -> boom. The idea: we add a little stub right at the beginning of the machine check vector which dumps the registers to some memory area that can survive the reset button. Then we boot after the reset with "no s-s" and just read the data from that address. Or we can probably even read it from CFE itself. What do you think?

If it's dead with no interrupts still running (not even timer interrupts work), then this looks like the way to go. Doesn't AmigaOS install some handler for MCE that dumps state? You can look at 0x200 from a debugger to see if there's any code there. I think you can't set it before AmigaOS boots, as AmigaOS might replace it, but you should be able to install your own handler by overwriting 0x200 with a jump to it. When it is called, better not rely on any of the OS still working: your handler should be some simple assembly that dumps state to serial (or somewhere else) using only the CPU, without calling any routines, just writing to the serial registers.
I don't know if this applies to PA6T but a PowerPC manual says this about MCE:
Quote:

The causes for machine check exceptions are implementation-dependent, but typically these causes are related to conditions such as bus parity errors or attempting to access an invalid physical address. The machine check exception is disabled when MSR[ME] = 0. If a machine check exception condition exists and the ME bit is cleared, the processor goes into the checkstop state.

So it can be disabled: check that the ME bit is set so that the handler gets called, and see whether AmigaOS already has a handler. It also says the cause can be an access to an invalid physical address, so check everything in the driver that involves physical addresses. Maybe you're missing a virtual-to-physical translation somewhere: DMA uses physical addresses while the CPU uses virtual ones, but I don't know whether AmigaOS uses a one-to-one mapping or needs address translation for DMA addresses.
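The ME check described above boils down to testing one bit of the MSR. A sketch in portable C that works on an MSR value you have already obtained (for example from a debugger or a crash dump); reading the MSR on the live machine needs supervisor mode (`mfmsr`) and is not shown. The mask 0x1000 is the architected position of MSR[ME]; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_ME 0x00001000u  /* Machine Check Enable bit of the MSR */

enum mc_outcome {
    MC_TAKES_VECTOR, /* ME = 1: machine check exception, handler at the vector runs */
    MC_CHECKSTOP     /* ME = 0: processor enters the checkstop state */
};

/* What the quoted manual text says happens on a machine check condition,
 * given the MSR value in effect at that moment. */
enum mc_outcome machine_check_outcome(uint32_t msr)
{
    return (msr & MSR_ME) ? MC_TAKES_VECTOR : MC_CHECKSTOP;
}
```

A checkstop would match the symptoms in this thread (no interrupts, no serial output at all), so confirming that ME is set is a cheap first step before investing in a custom handler at the vector.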

https://qmiga.codeberg.page/

Re: x1000 onboard network opensource driver in progress

Posted on: 3/15 0:06 #55

Just popping in

@balaton

OS4 has different virtual and physical addresses. GetDMAList() can provide the translation.

Re: x1000 onboard network opensource driver in progress

Posted on: 3/15 6:28 #56

Home away from home

@ncafferkey
Quote:

Yes, a live list is better, but there was a suggestion in a previous post that Scout might not work on current OS4, so Ranger's PCI list may be better than nothing. But in fact, I just remembered that Ranger has a list of interrupt handlers too (Exec->IntHandlers).

Ranger didn't show much in this regard (not the first 16, for sure), and it also didn't show anything past 100. I checked it: it shows you only the first 100, and in reality excluding the first 16.

My approach, as I showed, queries everything, any amount: the first 16, 4 skipped, and all the others up to any number, so past 100 too, and there we find my own 169, and the USB and sb600sata interrupts, etc. If anyone needs it I can share the binary/source.

Quote:

I was referring specifically to the X1000's built-in NIC. I know NICs in general often share IRQ numbers.

As far as I can see, not in the case of the X1000.

Quote:

OS4 has different virtual and physical addresses. GetDMAList() can provide the translation.

Yes, but not for the EMAC ones at 0xe0000000: those always stay the same (they are of course mapped by CFE at start, but after that they are always the same for us, look like physical addresses, and we can write to them directly).

@balaton
Thanks, will try this all out.

@All
Now I have removed ALL interrupt-based code, absolutely all of it. I kept only the polling/timer.device based code (which still has to use DMA, of course), and I got the same lockup!! So it's the DMA data transfer! The only thing still active in the polling version is the DMA writing received packets into our buffers through the IOB. So: no interrupt flood, no IRQ priorities, etc.

That probably helps us a bit now.

Also, I found a pattern (at least, I hope so). A plain "ping" with the default 64-byte packets, or even with ping -s 1472, does not cause a lockup. What causes the lockup is massive transfer, i.e. more constant ring wraps. I can't be 100% sure that this is the case, but at least for now it looks like it.
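To put rough numbers on "more constant ring wraps": in the stress log posted earlier, between the 146 s and 147 s rows Est.pkts grows by 9864 and Est.wraps by 154, a ratio of about 64 packets per wrap. The ring size of 64 is therefore inferred from that log, not stated anywhere, and the one-packet-per-second ping rate is likewise an assumption. The helper names are hypothetical.

```c
#include <assert.h>

/* Complete ring wraps per second at a given packet rate. */
unsigned long wraps_per_second(unsigned long packets_per_second,
                               unsigned long ring_size)
{
    assert(ring_size > 0);
    return packets_per_second / ring_size;
}

/* Whole seconds needed for one wrap at a given (slow) packet rate. */
unsigned long seconds_per_wrap(unsigned long packets_per_second,
                               unsigned long ring_size)
{
    assert(packets_per_second > 0);
    return ring_size / packets_per_second;
}
```

Under those assumptions the stress test wraps the ring about 154 times per second, while a once-per-second ping needs about a minute for a single wrap. That difference of roughly four orders of magnitude fits the observation that ping alone never triggers the lockup, without proving that wraps are the cause.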


Re: x1000 onboard network opensource driver in progress

Posted on: 3/15 7:36 #57

Just popping in

@kas1e

Maybe you need to add (more) cache flushing calls?

Re: x1000 onboard network opensource driver in progress

Posted on: 3/15 7:57 #58

Just popping in

Do you do this DMA stuff all by yourself, or are you using some OS functionality for it? If the former, maybe the OS uses the pasemi DMA "stuff" itself and there's a clash because of missing arbitration.
Like on classic AmigaOS if someone tried to use the blitter directly, without OwnBlitter()/DisownBlitter().

(Stumbled on https://wiki.amigaos.net/wiki/DMA_Resource, but X1000 isn't mentioned).