@joerg
Something looks seriously broken with scsispeed.
I've used the following command:
scsispeed DRIVE=nvme.device:0 FAST BUF1=16777216 BUF2=16777216 BUF3=16777216 BUF4=16777216
(Yes, I know, 4 times 16MB. But it was for diagnostic purposes.)
I've tested the same command with my latest driver, both with and without debug printouts to the debug shell.
The debug driver prints a string with the following information for each read action:
- The timing of the IO command
- The timing of the NVME read handler
- Time between issued IO commands
- Length of the read size in bytes
So that's a debug terminal full of data.
Then scsispeed reports 220MB/s for all four tests. That low a speed is understandable, given the extra time spent printing the debug strings.
The debug printouts show that the reads are actually happening at close to maximum PCIe speed, which was expected given the small overhead for transfers of this size. The time between issued commands is about 8 µs.
But when I run the same benchmark with the debug printouts disabled, the reported speed drops to just 50MB/s.
So something is clearly off here.
A quick scan through the scsispeed source shows that there's some timer trickery happening, which according to the source is there to compensate for low timer resolution. But microsecond resolution is more than sufficient to time transfers of this size, so apparently the trickery either isn't working properly or overflows somewhere.
Anyway, the benchmark is just a low-level timed CMD_READ loop. I will write one myself, without the trickery, using up-to-date Exec calls against the latest SDK.
A simple ITimer->GetSysTime() before and after the DoIO(), followed by an ITimer->SubTime(), should do the trick.