Board index » All Posts (geennaam)




Re: NVMe device driver
Quite a regular


@Rolar

No, I use the OS4 copy command. And I do not use a SPEED option but the BUF option.

Can I create and format a FAT or NTFS partition with AmigaOS4? Or do I have to do so on another OS?



Re: NVMe device driver


@Rolar

The filesystem itself. Apparently the Amiga implementation is slow. Do you manage 1GB/s on the X5000 with Linux? Or on a PC?



Re: NVMe device driver


@Rolar

No, I haven't. But as you can read above, the driver itself manages well beyond 1GByte/s. The limitation is in the filesystem.



Re: NVMe device driver


@tonyw

OK, I see the email now. Somehow we overlook each other's emails. I sent you an email some time ago about the transfer limit but never got a reply.

I will send you my latest beta driver this weekend. It should not crash anymore. I will also send you a probe tool which will probe all NVMe drives in your system.



Re: NVMe device driver


@tonyw

The current release on OS4Depot will crash when no NVMe drive is found.

I will upload a new version soon which solves this issue.

On top of that there might be quality issues with the X5000. I get IO timeouts when I use the X1 slot in front of the Xena slot. The other slots work fine for me.



Re: Radeon RX cards on X1000


@sailor

Yes, after all these years we somehow still love our Amigas. But that doesn't mean that she has to look like a grandma with a hip replacement.



Re: Radeon RX cards on X1000


@Hans

You're not the only one with a non-popular opinion

We all know the old joke about Windows: "32 bit extensions and a graphical shell for a 16 bit patch to an 8 bit operating system originally coded for a 4 bit microprocessor, written by a 2 bit company, that can't stand 1 bit of competition."

But at least Windows has seen enough rewrites and API changes to become the 21st-century OS it is today.

In Amiga land, everything has to remain the same as it was 30 years ago, while we expect seamless integration of 21st-century technology. (The "only Amiga makes it possible" mantra.)

Hence all the stability issues:
- No memory protection
- Old code needs to run in the AmigaOS4 environment without sandboxing (because "not the Amiga way" or something)
- 2 GB limitation, because the Amiga way is to use the upper bit for something else (2GB ought to be enough for everybody :-p )
- Forbid()/Permit() debacle, which seriously slows down multicore development
- No development leadership -> no documentation on how everything should be done "properly" -> everybody hacks their own way because we are supposed to be smart and creative.
- And so on

Every serious company does a reality check once in a while and evaluates whether it's time to let go of the past and take a step into the future.

Our community is more concerned about who owns what, and who is entitled to develop for AmigaOS4.

The AmigaOS4 way is "A 21st century hack and patch job of a PowerPC port of a C code reimplementation of a 32bit 80's OS originally coded in assembly for a 16bit processor which is developed with 2 bits years between updates by 1 bit companies who are more pre-occupied with lawsuits over who owns the corpse than actually paying developers to get the work done."



Re: M.A.C.E. Tower Defense


I've just purchased this game from itch. The game starts but I am stuck at the "choose your battlefield" window.

There's no start button or anything similar, and nothing happens when double-clicking on "1".

Any ideas what could be wrong?

Edit: Changed screenmode to 1920x1080 and now the play button in the right corner is visible. So it's a matter of screenmode.


Edited by geennaam on 2023/4/23 10:48:26


Re: NVMe device driver


@joerg

Quote:
If the limit is at most 2 MB can't you split the reads/writes in the AmigaOS device part and send several commands at once to the NVMe IO handler? For example if you get a 16 MB read or write send 8 2 MB reads/writes to the NVMe IO handler queue and wait until all 8 are completed.

Of course. That is how my driver works. But when the queue is full you'll have to split the transfer into multiple passes, and this is additional overhead. SFS will start preparing the next transfer only after the current one has finished. So more time spent in the handler means more time for each transfer.
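
To illustrate the splitting, here is a rough Python model, not the driver's actual code. The function name, the 2 MB per-command limit and the queue depths are assumptions chosen only to match the 16 MB example in the quote.

```python
def split_into_passes(total_bytes, max_cmd_bytes, queue_depth):
    """Split one filesystem request into NVMe-sized commands, grouped into passes.

    Each pass holds at most `queue_depth` commands. In this model a new pass
    can only be submitted once the previous pass has completed.
    """
    if total_bytes < 0 or max_cmd_bytes <= 0 or queue_depth <= 0:
        raise ValueError("sizes and queue depth must be positive")
    commands = []
    offset = 0
    while offset < total_bytes:
        length = min(max_cmd_bytes, total_bytes - offset)
        commands.append((offset, length))  # (byte offset, byte count)
        offset += length
    return [commands[i:i + queue_depth]
            for i in range(0, len(commands), queue_depth)]


MIB = 1024 * 1024
# Hypothetical numbers: a 16 MB request, 2 MB per command.
one_pass = split_into_passes(16 * MIB, 2 * MIB, queue_depth=8)
two_passes = split_into_passes(16 * MIB, 2 * MIB, queue_depth=4)
print(len(one_pass), len(two_passes))  # 1 2
```

With room for all 8 commands the request goes out in one pass. With a shallower (or already partly filled) queue the same request needs a second pass, which is the extra overhead I mean.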



Re: NVMe device driver


@Radov

I'm afraid that it's not a ragemem issue. My own RAM benchmark tool shows similar results to ragemem. Linux memory benchmark tools also show similar results for each core, and two threads show double the performance on the P5020. So apparently it is a CPU limitation. It looks like a cache miss is very expensive. It's also possible that 64-bit mode would result in higher performance. There are specific full-cacheline control instructions which might improve performance, but these are not supported by our compilers.

But the NVMe benchmark clearly shows that neither RAM nor PCIe itself is the bottleneck.



Re: NVMe device driver


@LiveForIt

So it's a good thing that I have no knowledge of Linux or C++

I think you mean the Catweasel MK4? I don't own one, but it looks like it's offered as a legacy PCI device only. And your driver has to upload the FPGA loadfile.

Loading firmware for a modern PCIe card is pretty easy these days. Just create a DMA buffer and dump the firmware in it. Point the PCIe device to the DMA buffer and ring the doorbell. The PCIe card takes care of the upload.

This is done for the Sound Core3D based Sound Blaster cards to program the onboard DSP.
But also for firmware upgrades of NVMe cards. Unfortunately those NVMe manufacturers offer update utilities instead of raw update images. Otherwise, I would have created a firmware update tool for AmigaOS4.



Re: NVMe device driver


Thanks for the kind words.

However, I consider writing a device driver to be a much easier task than, for example, porting a game. But maybe that's just judged from my perspective.



Re: NVMe device driver


@joerg

That makes two of us.

Or actually, I was never involved.

So I have to create a driver based on bits and pieces of information where available. But most of it is just trial and error to see what works.

Creating the initial NVMe backend was just two weekends' worth of coding. Just follow the NVMe standard and everything is fine.
The remaining two months were spent figuring out how AmigaOS4 device drivers/kmods are supposed to work.



Re: NVMe device driver


@joerg

This still doesn't make sense to me.

According to https://wiki.amigaos.net/wiki/Anatomy_of_a_SATA_Device_Driver, all I have to do is make sure that my nvme.device kmod has a lower priority than mounter.library to ensure that mounter.library is available before I call IMounter->AnnounceDeviceTags().

The way I understand it, the mounter.library call is all that's needed to mount the partitions on the NVMe SSD. And this is exactly what happens.

So why would anything else be needed to continue booting from NVMe once the kernel and kmods are loaded and executed from the SATA SSD?


Edited by geennaam on 2023/4/21 12:14:42


Re: NVMe device driver


@joerg

Quote:
No idea where this 16 MB limit comes from. Are you sure it's not just a limit of C:Copy?


Yes, it's a combination of the SFS/02 blocksize and the default c:copy BUF size.

SFS/02 with a 512-byte blocksize results in default transfer sizes of 2MB-512. An 8 times larger blocksize (4096) results in 8 times larger default c:copy transfer sizes (16MB-4096). So the c:copy default is 4095*blocksize.

Using e.g. c:copy BUF=65536 increases the transfer size from 2MB-512 to exactly 65536*512 bytes (32MiB, about 33.5MB). But the time between commands scales ~linearly with the transfer size. So the theoretical SFS/02 performance limit remains 425MB/s.
That is what I meant by equal relative overhead, independent of transfer size.
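
The numbers above can be checked with a few lines of Python. This is only a sketch of my reading of the behaviour: it assumes the transfer size is simply BUF times the blocksize and that the default BUF is 4095. The function name is made up for the example.

```python
def copy_transfer_size(blocksize, buf=4095):
    """Bytes per transfer, assuming transfer size = BUF * blocksize.

    The default of 4095 is inferred from the observed sizes,
    not taken from any c:copy documentation.
    """
    return buf * blocksize


print(copy_transfer_size(512))          # 2096640  = 2 MiB - 512
print(copy_transfer_size(4096))         # 16773120 = 16 MiB - 4096
print(copy_transfer_size(512, 65536))   # 33554432 = 32 MiB
```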



Re: NVMe device driver


@joerg

Quote:
Unless nvme.device includes own RDB parsing and partition mounting support, like the X1000, X5000 and Sam460 SATA drivers seem to do, you have to add nvme.device to diskboot.config 


I have no idea why parsing the RDB in a device driver should be necessary. Sounds like a hack.

Anyways, after successfully initializing the NVMe drive, my driver announces the nvme.device to the mounter library with IMounter->AnnounceDeviceTags(). This mounts the partitions.



Re: NVMe device driver


It's been almost a month since the first release so here's a little progress update:

Changes:
- Solved a bug which halted the boot procedure when no NVMe drive is connected
- Host Memory Buffer implemented for SSDs without embedded DDR cache. This means that up to 64MByte of DDR memory is used as a buffer for those DDR-less drives.
- Small speed optimizations of the code itself
- Dropped the interrupt handler in preparation for multiple NVMe units -> MSI and MSI-X are not supported by the latest publicly available kernel. The simulated pin interrupt is always number 57 for the drives that I own. This would have meant sharing one interrupt between multiple drives, which would impact speed.
- Complete rewrite of the NVMe IO handler -> Command Queue depth (CQ) increased from 2 to 64.

Especially the last bullet should increase the speed a lot. Or so I thought... Unfortunately, speed in the X4 slot is still stuck at about 345MB/s on average (averaged over multiple 1GB writes).

Finding the bottleneck:
The architecture of my driver is pretty basic. BeginIO() will differentiate between direct and indirect commands. Direct commands are handled immediately and the indirect commands (like read and write) are forwarded with a List structure (FIFO) to a separate "unit" task. The unit task calls the NVMe write handler in case of a write function. So basically the OS4 virtual textbook implementation.
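
For readers who don't know this pattern, here is a toy Python model of the dispatch, only to show the shape of it. It is not AmigaOS code: the command names, the dictionary-based requests and the thread standing in for the unit task are all made up for the illustration.

```python
import queue
import threading

# Hypothetical command name, for illustration only.
DIRECT_COMMANDS = {"GET_GEOMETRY"}


class Unit:
    """Toy model of one device unit with its own worker task."""

    def __init__(self):
        self.fifo = queue.Queue()  # stands in for the FIFO List structure
        self.task = threading.Thread(target=self._unit_task, daemon=True)
        self.task.start()

    def begin_io(self, request):
        """Handle direct commands immediately; queue everything else."""
        if request["command"] in DIRECT_COMMANDS:
            self._reply(request, handled_by="begin_io")
        else:
            self.fifo.put(request)

    def _unit_task(self):
        """Process queued requests one at a time, in arrival order."""
        while True:
            request = self.fifo.get()
            if request is None:  # shutdown marker
                break
            # A real driver would call its NVMe read/write handler here.
            self._reply(request, handled_by="unit_task")

    @staticmethod
    def _reply(request, handled_by):
        request["handled_by"] = handled_by
        request["done"].set()  # stands in for replying the IO message


def make_request(command):
    return {"command": command, "done": threading.Event()}


unit = Unit()
direct = make_request("GET_GEOMETRY")
write = make_request("WRITE")
unit.begin_io(direct)
unit.begin_io(write)
write["done"].wait(timeout=5)
print(direct["handled_by"], write["handled_by"])  # begin_io unit_task
unit.fifo.put(None)  # stop the worker
```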

In search of speed, I decided to profile my driver. I used an SFS/02 partition on my Samsung 970 EVO with a "blocksize" of 4096 and a buffer size of 2048.
The benefit of a blocksize of 4096 instead of 512 is that the transfer size limit for a copy is increased to ~16MB instead of ~2MB for a blocksize of 512 bytes. Of course this can be controlled with the "BUF" option for "copy", but I've noticed that it is beneficial for drag and drop copies as well. (It would be nice if there were a global OS4 variable to control this.)

When I copy a 1GB file from RAM: to the SFS/02 partition, the file is divided into ~16MB transfers.
Execution time of the write handler for each ~16MB transfer is about 11ms. When I ignore the relatively small overhead of the handler itself, this means that the ~16MB of data is transferred at ~1450MB/s from RAM to SSD.
This was a bit surprising to me, because until now I had read on several forums that the X5000 has slow DRAM and PCIe performance. But this is clearly not the case.
Another fun fact is that my Samsung drive is somehow using cache-coherency magic. It knows when the source data hasn't changed, because when I repeat the same copy action, the ~16MB transfers suddenly complete in ~7ms each. This would mean ~2300MB/s, which is faster than the theoretical maximum of 2000MB/s of my PCIe 2.0 x4 slot. My other two drives (WD and Solidigm) don't do this trick.
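
The speed figures follow directly from the measured handler times. A small Python check, assuming the ~16MB transfer is 4095*4096 bytes and that MB here is counted in units of 1024*1024 bytes (both are assumptions on my side, and the function name is made up):

```python
MIB = 1024 * 1024


def throughput_mib_per_s(transfer_bytes, seconds):
    """Throughput of one transfer in MiB/s."""
    return transfer_bytes / seconds / MIB


transfer = 4095 * 4096  # ~16 MB per transfer with a 4096-byte blocksize
first_copy = throughput_mib_per_s(transfer, 0.011)   # ~11 ms per transfer
repeat_copy = throughput_mib_per_s(transfer, 0.007)  # ~7 ms on a repeated copy
print(round(first_copy), round(repeat_copy))  # 1454 2285
```

Both values round to the ~1450 and ~2300 figures mentioned above.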

Anyways, back to profiling.
Next up was timing the complete write command, including BeginIO(), the NVMe write handler and replying the IO message to the caller (the filesystem).
When I subtract the time that the NVMe write handler needs, this is basically a benchmark of the message port system and scheduler. But to my big surprise, this overhead is only a couple of microseconds. So my driver alone can write at about 1450MB/s. This means the bottleneck is somewhere else in the system.

So finally I measured the time between consecutive IO commands which are sent by the filesystem to my driver. It turns out that this time between IO commands is huge and scales linearly with the transfer size. So the overhead percentage is more or less equal for each transfer size. As a result, the theoretical maximum transfer speed with SFS/02 is limited to ~425MB/s (when you calculate with zero overhead for processing and executing the actual IO command).
Playing with the BUF=xxxxx option for the copy command will increase and decrease the transfer size. But as stated above, the relatively constant IO command interval overhead means that this always results in the same ~425MB/s limitation. Reformatting the partition with the recommended blocksize of 512 bytes made no difference.

If my driver were able to transfer at the X5000 PCIe x4 bus limit and without overhead, then the maximum speed (without the Samsung cache trick) with SFS/02 would be ~352MB/s. Currently I can transfer at ~330MB/s (without the Samsung cache trick).
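
One way to reproduce these figures is a simple serial model: every megabyte costs the filesystem's gap time (1/425 s) plus the driver's transfer time. This is only a sketch of that reasoning with a made-up function name, and it lands within a few MB/s of the numbers above rather than exactly on them.

```python
def effective_speed(fs_limit, driver_speed):
    """Combined MB/s when filesystem gap time and driver time add up serially.

    fs_limit:     speed with a zero-time driver (the ~425 MB/s SFS/02 ceiling)
    driver_speed: raw speed of the driver or bus on its own, in MB/s
    """
    return 1.0 / (1.0 / fs_limit + 1.0 / driver_speed)


print(round(effective_speed(425, 2000)))  # 351, driver at the PCIe 2.0 x4 limit
print(round(effective_speed(425, 1450)))  # 329, driver at the measured ~1450 MB/s
```

The model also shows why a faster driver hardly helps: even an infinitely fast driver only approaches the 425MB/s ceiling.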

Summarizing the profiling effort:
- My driver is already performing at the limit for SFS/02
- Pipelining my driver (separate tasks for NVMe command submission and completion) makes no sense because IO commands are sent one at a time and only after the previous one has finished.

I've been told that the NGFilesystem is faster. As with SFS, the IO commands are sent one at a time, after the previous one has completed. But the transfer size is limited to just 128kB. So I cannot determine if this filesystem suffers from the same huge delay between commands for larger transfers. And because of the small transfer size, the NVMe read/write setup overhead becomes dominant and is now the limiting factor.

Other observations:
NVMe SSDs are designed with multithreading and pipelining in mind. The xxxxMB/s figures that you can read on the box of your NVMe drive are only reached when the NVMe command queue is kept filled with new commands. Unfortunately, AmigaOS filesystems are not multithreaded. This means that the NVMe command queue is empty most of the time. Therefore it doesn’t really matter what’s written on the box; most NVMe drives will perform more or less equally. The Samsung cache trick is a nice bonus but only beneficial in my test case, and probably not so much in real use cases.
And last but not least: don’t buy a Solidigm P44 Pro NVMe SSD. (I did because this drive gets very good reviews.) This PCIe 4.0 drive fails to negotiate a PCIe 2.0 x4 link. Instead it falls back to PCIe 1.0 x4 (1000MB/s). At first I thought that this was simply a reporting issue of this drive. But benchmarks showed that it is indeed operating at just 1.0. On top of that, the maximum transfer size of the drive for each NVMe IO command is just 256kBytes (compared to 2MB for the Samsung 970). While this is not an issue for NGFilesystem with its 128kB limit, it means more overhead for SFS/02.


Edited by geennaam on 2023/4/20 21:39:33


Re: Can anyone compile SimpleMail?


@ktadd

Do you get an SSL error with SimpleMail?

According to that GitHub page, SimpleMail can be compiled statically or non-statically with OpenSSL.

In case you have the non-statically linked one and the OpenSSL library interface remains the same, a new OpenSSL library is enough.

If your provider uses a newer encryption protocol which SimpleMail needs to be aware of, then a simple recompile will not help you anyway.



Re: NVMe device driver


@TSK

Quote:
I'm wondering if NVMe cards exists in 1x form factor ? Having the both long ones occupied there's no chance for dual gfx card setup.


Plenty. I have one myself. Amazon has lots of them. Just search for "PCIe x1 NVMe adapter".
But beware that the X1000 only has PCIe 1.0. So the x1 slot will be limited to just 250MByte/s.
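
For reference, the slot limits follow from the PCIe line rate and the 8b/10b encoding used by PCIe 1.0 and 2.0. A small Python sketch (the function and table names are made up; the result is the theoretical maximum per direction, and real-world throughput is lower because of protocol overhead):

```python
# Per-lane raw rate in GT/s. PCIe 1.0 and 2.0 both use 8b/10b encoding.
GT_PER_S = {1: 2.5, 2: 5.0}


def pcie_bandwidth_mb_per_s(generation, lanes):
    """Theoretical bandwidth in MB/s for a PCIe 1.0 or 2.0 link."""
    payload_bits_per_s = GT_PER_S[generation] * 1e9 * 8 / 10  # after 8b/10b
    return payload_bits_per_s / 8 / 1e6 * lanes


print(round(pcie_bandwidth_mb_per_s(1, 1)))  # 250  (PCIe 1.0 x1)
print(round(pcie_bandwidth_mb_per_s(1, 4)))  # 1000 (PCIe 1.0 x4)
print(round(pcie_bandwidth_mb_per_s(2, 4)))  # 2000 (PCIe 2.0 x4)
```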



Re: NVMe device driver


@TearsOfMe

Yes, Snoopy works. Just don't run the driver without an NVMe drive in your system at the moment.

Doesn't the solution from Sailor solve the issue with the GFX driver for you?
