Forums - All Posts - The Amigans website

Board index » All Posts (sailor)

sailor

Re: A1222 support in the SDK and problems

Posted on: Yesterday 16:20 #1

Not too shy to talk

@all
I have another question, about alignment.
This code is compiled with -mspe -mcpu=8540 -mfloat-gprs=double -mabi=spe (i.e. for SPE):


double X __attribute__ ((aligned (64)));

...

main(){

X=0.499975;

When compiled without optimization, it always triggers a guru with "error of alignment type".
When I compile it with -O1, it runs normally.

I tried with and without __attribute__, with static, and with X=(double)0.499975; ... always with the same result.

Which of the optimization parameters could be fixing the alignment?
Or is there another way to align it?
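
One way to narrow this down is to check where the variable really ends up in memory. The following is only a minimal sketch, assuming GCC's aligned attribute; the helpers misalignment() and report_alignment() are hypothetical names. It prints only a pointer and an integer, so no double is passed to printf:

```c
#include <stdint.h>
#include <stdio.h>

/* Request 8-byte alignment, the natural alignment of a double. */
double X __attribute__((aligned(8)));

/* Hypothetical helper: offset of p from the previous `boundary`-byte
   boundary. A result of 0 means the address is aligned. */
unsigned misalignment(const void *p, unsigned boundary)
{
    return (unsigned)((uintptr_t)p % boundary);
}

/* Prints the address of X and its offset from an 8-byte boundary. */
void report_alignment(void)
{
    printf("X at %p, offset from 8-byte boundary: %u\n",
           (void *)&X, misalignment(&X, 8));
}
```

Calling report_alignment() at the start of main() in both the -O0 and the -O1 build shows whether X itself is misaligned. If the offset is 0 in both builds, the faulting access is probably somewhere else (for example a temporary on the stack), but that is an assumption and not something tested here.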

AmigaOS3: Amiga 1200
AmigaOS4: Micro A1-C, AmigaOne XE, Pegasos II, Sam440ep, Sam440ep-flex, AmigaOne X1000
MorphOS: Efika 5200b, Pegasos I, Pegasos II, Powerbook, Mac Mini, iMac, Powermac Quad

Topic | Forum

sailor

Re: Ragemem source code

Posted on: 4/30 5:56 #2

Not too shy to talk

@K-L
Thank you.
It is a pity, so I will have to write some benchmark myself.

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/28 13:32 #3

Not too shy to talk

A1222+ results:

native SPE FPU:


8.System:> Work:Benchmark/stream-5.10-AOS/ 

8.Work:Benchmark/stream-5.10-AOS> stream_spe 

-------------------------------------------------------------

STREAM version $Revision: 5.10 $

-------------------------------------------------------------

This system uses 8 bytes per array element.

-------------------------------------------------------------

Array size = 10000000 (elements), Offset = 0 (elements)

Memory per array = 76.3 MiB (= 0.1 GiB).

Total memory required = 228.9 MiB (= 0.2 GiB).

Each kernel will be executed 10 times.

 The *best* time for each kernel (excluding the first iteration)

 will be used to compute the reported bandwidth.

-------------------------------------------------------------

Your clock granularity/precision appears to be 2 microseconds.

Each test below will take on the order of 293711 microseconds.

   (= 146855 clock ticks)

Increase the size of the arrays if this shows that

you are not getting at least 20 clock ticks per test.

-------------------------------------------------------------

WARNING -- The above is only a rough guideline.

For best results, please be sure you know the

precision of your system timer.

-------------------------------------------------------------

Function    Best Rate MB/s  Avg time     Min time     Max time

Copy:             787.1     0.204503     0.203269     0.208423

Scale:            492.9     0.326322     0.324588     0.329637

Add:              568.0     0.424966     0.422508     0.427871

Triad:            541.6     0.445014     0.443115     0.449225

-------------------------------------------------------------

Solution Validates: avg error less than 1.000000e-13 on all three arrays

Standard PowerPC FPU code with the LTE emulator:


8.Work:Benchmark/stream-5.10-AOS> stream

-------------------------------------------------------------

STREAM version $Revision: 5.10 $

-------------------------------------------------------------

This system uses 8 bytes per array element.

-------------------------------------------------------------

Array size = 10000000 (elements), Offset = 0 (elements)

Memory per array = 76.3 MiB (= 0.1 GiB).

Total memory required = 228.9 MiB (= 0.2 GiB).

Each kernel will be executed 10 times.

 The *best* time for each kernel (excluding the first iteration)

 will be used to compute the reported bandwidth.

-------------------------------------------------------------

Your clock granularity/precision appears to be 2 microseconds.

Each test below will take on the order of 1032608 microseconds.

   (= 516304 clock ticks)

Increase the size of the arrays if this shows that

you are not getting at least 20 clock ticks per test.

-------------------------------------------------------------

WARNING -- The above is only a rough guideline.

For best results, please be sure you know the

precision of your system timer.

-------------------------------------------------------------

Function    Best Rate MB/s  Avg time     Min time     Max time

Copy:             788.5     0.204721     0.202919     0.208100

Scale:            148.0     1.081844     1.080804     1.085773

Add:              154.7     1.554502     1.551267     1.557342

Triad:            148.2     1.622773     1.619742     1.626540

-------------------------------------------------------------

Solution Validates: avg error less than 1.000000e-13 on all three arrays

LTE FPU emulation is very fast: in Scale, Add and Triad it reaches more than 25% of the speed of native SPE FPU code.
Unfortunately, the majority of 3D games do not work with LTE, so the interpretive emulator must be used, and it is very slow.
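
For reference, the rates in both tables can be recomputed from the "Min time" column. This is a small sketch with hypothetical helper names, assuming the usual STREAM accounting: Copy and Scale move 2 arrays per element, Add and Triad move 3, and 1 MB is 10^6 bytes.

```c
#include <stddef.h>

/* Bandwidth in MB/s for one STREAM kernel.
   arrays: 2 for Copy and Scale, 3 for Add and Triad. */
double stream_rate_mb_s(unsigned arrays, size_t elements,
                        size_t bytes_per_element, double min_time_s)
{
    double bytes = (double)arrays * (double)elements * (double)bytes_per_element;
    return 1.0e-6 * bytes / min_time_s;
}

/* How large `value` is relative to `reference`, in percent. */
double percent_of(double value, double reference)
{
    return 100.0 * value / reference;
}
```

With the numbers above, SPE Copy is 2 x 10000000 x 8 bytes / 0.203269 s, which gives about 787.1 MB/s, and the LTE Scale rate is 148.0 / 492.9, i.e. about 30% of the SPE rate.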

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/28 8:31 #4

Not too shy to talk

My first SPE-modified application stream is finished!

I need some apps for benchmarking the A1222+, and since nearly none exist, I have to do it myself.
It is on OS4Depot now. It is only one small, easy piece, but it is also my first C code after 20+ years, so I am happy...

Topic | Forum

sailor

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/27 20:01 #5

Not too shy to talk

@kas1e
Quote:

kas1e wrote:
@All

Is it the correct datasheet for the Pegasos2's Marvell Discovery 2 MV64351 northbridge?:

https://www.freecalypso.org/pub/PowerP ... eets/DS_64360_1_2.pdf.zip

Sounds like this, just need to be sure.

Yes! The Pegasos 2 has the MV64361 version:
[image]

Topic | Forum

sailor

Re: SDL2_image on A1222

Posted on: 4/27 19:49 #6

Not too shy to talk

@walkero
Quote:

walkero wrote:
SDK has multiple different versions which you can select with its installer. There is GCC 8, 10, 11 and 6 for SPE compatibility (A1222).

I am using GCC 6 for SPE programming.
Please, is there any reason to use GCC 8 or 10 for standard PowerPC code?
Do they have some advantage or feature over GCC 11?

Topic | Forum

sailor

Re: Amiga X5000 and Sound Blaster Audigy FX problem

Posted on: 4/26 11:54 #7

Not too shy to talk

@geennaam
In any case, it would be a pity if you stopped your work with AmigaOS.
(I don't mean to say that spending more time with your wife and children is not good.)

I was always most looking forward to the third driver from your avatar...

Topic | Forum

sailor

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/22 18:20 #8

Not too shy to talk

@kas1e
My xorg.conf files are on the Hyperion site.

So if lspci shows 0000:01:00.0,
the Device section in xorg.conf should probably contain:

BusID "PCI:1@0:0:0"
or
BusID "PCI:1:0:0"


Section "Device"

     Option     "Accel"    "yes"             # [<bool>]

     Identifier  "Radeon X1950 XT"

     Driver      "radeon"

     BusID       "PCI:1@0:0:0"

EndSection
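
The mapping from the lspci address to the BusID string can also be done mechanically. A minimal sketch with a hypothetical function name, assuming the lspci form domain:bus:device.function with hexadecimal fields and the xorg.conf form "PCI:bus@domain:device:function" with decimal fields:

```c
#include <stdio.h>

/* Converts an lspci address such as "0000:01:00.0" (hexadecimal fields)
   into an xorg.conf BusID such as "PCI:1@0:0:0" (decimal fields).
   Returns 0 on success, -1 on a parse error or a too-small buffer. */
int lspci_to_busid(const char *lspci_addr, char *out, size_t out_size)
{
    unsigned domain, bus, device, function;
    int n;

    if (sscanf(lspci_addr, "%x:%x:%x.%x",
               &domain, &bus, &device, &function) != 4)
        return -1;

    n = snprintf(out, out_size, "PCI:%u@%u:%u:%u",
                 bus, domain, device, function);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The hexadecimal to decimal step matters as soon as the bus number is above 9: an lspci bus of 0a becomes 10 in the BusID.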

Topic | Forum

sailor

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/22 7:42 #9

Not too shy to talk

@kas1e

I just answered on the Hyperion website (look here for example).
Try to set up xorg.conf with the right BusID.

When I am at home, I will send example configs.

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 15:47 #10

Not too shy to talk

@joerg
Quote:

- Build everything which doesn't use (much) float/double code with -msoft-float and use the soft-float C library.
- Put code which uses float/double calculations in separate sources compiled with -mabi=spe -mfloat-gprs=double instead.

And what if I need to use math library functions (sin, cos, ...)? Do you know which is faster: calling them the newlib + standard PowerPC way, i.e. through the LTE emulator, or using clib2 + integer emulation from here?
Of course, I can measure it; I am asking just in case.

Quote:

- Make sure SPE functions called from soft-float code, and the other way round, are compatible, for example by only using pointers to float/double instead of direct float/double parameters. May not even be required if they are compatible anyway, as salass00 wrote.

At least printf, fprintf and sin() are not compatible (newlib.library 53.84): calling them from SPE code returns nonsense. These I tested.

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 15:25 #11

Not too shy to talk

@joerg
@salass00

Thank you for the explanation.

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 12:10 #12

Not too shy to talk

@flash

It is not so simple.

From the point of view of the SPE embedded FPU it is HARD float, because it uses SPE-native instructions and registers (these are not FPRs, but 64-bit GPRs).
From the point of view of PowerPC code it is SOFT float, because it uses GPR registers and no PowerPC floating-point instructions.

From the point of view of gcc, SPE code is SOFT float:


gcc -mcpu=8540 -mspe -mabi=spe -mfloat-gprs=double -c Stream2_mh.c -o Stream2_mh.o

gcc -mcpu=powerpc -c spe_float_transition.c -o spe_float_transition.o

gcc Stream2_mh.o spe_float_transition.o -o Stream2_mh

ld: Warning: spe_float_transition.o uses hard float, Stream2_mh uses soft float

Stream2_mh.c is the benchmark compiled for SPE; spe_float_transition.c contains the functions that call routines expecting PowerPC float parameters, like printf.
gcc treats SPE code as soft-float.

If I remember correctly, compiling with an additional -mhard-float:


gcc -mcpu=8540 -mspe -mabi=spe -mfloat-gprs=double -mhard-float -c Stream2_mh.c -o Stream2_mh.o

generates some error, but I can check it again.

And the GCC 6.4.0 online docs say:
Quote:

-msoft-float
-mhard-float
Generate code that does not use (uses) the floating-point register set. Software floating-point emulation is provided if you use the -msoft-float option, and pass the option to GCC when linking.

And since the embedded SPE FPU has no floating-point register set, it is recognized as soft float.

It is only my explanation, and of course I can be wrong. I started to play with this only after the A1222+ arrived, so there is still a lot to study.
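
The compiler's own view can be checked from inside a source file through its predefined macros. This is only a sketch: the macro names __SPE__, __NO_FPRS__ and _SOFT_FLOAT are assumptions about what GCC's PowerPC target defines and should be verified with "gcc <options> -dM -E - < NIL:" (or /dev/null on other systems). The function name float_model() is hypothetical.

```c
/* Returns a description of the floating-point model this file was
   compiled for. The macro names are assumptions about GCC's PowerPC
   target; on any other target the last branch is taken. */
const char *float_model(void)
{
#if defined(__SPE__) && defined(__NO_FPRS__)
    return "SPE: floating point in 64-bit GPRs, no FPRs";
#elif defined(_SOFT_FLOAT)
    return "soft float: integer emulation, no FPRs";
#else
    return "classic FPU with FPRs (or a non-PowerPC target)";
#endif
}
```

Printing the result from both Stream2_mh.c and spe_float_transition.c would show whether the two objects really disagree in the way the linker warning says.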

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 8:50 #13

Not too shy to talk

@joerg
Quote:

joerg wrote:
You have to use gcc -msoft-float ...
But if you do not only your own code but all libraries, incl. the C library, you are using have to be compiled with -msoft-float as well. You can't mix FPU with soft-float code, at least not without using similar workarounds you are using for SPE now.

There is probably no soft-float version of newlib. There used to be soft-float versions of clib2, but in case it's no longer available rebuilding clib2 or clib4 with -msoft-float should be no big problem.
With a soft-float C library and building your own code with -msoft-float you don't need workarounds for functions like printf() either, but of course a SPE C library for the A1222 would be much better than a soft-float one which uses integer instructions and registers for float/double.

Thank you for the detailed info.
And please, how do I use the soft-float C library?
Is it something like: "gcc -mcrt=clib2 -msoft-float .... -lm"?
Or some other way?

The SPE code (-mcpu=8540 -mabi=spe) is always soft-float, regardless of the C library used.
And how are floating-point parameters passed when I use "-mcpu=powerpc -msoft-float"? Via GPR registers? They are 32-bit in the PowerPC ABI. Or via the stack? Or float via GPR and double via the stack?
I read wiki.amigaos.net, but there is not much about SPE.

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/18 8:15 #14

Not too shy to talk

@flash
Quote:

flash wrote:
@sailor
As workaround you need to pass float parameters by reference and not by value.
Another solution is to pass them using heap space and not stack.
To do this you can use an array of floats and pass it's base address to function.
Another solution is to pass a struct with floats vars as members.

Anyway also printf function is bugged for A1222 and need to be fixed for floats.

I am using the first workaround, and I wrote my own alternative to printf for printing floats from SPE code.
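
A minimal sketch of what such a replacement can look like. It is not the actual code, only an illustration under assumptions: the name format_double6() is hypothetical, the value is passed by pointer (the first workaround), and only a string and two unsigned longs go through the varargs of snprintf(), so no float or double crosses the call boundary by value.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes *value into buf with six decimals.
   Assumes the absolute value fits into an unsigned long. */
int format_double6(char *buf, size_t size, const double *value)
{
    double x = *value;
    const char *sign = "";
    unsigned long whole, frac;

    if (x < 0.0) {
        sign = "-";
        x = -x;
    }
    whole = (unsigned long)x;                        /* integer part */
    frac = (unsigned long)((x - (double)whole) * 1000000.0 + 0.5);
    if (frac >= 1000000UL) {                         /* rounding carried over */
        whole += 1;
        frac -= 1000000UL;
    }
    return snprintf(buf, size, "%s%lu.%06lu", sign, whole, frac);
}
```

The double to integer conversions are done inside the SPE-compiled code itself, so the C library only ever sees integers and strings.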

@Hans
Quote:

Hans wrote:
@sailor
How could I make it more accessible to "blones, who did C-coding 25 years before?"

I am sorry Hans, it was a joke. It only means that reading your article was not enough for me, and I had to make some examples and experiments before I understood it completely. Nothing else is needed. Thanks.

Topic | Forum

sailor

Re: A1222 support in the SDK and problems

Posted on: 4/17 20:14 #15

Not too shy to talk

I have another question for SPE code:

If I need the math library and use #include <math.h> and "-lm", it links the standard PowerPC math code, i.e. it cannot be called directly from SPE code.
Now I am using the same workaround as for printf (above): the SPE code calls a transition function, passing the values by pointer; the transition function is PowerPC code and calls the math function normally.

But does the AmigaOS4 SDK offer some workaround?
I tried to use -lsoft-fp, but it does not exist: "ld: cannot find -lsoft-fp".

And please, what is "SDK:clib2/lib/soft-float/libm.a" for? Is it some preparation for soft-fp calls or something else? And how do I use it?

Thanks for any tips and clues.

Topic | Forum

sailor

Re: Attempting to upgrade Sam 440 with an R7 240 or HD 7770

Posted on: 4/16 8:07 #16

Not too shy to talk

@Hypex

You do not need a powered PCI-PCIe adapter; geennaam's scheme works for me.

A working and proven connection is:

1. Sam440ep-flex PCI 66 MHz
2. PCI-PCIe bridge (second photo in paragraph 3.1; this is the one with the P17C9X chip). You need no additional power here; power is drawn from the PCI slot.
3. PCIe x1 -> x16 powered riser (the common name on eBay is "Powered USB3.0 GPU Riser Extender" or "mining rig adapter"); see the third photo in the same paragraph. You have to connect this adapter to power.
4. PCIe x16 graphics card

This should work for all PCIe cards up to 75 W. More powerful cards need additional power (6-pin or 8-pin PCIe power plug). The above-mentioned HD 7770 (80 W) probably needs one 6-pin plug.
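
The power rule above as a small calculation. It is a sketch with a hypothetical function name, assuming the usual PCIe limits of 75 W from the x16 slot and 75 W per 6-pin plug:

```c
/* Number of 6-pin PCIe power plugs a card needs on top of the slot.
   Assumes 75 W from the slot and 75 W per 6-pin plug. */
unsigned six_pin_plugs_needed(unsigned card_watts)
{
    const unsigned slot_watts = 75;
    const unsigned plug_watts = 75;

    if (card_watts <= slot_watts)
        return 0;
    /* Round up: any remainder needs one more plug. */
    return (card_watts - slot_watts + plug_watts - 1) / plug_watts;
}
```

For the 80 W HD 7770 this gives one plug, and for a card of 75 W or less it gives none.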

Topic | Forum

sailor

Re: Pegasos2 with RadeonHD/RX via bridge

Posted on: 4/16 7:43 #17

Not too shy to talk

@kas1e
Quote:

Can anybody help us with an idea of how to access properly the PCI configuration registers ? Or should be something done to make them accessible at all ? That what currently block Hans from progress in making working RadeonHD/RX cards via bridge on real pegasos2.

Unfortunately, the manual of the Pegasos SmartFirmware is very short.
The only usable information I found is:
"SmartFirmware is an implementation of the OpenFirmware IEEE standard 1275-1994 plus errata changes"
And here is the OpenFirmware manual.
Maybe we can use some config-x writes...

Or maybe FCode helps?

@kas1e
Quote:

kas1e wrote:
@All
Does anyone know what the best PPC Linux which still support pegasos2 up-to-date , so we can see if bridge and some older HD cards supported by Linux are works on at all (so we can see what they do). Is Debian 8.2 is the last/best one to play with?

Yes, the last distro supporting the Pegasos 2 is Debian 8. Or Ubuntu 16.04.

Topic | Forum

sailor

Re: Ragemem source code

Posted on: 4/15 19:37 #18

Not too shy to talk

@K-L @Kamelito
Thank you very much!

Topic | Forum

sailor

Re: Trying to get a Radeon HD 7750 working in an AmigaOne XE

Posted on: 4/15 14:14 #19

Not too shy to talk

@mr2
Quote:

mr2 wrote:Novabridge will not work because it need RadeonHD V5. And RadeonHD V5 is not available for the sam440.

Sailor said she wanted to try SAM460 RadeonHD v5 driver with SAM440. Both chips has the same core. I wonder what the result is....

Sam440ep + RadeonHD v5 (Sam460ex version) does not work.
To be exact: it boots to the system, but performance is not good and va.library does not work, so it is useless.

And regarding NovaBridge: I suppose that it works with RadeonHD v3.7. I will test that too and let you know.

Topic | Forum

sailor

Ragemem source code

Posted on: 4/14 15:40 #20

Not too shy to talk

Please, does anybody have a contact for Crisot (the uploader of ragemem on OS4Depot)? He has not been on Amiga forums for years.
Or does anybody have the ragemem code? If it is open source, of course.

I am interested in ragemem because I am now playing with the A1222+, and ragemem does some floating-point math, so I want to compile it for SPE.

I already compiled stream v5.10 both for SPE and for standard PowerPC, but it measures only memory + FPU calculation speed.
And ragemem measures L1 and L2 as well.

Or if you have a tip for another L1/L2 tool or algorithm...

thanks!
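
One common approach to the L1/L2 question is to read buffers of growing size and watch where the read rate drops. The sketch below is not ragemem's algorithm, only an illustration under assumptions: the function name is hypothetical, it times with the standard clock(), and it returns 0.0 when the run is too short for clock() to resolve.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Reads a buffer of `bytes` bytes `passes` times and returns the read
   rate in MB/s. Returns 0.0 if the buffer is too small, allocation
   fails, or the elapsed time is below the resolution of clock(). */
double read_rate_mb_s(size_t bytes, unsigned passes)
{
    size_t words = bytes / sizeof(unsigned long);
    unsigned long *buf;
    volatile unsigned long sink = 0;   /* keeps the loop from being removed */
    clock_t start, stop;
    double seconds;
    size_t i;
    unsigned p;

    if (words == 0)
        return 0.0;
    buf = malloc(words * sizeof *buf);
    if (buf == NULL)
        return 0.0;
    for (i = 0; i < words; i++)        /* touch every word once */
        buf[i] = (unsigned long)i;

    start = clock();
    for (p = 0; p < passes; p++) {
        unsigned long sum = 0;
        for (i = 0; i < words; i++)
            sum += buf[i];
        sink += sum;
    }
    stop = clock();
    free(buf);

    seconds = (double)(stop - start) / (double)CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;
    return 1.0e-6 * (double)words * (double)sizeof(unsigned long)
           * (double)passes / seconds;
}
```

Calling it for sizes from 4 KiB up to a few MiB, doubling each time and lowering the pass count as the size grows, should show steps in the rate near the L1 and L2 sizes. It uses only integer loads, so it measures the caches and not the FPU.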

Topic | Forum
