Board index » All Posts (Futaura)




Re: gcc 9 and 10
Just popping in


@MigthyMax

Where can I find the old binutils elf32-amigaos.c file which has that dodgy patch in it? I need to read the surrounding code to understand exactly what areas it will affect, so I can handle it correctly in elf.library.

IBrowse, AmiSSL and Warp Datatype Developer


Re: gcc 9 and 10


@kas1e

I think you are getting mixed up. There are two different situations - for executable programs, ctors/dtors are run by the clib2/newlib startup code - elf.library is not involved in that.

What Alfkil added in elf.library was for shared objects only - elf.library used to call __shlib_call_constructors and __shlib_call_destructors for shared objects, which in turn ran the ctors/dtors. Now elf.library doesn't call those functions and reads .ctors/.dtors itself and calls the functions directly. There is no startup code in a shared object as such (compared to an executable).

The .init_array/.fini_array handling in elf.library is again only for shared objects. For executables to support those sections, the startup code in clib2/newlib needs to be modified, as Joerg said.



Re: gcc 9 and 10


@MigthyMax

Thanks for that. It is just more difficult to understand because elfpipe's fix was not done correctly and was partially broken, causing seemingly random crashes.

I've adapted my fix to ignore R_PPC_REL24 relocs when st_value is 0, because it was originally ignoring R_PPC_PLTREL24 (for ldso version 1) when in fact those relocs do need to proceed as they originally did - this stopped Timberwolf from crashing. I'm still not sure elfpipe's fix is quite right as it leads to duplicated lookups, but I'm hoping to resolve that if I can rewrite and clean up the whole reloc handling routine.

Regarding binutils, you are correct that changing the code to zero st_value, as it should do, will be incompatible with earlier elf.library releases. It is a bit annoying that this "fix" was placed there, instead of fixing elf.library. The plan is to push an elf.library public update out once all reported issues have been resolved.



Re: gcc 9 and 10


@MigthyMax

Ok. The one thing I always have to check is that Timberwolf still starts up and unfortunately it is now crashing with the change I made regarding st_value. Need to do some more testing and checking as to why this is the case.



Re: gcc 9 and 10


@joerg

Just to be clear, I wasn't the one who made those changes. However, elf.library only calls constructors/destructors directly for dynamic libraries - not executables. It does this instead of using the old __shlib_call_constructors/destructors mechanism.



Re: gcc 9 and 10


@kas1e

From what you say, regarding the different variations, I'd have to agree that it is going to be the new executable that is the problem. The crash is PLT related - either the PLT not being updated or the jumps in the code not reaching the PLT.

The obvious differences - the old binutils test_dyn:

Relocation section '.rela.plt' at offset 0x468 contains 3 entries:
 Offset     Info    Type            Sym.Value  SymName Addend
01004048  00000315 R_PPC_JMP_SLOT    01004048   hello 0
01004050  00000415 R_PPC_JMP_SLOT    01004050   memcpy 0
01004058  00000e15 R_PPC_JMP_SLOT    01004058   gettimeofday 0

Relocation section '.rela.text' at offset 0x1a334 contains 211 entries:
 Offset     Info    Type            Sym.Value  SymName Addend
01003998  0000380a R_PPC_REL24       01004048   hello 0

Symbol table '.dynsym' contains 21 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     3: 01004048     0 FUNC    GLOBAL DEFAULT  UND hello
     4: 01004050     0 FUNC    GLOBAL DEFAULT  UND memcpy

Symbol table '.symtab' contains 99 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
    56: 01004048     0 FUNC    GLOBAL DEFAULT  UND hello
    57: 01004050     0 FUNC    GLOBAL DEFAULT  UND memcpy

Compared with the new binutils:

Relocation section '.rela.plt' at offset 0x468 contains 3 entries:
 Offset     Info    Type            Sym.Value  SymName Addend
01004048  00000315 R_PPC_JMP_SLOT    00000000   hello 0
01004050  00000415 R_PPC_JMP_SLOT    00000000   memcpy 0
01004058  00000d15 R_PPC_JMP_SLOT    00000000   gettimeofday 0

Relocation section '.rela.text' at offset 0x25584 contains 211 entries:
 Offset     Info    Type            Sym.Value  SymName Addend
01003998  00003a0a R_PPC_REL24       00000000   hello 0

Symbol table '.dynsym' contains 21 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     3: 00000000     0 FUNC    GLOBAL DEFAULT  UND hello
     4: 00000000     0 FUNC    GLOBAL DEFAULT  UND memcpy

Symbol table '.symtab' contains 99 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
    58: 00000000     0 FUNC    GLOBAL DEFAULT  UND hello
    59: 00000000     0 FUNC    GLOBAL DEFAULT  UND memcpy


Not sure what is "correct", but it is probably related to https://www.amigans.net/modules/newbb/ ... id=136903#forumpost136903.
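For anyone comparing the two listings by hand, the Info column packs the symbol index and the reloc type into one 32-bit word (the standard ELF32_R_SYM / ELF32_R_TYPE split). A minimal Python sketch, using only values from the readelf output above; the helper name is mine, not anything from binutils or elf.library:

```python
# Relocation type numbers as defined by the PowerPC ELF ABI.
R_PPC_REL24 = 10
R_PPC_JMP_SLOT = 21


def decode_r_info(info):
    """Split a 32-bit ELF r_info word into (symbol index, reloc type)."""
    return info >> 8, info & 0xFF


# Info values copied from the listings above.
print(decode_r_info(0x00000315))  # .rela.plt entry for hello
print(decode_r_info(0x0000380A))  # old binutils .rela.text entry for hello
print(decode_r_info(0x00003A0A))  # new binutils .rela.text entry for hello
```

The .rela.text entries decode to symbol indices 56 and 58, which line up with the .symtab rows for hello in the old and new builds respectively, so the relocs themselves point at the same symbol; only Sym.Value differs.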

I think this is kind of linked to Alfkil's pointer equality "fix" in elf.library. I had to undo some of that for R_PPC_REL24 relocs because his change caused crashes due to out of range 24-bit branches. So, you might find your test doesn't crash with elf.library 53.46 (will crash with newer versions). Likewise if you re-add the patch that stops st_value being set to zero, it shouldn't crash.

However, I'm not sure any of that is a proper solution, so the big question is: what is the correct solution? I'm happy to adjust elf.library, but how... Ideally, R_PPC_REL24 relocs could be ignored, because the relative branches are usually already in the code. I tried that the other day, but found some cases where the reloc needed to be applied (newlib calls in some libraries). I suppose elf.library could be changed to ignore R_PPC_REL24 relocs completely if st_value is 0, which might solve these crashes.

Edit: Just tried my own suggestion above and your new binutils test_dyn is now working fine. If you can confirm your new binutils test_dyn doesn't crash with elf.library 53.46, it would make me a bit happier that this fix is the correct thing to do. FYI, once we can confirm that elf.library is handling things correctly and fully fixed, I'm going to bump it to v54.
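To make the two ideas in this post concrete, here is a rough Python sketch. It is not elf.library code: both function names are hypothetical, and the "skip when st_value is 0" rule is simply the suggestion described above. The range check reflects the PowerPC branch encoding, where the 24-bit field is shifted left by two bits to give a signed 26-bit byte displacement.

```python
def rel24_in_range(place, target):
    """True if a PowerPC b/bl instruction at `place` can reach `target`.

    The 24-bit field holds a word offset, so the reachable byte
    displacement is -0x2000000 .. 0x1FFFFFC and must be word aligned.
    """
    disp = target - place
    return -0x2000000 <= disp <= 0x1FFFFFC and disp % 4 == 0


def should_apply_rel24(st_value):
    """Hypothetical rule from the post: ignore the reloc when st_value is 0."""
    return st_value != 0


# Branch site and PLT slot taken from the old binutils listing.
print(rel24_in_range(0x01003998, 0x01004048))
# A made-up far-away target, to show an out-of-range branch.
print(rel24_in_range(0x01003998, 0x7F000000))
print(should_apply_rel24(0x00000000))
```

The first call is within range and the second is not, which is the kind of out-of-range 24-bit branch that caused the crashes mentioned above.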



Re: gcc 9 and 10


Sorry for the delayed reply - have been busy attempting to rewrite the reloc code in elf.library, so that it's easier to understand and less easy to break in the future, which might not be possible!

@MigthyMax

This looks like a linker issue to me - segments 2 and 3 overlap, which shouldn't happen:

segment 2: 0x01000000-0x010050B0
segment 3: 0x01005000-0x010050A6
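The overlap can be checked mechanically. A small Python sketch using the two ranges above; treating them as half-open (start, end) ranges is my assumption, but they overlap either way:

```python
def overlaps(a, b):
    """True if half-open address ranges a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_size(a, b):
    """Number of bytes shared by the two ranges (0 if they do not intersect)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


segment2 = (0x01000000, 0x010050B0)
segment3 = (0x01005000, 0x010050A6)

print(overlaps(segment2, segment3))            # True
print(hex(overlap_size(segment2, segment3)))   # 0xa6
```

In other words, segment 3 sits entirely inside the tail of segment 2.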

ELF.debug also has levels - it can be set to 1-9 (maybe higher too). The higher the value, the more debug output you get.

@kas1e

I don't have anything to test it with, as I don't think I have anything with a .fini_array and .init_array in them, or anything that can generate them. However, it looks like it should work fine, but the current logic involved is that it looks for .ctors first, and if .ctors is not present, it then looks for .init_array instead. Same for .dtors and .fini_array. So, if you were to have both in the file, .ctors/dtors will always get used.
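The lookup order described above can be sketched like this in Python. It only models the selection logic as I described it; the function name is made up and this is not the actual elf.library source:

```python
def pick_array_section(section_names, legacy, modern):
    """Return the section to run: `legacy` if present, else `modern`, else None."""
    if legacy in section_names:
        return legacy
    if modern in section_names:
        return modern
    return None


both = {".text", ".ctors", ".init_array", ".dtors", ".fini_array"}
modern_only = {".text", ".init_array", ".fini_array"}

print(pick_array_section(both, ".ctors", ".init_array"))         # .ctors
print(pick_array_section(modern_only, ".ctors", ".init_array"))  # .init_array
print(pick_array_section(modern_only, ".dtors", ".fini_array"))  # .fini_array
```

So a file carrying both kinds of section never has its .init_array/.fini_array entries run under this logic.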



Re: gcc 9 and 10


@MigthyMax

Quote:

So instead of using the program headers (segments) to allocated memory
as described by the elf file, the elf.library allocates memory by sections?
I thought that this is one of the smart ideas of the elf file format, that
a linker just allocates memory as describes and than populate it from the
content, resp. fills its as described by the ABI.

Yes. In the normal case, elf.library scans the program headers and allocates a memory block to hold each segment. Then, if/when it needs to load a section, the section is loaded into its defined location in the segment.

However, currently, segments will be ignored if that segment contains .rodata. Then, instead of the above, each section is allocated individually later when it needs to be loaded. This means even for the old binutils when .rodata is in its own segment, that segment is ignored too.

For the next elf.library, I'm tweaking this slightly, so that segments will always be loaded in their entirety, unless the segment contains .rodata and an executable section (i.e. old binutils where .rodata and .text were placed in the same segment).
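As a rough illustration of the planned rule, here is a Python sketch. The data layout (a list of (name, is_executable) pairs per segment) and the function name are purely hypothetical; the rule itself is the one stated above:

```python
def load_segment_whole(sections):
    """Decide whether a segment is loaded as one block.

    `sections` is a list of (name, is_executable) pairs for one segment.
    Planned rule: load the whole segment unless it holds both .rodata
    and at least one executable section.
    """
    has_rodata = any(name == ".rodata" for name, _ in sections)
    has_exec = any(is_exec for _, is_exec in sections)
    return not (has_rodata and has_exec)


print(load_segment_whole([(".text", True), (".plt", True)]))     # True
print(load_segment_whole([(".rodata", False)]))                  # True
print(load_segment_whole([(".text", True), (".rodata", False)])) # False
```

The last case is the old-style layout where .rodata and .text share a segment, which would still fall back to per-section loading.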

Quote:

And you say, that is because of the support of M68k stubs stuff?

That's what the comment alluded to, but I've still yet to get my head around exactly why. I vaguely remember that in the old days the stubs, which were asm based back then, were maybe not placed in the correct section or something.

Quote:

Quote:

This makes the ctors/dtors crash (as elf.library tries to execute address
0xfffffffc instead of skipping this entry), when I try to recompile your
test code here.

Mhh, that could be case why stuff crashes with new binutils. But i read
further down that this is already fixed in new elf.library. And that
ctor/dtor stuff heavily even depends on the used c library.

Certainly if you are seeing ISI errors with the instruction pointer at 0xfffffffc, or thereabouts, that will be the reason. I think the whole reason Alfkil changed elf.library to use .ctors/dtors directly, instead of relying on __shlib_call_constructors(), was to make it independent of newlib/clib2 (and to handle broken apps that had unterminated ctor/dtor arrays). It wouldn't have been a problem if the recent change to newlib had been mirrored in elf.library at the same time.

The dtors not working correctly is almost certainly a bug in newlib.library and I've filed a report for that so hopefully it gets fixed soon.

(Un)fortunately, I got carried away with fixing these things in elf.library and have fixed and improved some other things too. I should be working on a different project, but whilst I have all these issues and inner workings of elf.library in my head, it's best to flush them all out before I forget.



Re: gcc 9 and 10


@kas1e

Which version was the older elf.library that you were using (you quoted the newlib version instead)?

There are several elements involved here:

1. Prior to elf.library 53.37, __shlib_call_constructors()/__shlib_call_destructors() were called (located in SDK:newlib/shcrtbegin.o). From 53.37 onwards, it doesn't use those anymore and instead directly reads .ctors/.dtors and executes the functions specified there.

2. No matter which binutils, compiler or newlib.library you are using, you also have to consider which version of the newlib dev files you have installed in SDK:newlib/lib. For example, the new startup code supplied with 53.84 sets the initial .ctors/.dtors entry to 0xffffffff instead of 0, which causes elf.library to crash. If you update only "newlib.library.kmod" and leave SDK:newlib/lib at some older version, it won't crash. Likewise, if you use some older "newlib.library.kmod", but have SDK:newlib/lib at 53.84, it will still crash.

In my case, I only have the official native compilers and binutils installed, from SDK 54.6, including GCC 8 and 11, plus 4.0.4 and 4.2.4 from older SDKs. No matter what I use, dtor and dtor2 will not print when using newlib (latest SDK:newlib from 53.84), only clib2 works. I'm going to assume this is a bug and that the destructors are being called too late in newlib, so I'll file a Bugzilla ticket for that if there isn't already one there.

Regarding elf.library, I've already fixed the 0xffffffff .ctors/.dtors handling and tested that it doesn't crash anymore. The fix will be available when I upload 53.43 for beta testing.

For the new binutils, the change that is required is to ensure the .rodata section is placed in its own read-only segment by itself for dynamic objects, just as the old binutils does. It is not a good idea to put any other sections in the segment holding .rodata, due to previously discussed reasons.

Let me know if I have missed anything.



Re: gcc 9 and 10


@kas1e

I think you'll find dtor and dtor2 are actually being successfully called - by that point, it looks like the c lib has possibly been shut down beforehand, so there is no output with puts/printf. Add DebugPrintF to dtor and dtor2 and you will see that.

I'm finding this to be the case with a simple static test and no dynamic objects. No dtor/dtor2 output when compiling with newlib, but works fine with clib2.

I also note that in newlib 53.84, there was this change:

Quote:
Changed the initial .ctors/.dtors sentinel entries from NULL to 0xffffffff to conform with the ELF standard.

This makes the ctors/dtors crash (as elf.library tries to execute address 0xfffffffc instead of skipping this entry), when I try to recompile your test code here.
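A Python sketch of why the sentinel matters. The table contents and function names are made up for illustration. The masking step reflects the fact that PowerPC branch targets are word aligned, so the low two bits are dropped; reading that as the source of the exact 0xfffffffc crash address is my assumption.

```python
# Sentinel values seen at the start of .ctors/.dtors: 0 (old newlib
# startup code) and 0xffffffff (newlib 53.84 startup code).
SENTINELS = (0x00000000, 0xFFFFFFFF)


def callable_entries(table):
    """Return only the entries that are real function addresses."""
    return [entry for entry in table if entry not in SENTINELS]


def branch_address(entry):
    """Address actually fetched from when branching to `entry` (low 2 bits dropped)."""
    return entry & ~0x3 & 0xFFFFFFFF


# Hypothetical .ctors contents: sentinel, two constructors, terminator.
ctors = [0xFFFFFFFF, 0x01003A00, 0x01003A40, 0x00000000]

print([hex(e) for e in callable_entries(ctors)])
print(hex(branch_address(0xFFFFFFFF)))  # 0xfffffffc
```

A loader that only skips 0 would try to call the first entry of this table and end up at 0xfffffffc.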



Re: gcc 9 and 10


@kas1e

Yes, .rodata must not be in the same segment as the .text/.plt segment. Also, .text and .plt must be placed together in the same segment - it shouldn't matter where .plt is in the segment, just as long as the branches into the PLT are correct.

Obviously, .rodata should be placed in a read-only segment, not a writable segment - it doesn't need to be executable.

So, with your "rodata_rw" test case, what exactly is the issue? Here, with elf.library 53.42, I get the output:

6> ram:rodata_rw/test_dyn
ctor
ctor2
function 123
main result 123
6>

Slightly curious that the output from the dtor functions does not appear, yet with ELF.debug set to 1, it is clear that it is running the dtor functions.



Re: gcc 9 and 10


Having modified elf.library to always load the .text/.plt segment in one go, rather than each section separately, the OS won't boot, so it looks like I won't be changing that behaviour. Perhaps it could be an alignment issue, although I'm not really sure - in both cases, .rodata is marked as read-only, as you would expect.

What matters is that the behaviour was specifically changed to load the .text/.plt segment in one go, mainly to make the PLT stuff work for dynamic objects. So, we know what the correct thing is for binutils to do now.

If there are still issues with ctors/dtors not working, even when .rodata is placed in a separate read-only segment, I'm happy to check the relevant code in the latest elf.library (I believe Alfkil changed it so that it calls the ctors/dtors directly, right?). Just point me in the direction of a compiled test case, so I can reproduce the issue here.



Re: gcc 9 and 10


@kas1e

Yes, I'm using your test case that had .rodata in the same segment as .text/.plt. This was perfect to exactly understand what was really causing elf.library to crash the code. It had nothing to do with .rodata being in a writable, read-only or executable segment. It was simply the placement - for dynamic objects, .rodata must not be placed in the same segment as .text/.plt with the current elf.library. As you say, .rodata should obviously be read-only. That must be why the old binutils did it that way.

However, I'm pondering whether elf.library needs to do this anymore. So far, I'm struggling to understand what the original reason was, so I'm reluctant to change anything. The change happened in elf.library 53.4:

Quote:
Program header loadable segments are now loaded as segment even if they are executable. This caused some problems with older binaries that have .text and .rodata in the same segment; if the loader detects such a segment, it is ignored and the sections are loaded separately. This behaviour is required because the PLT must be correctly placed relative to the text segment.


Edited by Futaura on 2023/9/8 18:50:12
Edited by Futaura on 2023/9/8 19:50:15


Re: gcc 9 and 10


Annoyingly, I had earlier ignored "Ignoring this program header because we have an .rodata segment" in the debug output, which you'll see with ELF.debug set to 1, thinking it wasn't relevant. But this would appear to be the reason for the behaviour I described in my previous message. That is why it works fine when .rodata is in its own segment, because then the segment containing the .text/.plt sections is loaded into a single block of memory, making the PLT branches correct.

Why does elf.library need to do this? To make 68k cross calls work, according to the comments (I need to do some more research to understand this!).

So, does the old binutils always put .rodata in its own segment for dynamic objects? I'm sure .text and .rodata have always gone in the same segment for normal non-dynamic stuff.



Re: gcc 9 and 10


@kas1e

I'm afraid I don't have the time to start compiling binutils, so I'm just using your example files that you posted to the beta list.

@MigthyMax

You're right, of course - serves me right for starting to look at this late at night! I've traced through the PLT setup and it does actually look to be working correctly. Instead, the problem appears to be that elf.library is allocating and loading all the sections in segment 1 separately (which includes .text, .plt and .rodata). I haven't quite figured out why yet - it only seems to be seeing the data segment during these searches. The end result is that, because R_PPC_PLTREL24 relocs are intentionally ignored, the in-place branches jump to the wrong address. For it to work, .text and .plt must be located in the same block of memory, so the offsets to the .plt jump table are correct. Just trying to figure out why this isn't happening.

The PLT jump table is created at load time, with the jumps poked into the table when R_PPC_JMP_SLOT relocs are resolved (after other necessary objects have been loaded). The jump table is part of the .plt section, which the fixed branches in .text point to. I don't think the problem is directly related to this now.
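The point about .text and .plt needing to share one block of memory can be shown with a little arithmetic. This Python sketch uses the branch site and PLT slot addresses from the readelf output posted elsewhere in this thread; the load deltas and the function name are hypothetical:

```python
def branch_lands_on_plt(branch_addr, plt_entry, text_delta, plt_delta):
    """Check whether an unrelocated relative branch still reaches its PLT entry.

    The displacement is fixed at link time. At run time .text has moved by
    `text_delta` and .plt by `plt_delta`; the branch only works if both
    sections moved by the same amount.
    """
    disp = plt_entry - branch_addr            # baked into the instruction
    runtime_branch = branch_addr + text_delta
    runtime_plt = plt_entry + plt_delta
    return runtime_branch + disp == runtime_plt


# Segment loaded as a single block: both sections shift together.
print(branch_lands_on_plt(0x01003998, 0x01004048, 0x6F000000, 0x6F000000))  # True
# Sections allocated separately: different shifts, branch misses the PLT.
print(branch_lands_on_plt(0x01003998, 0x01004048, 0x6F000000, 0x6F100000))  # False
```

Since the R_PPC_PLTREL24 relocs are ignored, nothing corrects the displacement in the second case.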



Re: gcc 9 and 10


@MigthyMax

Fortunately, I can build elf.library and add extra debug to try to figure out what is going on...

I'm not an expert on dynamic objects, but looking at the .rela.plt section, all the R_PPC_JMP_SLOT symbol values are 0 (readelf -r) with the new binutils. The elf lib uses these values to calculate the "real address", which it then puts in the PLT. I don't think they should be 0 - they need to point to the location of the respective function in .text (at least, that's the case in libc.so).



Re: gcc 9 and 10


The elf.library relnotes in question appear to indicate that there may have been special modifications to binutils to make the PLT work:

- R_PPC_PLTREL24 relocs are now ignored for version 2 shared objects, since the
jumps are already correctly targeted at the PLT entries.

- R_PPC_JMP_SLOT relocs now correctly update the dynamic symbol table. On top
of that, the original PLT setup is no longer written here but rather when
setting up the PLT section.

- Lazy binding works now that the PLT is actually used. Note that for all of
the above to work, the latest binutils are required.

This is all from 2009.

I might be wrong, but it looks like the puts@plt jump is not targeted at a PLT entry.



Re: gcc 9 and 10


@kas1e

Here it is crashing when calling the .ctors functions. I've double-checked all this by disassembling and peeking at the crashed code in memory and it seems .ctors is being relocated correctly. Although the crashlog might not always indicate it, it looks like it is the puts@plt call which crashes, because that jumps to the wrong address due to the reloc not having been applied. R_PPC_PLTREL24 relocs are always ignored by elf.library (there's a note about it in the relnotes).

I'm not up to speed with shared objects, so my question is where should the jump offset be corrected? Is this something that elf.library should be doing or is the wrong code being generated by the compiler?

Certainly, I've not seen any crashes that indicate that .rodata is being written to, so it is probably related to the physical position of .rodata - relocs will be different then too.

(apologies for the heavy re-edit - hopefully, nobody read the first version)


Edited by Futaura on 2023/8/30 17:22:40
Edited by Futaura on 2023/8/30 17:24:47


Re: gcc 9 and 10


I'd say .rodata being writable may be a red herring. Does the output before the crash, with "setenv ELF.debug 1", give any clues?



Re: gcc 9 and 10


Quote:
joerg wrote:

Edit2: You can close https://github.com/migthymax/binutils-gdb/issues/5
I thought that there would have been (at least) 3 segments (read-only executable, read-only non-executable and writable), but according to Futaura it always were only 2 segments (read-only and writable).

Of course, nobody should take my word for it - however, "readelf -l" on most (if not all) OS components will show this to be the case. Two segments, unless there is no rw data at all, in which case a single read-only executable segment (e.g. "C:Version").
