Thread overview
What does rt/sections_elf_shared.d do? (Porting dmd to musl)
Dec 17, 2017
Yuxuan Shui
Dec 17, 2017
Yuxuan Shui
Dec 17, 2017
Joakim
Dec 18, 2017
David Nadlinger
December 17, 2017
I'm trying to get dmd and phobos working with musl. Right now I have a bootstrapped compiler built with musl, which seems to work fine. However, user applications segfault before they even reach main.

I investigated a bit. Looks like musl is not happy with how druntime uses the dlopen-related functions. When a D library loads, it calls _d_dso_registry, which tries to get a handle to the library using dlopen. This means dlopen is called on the library itself while it's still loading, which seems to break musl. Although this might also be a bug on the musl side: it tries to call init functions even when RTLD_NOLOAD is passed to dlopen.
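
To make the call pattern concrete, here is a minimal C sketch of looking up a handle for an already-loaded image from an address inside it: dladdr finds the image's file name, and dlopen with RTLD_NOLOAD asks for a handle without loading anything new. The function name handle_for_addr is made up for illustration, and the sketch is an assumption about the general shape of such a lookup, not a copy of druntime's code. Older glibc versions need -ldl at link time.

```c
#define _GNU_SOURCE /* exposes dladdr and RTLD_DEFAULT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns a handle for the already-loaded image that
 * contains addr, or NULL if no loaded image contains it. */
void *handle_for_addr(const void *addr)
{
    Dl_info info;

    /* dladdr returns 0 when addr does not belong to any loaded image. */
    if (dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return NULL;

    /* RTLD_NOLOAD: do not load the file, only hand back a handle if the
     * image is already resident. The caller owns one reference and should
     * release it with dlclose. */
    return dlopen(info.dli_fname, RTLD_LAZY | RTLD_NOLOAD);
}
```

The situation described above corresponds to calling such a helper from an initialization function of the very image being looked up, that is, while the dynamic linker is still in the middle of loading it.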

However, going through sections_elf_shared.d, it makes me feel it's doing some magic tricks with dl functions, but I don't know what for?

If my understanding is correct, it's used to register TLS storage with the GC. If that's the case, there must be simpler ways to do that.
December 17, 2017
On Sunday, 17 December 2017 at 12:45:58 UTC, Yuxuan Shui wrote:
> However, going through sections_elf_shared.d, it makes me feel it's doing some magic tricks with dl functions, but I don't know what for?
>

Looks like it's also repeating some work that is already done by the dynamic linker...

December 17, 2017
On Sunday, 17 December 2017 at 12:45:58 UTC, Yuxuan Shui wrote:
> I'm trying to get dmd and phobos working with musl. Right now I have a bootstrapped compiler built with musl, which seems to work fine. However, user applications segfault before they even reach main.
>
> I investigated a bit. Looks like musl is not happy with how druntime uses the dlopen-related functions. When a D library loads, it calls _d_dso_registry, which tries to get a handle to the library using dlopen. This means dlopen is called on the library itself while it's still loading, which seems to break musl. Although this might also be a bug on the musl side: it tries to call init functions even when RTLD_NOLOAD is passed to dlopen.
>
> However, going through sections_elf_shared.d, it makes me feel it's doing some magic tricks with dl functions, but I don't know what for?
>
> If my understanding is correct, it's used to register TLS storage with the GC. If that's the case, there must be simpler ways to do that.

It does various things to set up the ELF executable for BSD and Linux/glibc, including support for using the standard library as a shared library. Take a look at the much simpler sections_android or sections_solaris for the minimum of what's required.

You can use sections_elf_shared with shared library support turned off by adding the SHARED=0 flag when building druntime. I'd do that first before trying to modify the source for musl.
December 18, 2017
On Sunday, 17 December 2017 at 12:45:58 UTC, Yuxuan Shui wrote:
> Although this might also be a bug on the musl side: it tries to call init functions even when RTLD_NOLOAD is passed to dlopen.

Ah, interesting. Might be worth reporting as a bug indeed; without looking too hard, I didn't see anything to indicate that trying to get a handle during initialization would be forbidden (undefined behaviour/...).

> However, going through sections_elf_shared.d, it makes me feel it's doing some magic tricks with dl functions, but I don't know what for?

The module is responsible for everything related to loading and unloading images (that is, shared libraries and the main executable itself) that contain D code. This covers images loaded at runtime (dlopen(), dlclose(), etc.) as well as those linked into a program or dragged in as a dependency of another shared library.

This involves registering global data and TLS segments with the garbage collector, as you point out, but also running global and per-thread constructors and destructors (e.g. shared static this), running GC finalizers defined in shared libraries that are about to be unloaded, etc.
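
As a rough C analogue of the load-time hook involved, the sketch below uses the GCC/Clang constructor and destructor attributes: the dynamic linker runs such functions when an image is loaded and unloaded, which is the kind of entry point an image can use to register itself with a runtime. The counter and all function names are hypothetical stand-ins; the real registration work (GC ranges, module constructors, finalizers) is far more involved.

```c
#include <assert.h>

/* Hypothetical stand-in for runtime state: how many images have registered. */
static int registered_images = 0;

/* Runs when this image is loaded, before main() (GCC/Clang extension). */
__attribute__((constructor))
static void image_ctor(void)
{
    registered_images++;
}

/* Runs when this image is unloaded or the process exits normally. */
__attribute__((destructor))
static void image_dtor(void)
{
    registered_images--;
}

/* Accessor so other code can observe the registration. */
int registered_image_count(void)
{
    return registered_images;
}
```

By the time main() starts, registered_image_count() already reflects the constructor having run for the executable itself.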

All these things also need to work across multiple threads that might be loading and unloading the same libraries concurrently, and for libraries loaded indirectly as dependencies of another shared library. These two considerations are where a lot of the complexity comes from (since there are per-thread constructors, libraries can be initialized on some threads but not on others, and if a thread spawns another one, the libraries from the parent thread should also be available in the child thread, even if the parent thread later dies, etc.).
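
The bookkeeping this implies can be modelled in a few lines of C. The sketch below is a simplified, hypothetical model rather than druntime's data structures: each library carries a count of the threads using it, a child thread takes its own references to everything the parent has at spawn time, and a library only becomes unloadable when the last thread lets go. Real code would also need locking around the shared counts and would run per-thread constructors and destructors at the marked points.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LIBS 8

/* Hypothetical record for one loaded library. */
struct lib {
    const char *name;
    int refcount; /* number of threads currently using the library */
};

/* Hypothetical per-thread view of the libraries initialized on that thread. */
struct thread_libs {
    struct lib *libs[MAX_LIBS];
    size_t count;
};

/* A thread starts using a library (per-thread constructors would run here). */
int thread_add_lib(struct thread_libs *t, struct lib *l)
{
    if (t->count == MAX_LIBS)
        return -1;
    t->libs[t->count++] = l;
    l->refcount++;
    return 0;
}

/* At spawn time the child takes its own reference to each of the parent's
 * libraries, so they stay usable even if the parent exits first. */
void thread_inherit(struct thread_libs *child, const struct thread_libs *parent)
{
    child->count = 0;
    for (size_t i = 0; i < parent->count; i++)
        thread_add_lib(child, parent->libs[i]);
}

/* At thread exit, drop all references (per-thread destructors would run
 * here). Returns how many libraries are now unused by every thread. */
size_t thread_cleanup(struct thread_libs *t)
{
    size_t unloadable = 0;
    for (size_t i = 0; i < t->count; i++)
        if (--t->libs[i]->refcount == 0)
            unloadable++;
    t->count = 0;
    return unloadable;
}
```

In this model, cleaning up the parent after the child has inherited its libraries reports nothing as unloadable; only the child's later cleanup does.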

> If that's the case, there must be simpler ways to do that.

Patches are welcome. A significant amount of work (mostly by Martin Nowak, some by me on the LDC side of things) has gone into this, and we have been unable to come up with a simpler solution so far. Note that even if a less complex implementation were indeed possible, I wouldn't expect to make such a change without spending several days on testing and fixing the fallout due to e.g. linker bugs. All this needs to work in a variety of scenarios ({C, D} programs using {C, D} shared libraries at {link, run}-time; static linking with --gc-sections, etc.).

That being said, from what I understand, D shared libraries might not be very interesting for many users of musl libc. In that case, you might find it worthwhile to simply switch back to the old module registration code for your target. The latter doesn't support shared libraries, but is less complex. For LDC, the most general option would be

https://github.com/ldc-developers/druntime/blob/5afd536d25ed49286d441396f75791e54a95c593/src/rt/sections_ldc.d

which requires no runtime/linker support apart from global (static) constructors and a few widely-used magic symbols. There are also implementations that use bracketing sections for various other platforms. (You'll need to change the corresponding support code in the compiler to match; for LDC switching to the old mechanism would be a line or two, not sure about DMD.)
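
For readers unfamiliar with bracketing sections, the general technique looks roughly like this in C. It assumes a GNU-style linker (GNU ld, gold, lld), which defines __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The section name minfo, the struct, and the macro are all invented for this example and do not match what any D compiler actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-module record; real module info carries much more. */
struct module_info {
    const char *name;
};

/* Put a pointer to each record into a custom section named "minfo".
 * "used" keeps the compiler from discarding the unreferenced pointer. */
#define REGISTER_MODULE(id)                                   \
    static const struct module_info id##_info = { #id };      \
    static const struct module_info *id##_entry               \
        __attribute__((used, section("minfo"))) = &id##_info

REGISTER_MODULE(alpha);
REGISTER_MODULE(beta);

/* Provided by the linker: the first entry and one past the last entry. */
extern const struct module_info *__start_minfo[];
extern const struct module_info *__stop_minfo[];

/* Number of modules registered in this image. */
size_t module_count(void)
{
    return (size_t)(__stop_minfo - __start_minfo);
}

/* Returns the i-th registered module, or NULL if i is out of range. */
const struct module_info *module_at(size_t i)
{
    return i < module_count() ? __start_minfo[i] : NULL;
}
```

The appeal is that walking the modules needs no dl functions at all, only the two linker-defined symbols. The trade-off is that each image has its own pair of brackets, so this alone does not handle several images being loaded and unloaded at runtime.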

Also, it would be awesome if someone could write proper documentation for this core part of druntime. I've been meaning to draft an article about it for quite some time, but by now it has been on my to-do list for so long that the chances I'll ever get around to it are rather slim.

 – David