May 24, 2015
On Sun, 24 May 2015 15:40:04 -0400, bitwise <bitwise.pvt@gmail.com> wrote:

> [snip]
>
> So at this point, it seems like these two fixes work as expected, but now, I'm having some new and very strange problems.
>
> I have a simple shared library and program I've been using to test this:
>
> [main.d]
> module main;
> import std.stdio;
> import std.conv;
> import std.string;
> import core.sys.posix.dlfcn;
>
> void main(string[] args)
> {
>      alias void function() fnType;
>
>      void *handle = dlopen("myShared.dylib", RTLD_NOW);
>      assert(handle);
>
>      fnType init = cast(fnType)dlsym(handle, "initLib");
>      assert(init);
>      init();
>
>      fnType term = cast(fnType)dlsym(handle, "termLib");
>      assert(term);
>      term();
>
>      dlclose(handle);
>      writeln("done");
> }
>
> [myShared.d]
> module myShared;
> import core.runtime;
> import std.stdio;
>
> extern(C) void initLib() {
>      writeln("Initializing Runtime");
>      Runtime.initialize();
> }
>
> extern(C) void termLib() {
>      writeln("Terminating Runtime");
>      Runtime.terminate();
> }
>
>
> So, when I run the above program, rt_init() should be called once for the main program and once for the shared library. However, rt_init() from the main program seems to get called twice. To clarify: when I retrieve "initLib()" with dlsym() and call it, the rt_init() that runs is the one from the main module.
>
> This seems to prove the above:
>
> In dmain2.d, I have modified rt_init() as follows:
>
> extern (C) int rt_init()
> {
>      import core.sys.posix.dlfcn;
>      Dl_info info;
>      if(dladdr(&rt_init, &info))
>          fprintf(stdout, "RT INIT: %s\n", info.dli_fname);   // this prints "main" for both calls
>
>      if (atomicOp!"+="(_initCount, 1) > 1)
>      {
>          fprintf(stdout, "RT ALREADY INITIALIZED\n");
>          return 1;
>      }
>
> // ...
>
>      fprintf(stdout, "RT INIT COMPLETE\n");
> }
>
> When the main program calls rt_init(), the output correctly reads "RT INIT COMPLETE".
> When I load the dynamic library however, I get the output "RT ALREADY INITIALIZED"
>
> How is this possible? I am not using a shared druntime afaik..

I just found that gcc has "-fvisibility=hidden" and "-fvisibility-ms-compat", which both make symbols hidden by default. The lack of this functionality in dmd seems to explain the above problem.

Is this correct?
Any ideas?

  Bit
May 25, 2015
I've been reading through the Mach-O docs[1], and it seems that dynamic libs are treated the same as static libs in that exported symbols can only be defined once, even across dynamically loaded libraries. This is why calling rt_init from my dylib ended up calling the one that was already defined in the application image. I was able to call the dylib's own rt_init by specifically requesting it through dlsym, but all the global variables it used (_isRuntimeInitialized, etc.) still resolved to the ones in the application image.

At this point, my impression is that it would be very impractical, if not impossible, to have a separate druntime for each shared library. Even when you do link separate runtimes, dyld still treats all the exported symbols as shared.
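
A minimal sketch of that effect, using hypothetical names (fakeRtInit, fakeRtTerm, initCount) rather than druntime's real internals. It only models the counting idiom from the modified rt_init quoted earlier: when two callers end up sharing one counter, the second "initialization" is just an increment.

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical stand-in for the runtime's private init counter. If the
// dylib's copy of this symbol resolves to the application's copy, both
// callers below operate on the same variable.
shared size_t initCount;

/// Returns true only for the call that actually performs initialization.
bool fakeRtInit()
{
    // Only the caller that moves the counter from 0 to 1 does the work.
    if (atomicOp!"+="(initCount, 1) > 1)
        return false; // "RT ALREADY INITIALIZED"
    // ... one-time setup would go here ...
    return true;      // "RT INIT COMPLETE"
}

/// Returns true only for the call that performs the final teardown.
bool fakeRtTerm()
{
    if (atomicLoad(initCount) == 0)
        return false; // nothing to terminate
    return atomicOp!"-="(initCount, 1) == 0;
}
```

With this model, the main program's call returns true and a later call from the "library" returns false, which matches the output reported above.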


@Martin
So correct me if I'm wrong, but it seems the _only_ choice is a shared druntime for osx.
Could you elaborate a bit on how you've managed to do this for linux?


[1] https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachOTopics/0-Introduction/introduction.html

Thanks,

  Bit
May 25, 2015
On 2015-05-25 16:33, bitwise wrote:
> I've been reading through the Mach-O docs[1], and it seems that dynamic
> libs are treated the same as static libs in that exported symbols can
> only be defined once, even across dynamically loaded libraries. This is
> why calling rt_init from my dylib ended up calling the one that was
> already defined in the application image. I was able to call the rt_init
> from the dylib by specifically requesting it through dlsym, but then,
> all the global variables it used(_isRuntimeInitialized, etc) still ended
> up resolving to the ones in the application image.
>
> At this point, my impression is that it would be very impractical, if
> not impossible to have separate druntimes for each shared library. Even
> when you do link separate runtimes, dyld still treats all the exported
> symbols as shared.
>
>
> @Martin
> So correct me if I'm wrong, but it seems the _only_ choice is a shared
> druntime for osx.
> Could you elaborate a bit on how you've managed to do this for linux?

If I recall correctly, this is how it works:

On Linux the compiler generates a function which is placed in a special section in the binary, the same as annotating a C function with __attribute__((constructor)). This function calls a function in druntime. This gives similar properties to _dyld_register_func_for_add_image on OS X, but without the issues with registered callbacks.

Each image then becomes responsible for initializing itself. The image updates the shared data structure containing the necessary data (TLS, exception handling tables, ...). As you noticed yourself, all global symbols are shared across all images.
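
As a rough illustration of the constructor approach in D itself: pragma(crt_constructor) is the D counterpart of __attribute__((constructor)). The table and the registerImage function below are hypothetical stand-ins for whatever druntime actually exposes, and the image name is made up.

```d
// Hypothetical process-wide table. In a real setup this would live in the
// single shared copy of druntime so that every image sees the same data.
enum maxImages = 16;
__gshared const(char)*[maxImages] registeredImages;
__gshared size_t registeredCount;

/// Hypothetical registration entry point (stands in for the druntime hook).
extern (C) void registerImage(const(char)* name) nothrow @nogc
{
    if (registeredCount < maxImages)
        registeredImages[registeredCount++] = name;
}

// The C runtime calls this when the image is loaded, before druntime is
// initialized, so it must not rely on the GC or other runtime services.
pragma(crt_constructor)
extern (C) void thisImageCtor() nothrow @nogc
{
    registerImage("example-image");
}
```

Each image that contains such a constructor announces itself on load, so no callback has to be registered with (or unregistered from) the dynamic loader.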

-- 
/Jacob Carlborg
May 25, 2015
On Mon, 25 May 2015 14:39:37 -0400, Jacob Carlborg <doob@me.com> wrote:

> On 2015-05-25 16:33, bitwise wrote:
>> I've been reading through the Mach-O docs[1], and it seems that dynamic
>> libs are treated the same as static libs in that exported symbols can
>> only be defined once, even across dynamically loaded libraries. This is
>> why calling rt_init from my dylib ended up calling the one that was
>> already defined in the application image. I was able to call the rt_init
>> from the dylib by specifically requesting it through dlsym, but then,
>> all the global variables it used(_isRuntimeInitialized, etc) still ended
>> up resolving to the ones in the application image.
>>
>> At this point, my impression is that it would be very impractical, if
>> not impossible to have separate druntimes for each shared library. Even
>> when you do link separate runtimes, dyld still treats all the exported
>> symbols as shared.
>>
>>
>> @Martin
>> So correct me if I'm wrong, but it seems the _only_ choice is a shared
>> druntime for osx.
>> Could you elaborate a bit on how you've managed to do this for linux?
>
> If I recall correctly, this is how it works:
>
> On Linux the compiler generates a function which is placed in a special section in the binary, the same as annotating a C function with __attribute__((constructor)). This function calls a function in druntime. This gives similar properties to _dyld_register_func_for_add_image on OS X, but without the issues with registered callbacks.
>
> Each image then becomes responsible for initializing itself. The image updates the shared data structure containing the necessary data (TLS, exception handling tables, ...). As you noticed yourself, all global symbols are shared across all images.
>

So then I think I have a full solution:
1) _dyld_register_func_for_add_image should be taken care of with the above two fixes
2) __attribute__((constructor/destructor)) can be added to druntime when building for osx like in the file dylib_fixes.c [1]
3) Copy and paste rt_init/rt_term, rename them to dylib_init/dylib_term, and remove everything except what's needed to initialize a shared lib's image.

Does this make sense?

Thanks,
  Bit


[1] https://github.com/D-Programming-Language/druntime/blob/61ba4b8d3c0052065c17ffc8eef4f11496f3db3e/src/rt/dylib_fixes.c
May 25, 2015
On Monday, 25 May 2015 at 14:33:43 UTC, bitwise wrote:
> At this point, my impression is that it would be very impractical, if not impossible to have separate druntimes for each shared library. Even when you do link separate runtimes, dyld still treats all the exported symbols as shared.

Yes, you can't mix them with a D executable.
May 25, 2015
On Monday, 25 May 2015 at 19:40:52 UTC, bitwise wrote:
> 1) _dyld_register_func_for_add_image should be taken care of with the above two fixes

You still cannot unregister the callback, so it can't be used for dynamically loading druntime. Last time we talked about this problem, we found some undocumented function that could be deregistered.

> 2) __attribute__((constructor/destructor)) can be added to druntime when building for osx like in the file dylib_fixes.c [1]

For linux we let the compiler emit comdat constructors into every D object, so you'll end up with exactly a single function for any binary containing D code.
I don't think you need ctors/dtors on OSX if you already have the dylib callback.

> 3) Copy and paste rt_init/rt_term, rename them to dylib_init/dylib_term, and remove everything except what's needed to initialize a shared lib's image.

Not sure what you want to copy, but for sure you need to extend sections_osx to handle multiple dylibs. It should be very similar to _d_dso_registry for ELF, except that you iterate over the sections of a dylib image to get EH tables and ModuleInfo.
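
A sketch of what such a per-image registry could look like. The type and field names here are illustrative assumptions and do not mirror druntime's actual SectionGroup or _d_dso_registry code; the point is only that each loaded image gets its own record, keyed by its header.

```d
/// Hypothetical per-image record.
struct ImageSections
{
    const(void)* header;      // identifies the image (e.g. its mach_header)
    const(void)[] ehTables;   // exception-handling tables of this image
    const(void)[] moduleInfo; // ModuleInfo section of this image
}

/// Hypothetical registry of all loaded images that contain D code.
struct ImageRegistry
{
    ImageSections[] images;

    /// Called when an image is loaded; duplicates are ignored.
    void add(ImageSections img)
    {
        if (find(img.header) is null)
            images ~= img;
    }

    /// Called when an image is unloaded; returns true if it was known.
    bool remove(const(void)* header)
    {
        foreach (i, ref img; images)
        {
            if (img.header is header)
            {
                images = images[0 .. i] ~ images[i + 1 .. $];
                return true;
            }
        }
        return false;
    }

    /// Returns the record for the given header, or null if not registered.
    ImageSections* find(const(void)* header)
    {
        foreach (ref img; images)
            if (img.header is header)
                return &img;
        return null;
    }
}
```

The add and remove methods are the natural targets for whatever notifies the runtime about image loads and unloads, whether that is a dyld callback or a per-image constructor.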
May 26, 2015
On 2015-05-25 22:58, Martin Nowak wrote:
> On Monday, 25 May 2015 at 19:40:52 UTC, bitwise wrote:
>> 1) _dyld_register_func_for_add_image should be taken care of with the
>> above two fixes
>
> You still cannot unregister the callback, so it can't be used for
> dynamically loading druntime. Last time we talked about this problem, we
> found some undocumented function that could be deregistered.
>
>> 2) __attribute__((constructor/destructor)) can be added to druntime
>> when building for osx like in the file dylib_fixes.c [1]
>
> For linux we let the compiler emit comdat constructors into every D
> object, so you'll end up with exactly a single function for any binary
> containing D code.
> I don't think you need ctors/dtors on OSX if you already have the dylib
> callback.

I'm not sure if there is any other solution. There is one private function, "dyld_register_image_state_change_handler", that should work. I think it works because the image is never unloaded. I have not seen any function for deregistering a callback, private or public.

Isn't it better to avoid private undocumented functions?

-- 
/Jacob Carlborg
May 26, 2015
On 2015-05-25 21:40, bitwise wrote:

> So then I think I have a full solution:
> 1) _dyld_register_func_for_add_image should be taken care of with the
> above two fixes
> 2) __attribute__((constructor/destructor)) can be added to druntime when
> building for osx like in the file dylib_fixes.c [1]
> 3) Copy and paste rt_init/rt_term, rename them to dylib_init/dylib_term, and
> remove everything except what's needed to initialize a shared lib's image.
>
> Does this make sense?

Do you plan to use __attribute__((constructor)) instead of _dyld_register_func_for_add_image?

As Martin said, you need to look at sections_osx.d and _d_dso_registry.

What do you plan to do about TLS?

-- 
/Jacob Carlborg
May 26, 2015
On Tue, 26 May 2015 02:28:14 -0400, Jacob Carlborg <doob@me.com> wrote:

> On 2015-05-25 22:58, Martin Nowak wrote:
>> On Monday, 25 May 2015 at 19:40:52 UTC, bitwise wrote:
>>> 1) _dyld_register_func_for_add_image should be taken care of with the
>>> above two fixes
>>
>> You still cannot unregister the callback, so it can't be used for
>> dynamically loading druntime. Last time we talked about this problem, we
>> found some undocumented function that could be deregistered.
>>
>>> 2) __attribute__((constructor/destructor)) can be added to druntime
>>> when building for osx like in the file dylib_fixes.c [1]
>>
>> For linux we let the compiler emit comdat constructors into every D
>> object, so you'll end up with exactly a single function for any binary
>> containing D code.
>> I don't think you need ctors/dtors on OSX if you already have the dylib
>> callback.
>
> I'm not sure if there is any other solution. There is one private function, "dyld_register_image_state_change_handler", that should work. I think it works because the image is never unloaded. I have not seen any function for deregistering a callback, private or public.

I think Martin is right. We don't need ctors/dtors or any compiler fanciness. All we need is the two callbacks, which can be registered when druntime is initialized.

_dyld_register_func_for_add_image
_dyld_register_func_for_remove_image

At this point, we would only be registering the callbacks once in the main image, and not from the shared library. Since all global functions and symbols are shared between images anyways, receiving the callback in the main image would be fine. So in this case, unregistering the callbacks is no longer needed.

> Isn't it better to avoid private undocumented functions?
Not only better, but mandatory; otherwise Apple will reject the app from the App Store.
I am certain this is the case for iOS, and I assume it would be the same for desktop.

On Tue, 26 May 2015 02:30:51 -0400, Jacob Carlborg <doob@me.com> wrote:
>
> What do you plan to do about TLS?

How would loading shared libraries change this? Couldn't TLS, however it's implemented now, be applied to shared libraries as well?

  Bit
May 27, 2015
On 2015-05-26 18:25, bitwise wrote:

> I think Martin is right. We don't need ctors/dtors or any compiler
> fanciness. All we need is the two callbacks, which can be registered
> when druntime is initialized.
>
> _dyld_register_func_for_add_image
> _dyld_register_func_for_remove_image
>
> At this point, we would only be registering the callbacks once in the
> main image, and not from the shared library. Since all global functions
> and symbols are shared between images anyways, receiving the callback in
> the main image would be fine. So in this case, unregistering the
> callbacks is no longer needed.

What about using a D dynamic library in a C application? The C application would initialize the runtime, which would register the callback. Would unloading druntime then be undefined behavior?

> How would loading shared libraries change this? Couldn't TLS, however
> it's implemented now, be applied to shared libraries as well?

I'm not sure. The ___tls_get_addr function [1] is used when accessing a TLS variable on OS X. In all native implementations, both on OS X and Linux, the parameter is not just a void* but a struct containing the image header as well.

Looking at SectionGroup [2] and how its data is initialized [3], you can see that there's only one set of TLS data (__tls_data and __tlscoal_nt) in SectionGroup. It would be straightforward to change that to an array, but then you would not know which index to access in getTLSBlockAlloc [4], which is used by ___tls_get_addr.
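
One conceivable way around the index problem, sketched under the assumption that the pointer handed to ___tls_get_addr falls inside the TLS sections of the image that owns the variable: search the per-image ranges for the one containing the address. TlsImage and findTlsImage are hypothetical names, not existing druntime code.

```d
/// Hypothetical description of one image's TLS initializer data.
struct TlsImage
{
    const(ubyte)[] tlsData; // stands in for the __tls_data section
    const(ubyte)[] tlsCoal; // stands in for the __tlscoal_nt section
}

/// True if p points into the given range.
bool contains(const(ubyte)[] range, const(void)* p)
{
    auto b = cast(const(ubyte)*) p;
    return range.length != 0 && b >= range.ptr && b < range.ptr + range.length;
}

/// Returns the index of the image whose TLS sections contain p, or -1.
ptrdiff_t findTlsImage(const(TlsImage)[] images, const(void)* p)
{
    foreach (i, ref img; images)
        if (contains(img.tlsData, p) || contains(img.tlsCoal, p))
            return cast(ptrdiff_t) i;
    return -1;
}
```

This is a linear scan per lookup, so a real implementation would probably want to cache the result or keep the ranges sorted. Whether the address is always enough to identify the image is exactly the open question raised above.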

[1] https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d#L115

[2] https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d#L26

[3] https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d#L227-L239

[4] https://github.com/D-Programming-Language/druntime/blob/master/src/rt/sections_osx.d#L172

-- 
/Jacob Carlborg