[D-runtime] DLL initialization in Druntime
January 15, 2011
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/d-runtime/attachments/20110115/76e3156f/attachment.html>
January 16, 2011
Hi,

I have not updated to the latest version yet, but it seems that it's the first time that gc_addRange is called during initialization, and this hits the bug in the gc stub code (obviously, the return statements are missing for the "proxy is null" case).
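
As a rough illustration of the missing return, here is a minimal sketch of how the stub quoted below could record a range when no proxy is set. The names `stubAddRange`, `Range`, `Proxy`, `ranges`, and `nranges` are stand-ins modelled on the quoted snippet, not the actual gcstub/gc.d declarations, and the function is renamed on purpose so it does not clash with druntime's own `gc_addRange` symbol.

```d
import core.stdc.stdlib : realloc;
import core.exception : onOutOfMemoryError;

// Hypothetical stand-ins for the declarations used by the quoted stub.
struct Range
{
    void*  pos;
    size_t len;
}

struct Proxy
{
    extern (C) void function(void*, size_t) gc_addRange;
}

__gshared Proxy* proxy   = null;
__gshared Range* ranges  = null;
__gshared size_t nranges = 0;

// Renamed from gc_addRange (assumption) to avoid a duplicate symbol at link time.
void stubAddRange(void* p, size_t sz)
{
    if (proxy is null)
    {
        // No host GC yet: remember the range locally.
        Range* r = cast(Range*) realloc(ranges, (nranges + 1) * Range.sizeof);
        if (r is null)
            onOutOfMemoryError();
        r[nranges].pos = p;
        r[nranges].len = sz;
        ranges = r;
        ++nranges;
        return; // the missing return: never fall through to a null proxy
    }
    proxy.gc_addRange(p, sz);
}
```

With the early return, a call made before any proxy is installed only grows the local `ranges` table instead of dereferencing a null pointer.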

Even once that is fixed, the proxy approach has a problem: anything allocated before the proxy is switched lives on a different heap, because the C runtime is not shared between the DLLs. This will cause problems when trying to scan or collect those objects.

I've proposed using a phobos.dll shared between DLLs, but it involves some trickery: http://d.puremagic.com/issues/show_bug.cgi?id=4071

BTW: there is a regression in 2.051 regarding DLLs in a multi-threading environment: http://d.puremagic.com/issues/show_bug.cgi?id=5382

Rainer

Walter Bright wrote:
> It was failing hard with the latest changes on loading Windows DLLs, so I looked into it.
> 
> I don't see how it ever could have worked.
> 
> Note that DLLs link with gcstub.obj. This is because a DLL shares the caller's gc instead of having a separate gc that fights the caller's. The general idea is that upon initialization the DLL sets "proxy" to point to the caller's gc, and then all gc calls are routed through the proxy.
> 
> First, in our DLL's DllMain(), we call:
> 
>         case DLL_PROCESS_ATTACH:
>             dll_process_attach(hInstance);
> 
> In dll_process_attach(), druntime calls:
> 
>     Runtime.initialize()
> 
> which calls:
> 
>     rt_init(null)
> 
> which calls:
> 
>    gc_init();
>    initStaticDataGC();
> 
> which calls:
> 
>     gcstub.gc.gc_addRange()
> 
> which looks like:
> 
> extern (C) void gc_addRange( void* p, size_t sz )
> {
>     printf("gcstub::gc_addRange() proxy = %p\n", proxy);
>     if( proxy is null )
>     {
>         Range* r = cast(Range*) realloc( ranges,
>                                          (nranges+1) * ranges[0].sizeof );
>         if( r is null )
>             onOutOfMemoryError();
>         r[nranges].pos = p;
>         r[nranges].len = sz;
>         ranges = r;
>         ++nranges;
>     }
>     return proxy.gc_addRange( p, sz );
> }
> 
> which will ALWAYS crash because proxy is null. But we never notice the crash, because rt_init() ignores exceptions when dg is null, as in:
> 
>     catch (Throwable e)
>     {
>         if (dg)
>             dg(e);
>     }
> 
> and things then proceed with a half-crashed uninitialized runtime.
> 
> *Bugs:*
> 
> 1. proxy is null. It is supposed to be initialized by rt_loadLibrary(). But, sadly, dll_process_attach() gets called first by LoadLibrary()! Disaster.
> 
> 2. gcstub's gc_addRange() and gc_addRoot() will always crash if proxy is null.
> 
> 3. gc_isCollecting() needs to be added to gcstub.gc. Otherwise, the linker mixes in gcstub with the regular gc in a truly frankensteinian mess.
> 
> 4. rt_init() should not ignore exceptions when dg is null; it should rethrow them.
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> D-runtime mailing list
> D-runtime at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/d-runtime
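
Item 4 in the bug list above can be sketched roughly as follows. This is only an illustration under assumptions: `initRuntime`, `ExceptionHandler`, and `initSteps` are hypothetical names standing in for rt_init(), its handler type, and the initialization calls, and the real rt_init() may be structured differently.

```d
// Hypothetical stand-in for the handler type passed to rt_init().
alias ExceptionHandler = void delegate(Throwable);

// Hypothetical stand-in for rt_init(): runs the initialization steps and
// reports failure through dg when a handler is supplied.
bool initRuntime(void delegate() initSteps, ExceptionHandler dg = null)
{
    try
    {
        initSteps(); // stands in for gc_init(), initStaticDataGC(), ...
        return true;
    }
    catch (Throwable e)
    {
        if (dg)
            dg(e);   // the caller asked to be told about failures
        else
            throw e; // no handler: rethrow instead of swallowing the error
    }
    return false;
}
```

With this shape, a caller that passes no handler sees the failure immediately instead of continuing with a partially initialized runtime.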


January 16, 2011

Rainer Schuetze wrote:
> Hi,
>
> I have not updated to the latest version yet, but it seems that it's the first time that gc_addRange is called during initialization, and this hits the bug in the gc stub code (obviously, the return statements are missing for the "proxy is null" case).
>
> Even once that is fixed, the proxy approach has a problem: anything allocated before the proxy is switched lives on a different heap, because the C runtime is not shared between the DLLs. This will cause problems when trying to scan or collect those objects.

Right.

>
> I've proposed using a phobos.dll shared between DLLs, but it involves some trickery: http://d.puremagic.com/issues/show_bug.cgi?id=4071
>
> BTW: there is a regression in 2.051 regarding DLLs in a multi-threading environment: http://d.puremagic.com/issues/show_bug.cgi?id=5382

This is a mystery, as there is no such variable in src/core/dll_helper.d

January 16, 2011
Walter Bright wrote:
>> BTW: there is a regression in 2.051 regarding DLLs in a multi-threading environment: http://d.puremagic.com/issues/show_bug.cgi?id=5382
> 
> This is a mystery, as there is no such variable in src/core/dll_helper.d
> 

Maybe the diff shows the wrong direction patched <-> original. Using _moduleinfo_tlsdtors_i from object_.d no longer works, so the patch adds tlsCtorRun.

January 16, 2011

Rainer Schuetze wrote:
> Walter Bright wrote:
>>> BTW: there is a regression in 2.051 regarding DLLs in a multi-threading environment: http://d.puremagic.com/issues/show_bug.cgi?id=5382
>>
>> This is a mystery, as there is no such variable in src/core/dll_helper.d
>>
>
> Maybe the diff shows the wrong direction patched <-> original. Using _moduleinfo_tlsdtors_i from object_.d no longer works, so the patch adds tlsCtorRun.
>

If you're using multithreaded dlls in Windows, have you not seen the failed initialization issues? As I posted before, it's hard to see how it ever could have worked.
January 16, 2011
Walter Bright wrote:
> 
> 
> Rainer Schuetze wrote:
>> Walter Bright wrote:
>>>> BTW: there is a regression in 2.051 regarding DLLs in a multi-threading environment: http://d.puremagic.com/issues/show_bug.cgi?id=5382
>>>
>>> This is a mystery, as there is no such variable in src/core/dll_helper.d
>>>
>>
>> Maybe the diff shows the wrong direction patched <-> original. Using _moduleinfo_tlsdtors_i from object_.d no longer works, so the patch adds tlsCtorRun.
>>
> 
> If you're using multithreaded dlls in Windows, have you not seen the failed initialization issues? As I posted before, it's hard to see how it ever could have worked.

I'm not using the gc-proxy, so I have not hit the problem. My DLL works "stand-alone", i.e. it uses the standard GC without sharing between DLLs. It loads into a C++/C# environment using COM interfaces (but uses http://d.puremagic.com/issues/show_bug.cgi?id=4092 ).

The phobos.dll stuff also just uses the standard GC, exporting the symbols to other DLLs to link against. It has been a proof of feasibility, but I'm not really using it.

January 16, 2011

Rainer Schuetze wrote:
> Walter Bright wrote:
>>
>> If you're using multithreaded dlls in Windows, have you not seen the failed initialization issues? As I posted before, it's hard to see how it ever could have worked.
>
> I'm not using the gc-proxy, so I have not hit the problem. My DLL works "stand-alone", i.e. it uses the standard GC without sharing between DLLs. It loads into a C++/C# environment using COM interfaces (but uses http://d.puremagic.com/issues/show_bug.cgi?id=4092 ).
>

That makes sense. The trouble comes when trying to share a gc.
January 16, 2011
I just saw bug 5320, also directly related to this area of the code:

http://d.puremagic.com/issues/show_bug.cgi?id=5320
  gcstub/gc.d: SEGV because of missing returns

January 16, 2011
Yup, that's bug #2. I'll get it fixed & closed.

Brad Roberts wrote:
> I just saw bug 5320, also directly related to this area of the code:
>
> http://d.puremagic.com/issues/show_bug.cgi?id=5320
>   gcstub/gc.d: SEGV because of missing returns
>
January 16, 2011
Put the remaining issue on bugzilla: http://d.puremagic.com/issues/show_bug.cgi?id=5457
