Segfault when adding a static destructor in druntime/src/rt/sections_elf_shared.d
January 08, 2019
Hi all,

I am working on issue 14650 [1] and I would like to implement a solution where static destructors are destroying global variables. However, I have the following problem in druntime/src/rt/sections_elf_shared:

struct ThreadDSO
{
    DSO* _pdso;
    static if (_pdso.sizeof == 8) uint _refCnt, _addCnt;
    else static if (_pdso.sizeof == 4) ushort _refCnt, _addCnt;
    else static assert(0, "unimplemented");
    void[] _tlsRange;
    alias _pdso this;
    // update the _tlsRange for the executing thread
    void updateTLSRange() nothrow @nogc
    {
        _tlsRange = _pdso.tlsRange();
    }
}
Array!(ThreadDSO) _loadedDSOs;


For this code, I would have to create the following static destructor:

static ~this() { _loadedDSOs.__dtor(); }

Because Array defines a destructor which sets its length to 0.

However, this code leads to a segfault when compiling any program with the runtime (betterC code compiles successfully). In my attempt to debug it, I dropped my patch and added the above-mentioned static destructor manually in druntime, which led to the same effect. Interestingly, _loadedDSOs.__dtor runs successfully; the segmentation fault comes from somewhere higher in the call path, outside the _d_run_main function (in rt/dmain2.d). I suspect that the static destructor somehow corrupts the object, which is then referenced after the main program has finished executing. Does someone well versed in druntime have any idea what's happening?

Cheers,
RazvanN



[1] https://issues.dlang.org/show_bug.cgi?id=14650
January 08, 2019
On Tuesday, 8 January 2019 at 12:54:11 UTC, RazvanN wrote:
> Hi all,
>
> I am working on issue 14650 [1]

Great!
(I am _extremely_ surprised that dtors are not called for globals.)

> and I would like to implement a solution where static destructors are destroying global variables. However, I have the following problem in druntime/src/rt/sections_elf_shared:
>
> struct ThreadDSO
> {
>     DSO* _pdso;
>     static if (_pdso.sizeof == 8) uint _refCnt, _addCnt;
>     else static if (_pdso.sizeof == 4) ushort _refCnt, _addCnt;
>     else static assert(0, "unimplemented");
>     void[] _tlsRange;
>     alias _pdso this;
>     // update the _tlsRange for the executing thread
>     void updateTLSRange() nothrow @nogc
>     {
>         _tlsRange = _pdso.tlsRange();
>     }
> }
> Array!(ThreadDSO) _loadedDSOs;
>
>
> For this code, I would have to create the following static destructor:
>
> static ~this() { _loadedDSOs.__dtor(); }
>
> Because Array defines a destructor which sets its length to 0.
>
> However, this code leads to a segfault when compiling any program with the runtime (betterC code compiles successfully). In my attempt to debug it, I dropped my patch and added the above-mentioned static destructor manually in druntime, which led to the same effect. Interestingly, _loadedDSOs.__dtor runs successfully; the segmentation fault comes from somewhere higher in the call path, outside the _d_run_main function (in rt/dmain2.d). I suspect that the static destructor somehow corrupts the object, which is then referenced after the main program has finished executing. Does someone well versed in druntime have any idea what's happening?

This is my guess:

Have a look at `_d_dso_registry` and its description. The function is also called upon shutdown and it accesses `_loadedDSOs`.
As part of shutdown, `_d_dso_registry` calls `runModuleDestructors` (which will run your compiler-inserted static dtor), but _after_ that `_d_dso_registry` accesses `_loadedDSOs` again. I don't know exactly why the segfault happens, but the code assumes in several places that `_loadedDSOs` is non-empty. For example, `popBack` is called, and `popBack` is invalid for length=0 (it will set the length to `size_t.max`!).
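
The wraparound mentioned above can be sketched in isolation. `PopDemo`, `popBackUnchecked` and `lengthAfterPopOnEmpty` are hypothetical names, not druntime's `Array`; the sketch only assumes a `popBack` that decrements the length without checking for emptiness.

```d
// Hypothetical miniature of a pointer/length container; NOT druntime's Array.
struct PopDemo
{
    int* ptr;
    size_t length;

    // Assumed pattern: shrink by one without testing for emptiness.
    void popBackUnchecked() nothrow @nogc
    {
        length = length - 1; // unsigned arithmetic wraps when length == 0
    }
}

// Models the state after a static dtor has already emptied the container.
size_t lengthAfterPopOnEmpty() nothrow @nogc
{
    PopDemo a;            // default-initialized: ptr is null, length is 0
    a.popBackUnchecked(); // 0 - 1 wraps around
    return a.length;
}
```

Any later code that trusts this length would then index far past the end of the allocation.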

I think the solution is to not have `_loadedDSOs` be of type `Array!T` but of a special type that explicitly has no dtor (i.e. the "dtor" should only be called explicitly such that the data needed for shutdown survives `runModuleDestructors`). This probably applies to more of these druntime low-level arrays and other data structures.
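
A minimal sketch of what such a type could look like, assuming a plain C-heap buffer. `ManualArray` and all of its members are hypothetical names for illustration, not an existing druntime type. The point is only that there is no `~this()`, so a compiler-inserted static dtor would have nothing to call, and teardown happens solely through an explicit `reset()`.

```d
import core.stdc.stdlib : free, realloc;

// Hypothetical container with NO destructor: the data survives until
// reset() is called explicitly by the shutdown code.
struct ManualArray(T)
{
    T* _ptr;
    size_t _length;

    @disable this(this); // no accidental copies of the raw pointer

    @property size_t length() const nothrow @nogc { return _length; }
    @property bool empty() const nothrow @nogc { return _length == 0; }

    void insertBack(T val) nothrow @nogc
    {
        // realloc(null, n) behaves like malloc(n)
        auto p = cast(T*) realloc(_ptr, (_length + 1) * T.sizeof);
        assert(p !is null, "out of memory");
        _ptr = p;
        _ptr[_length++] = val;
    }

    ref T back() nothrow @nogc
    {
        assert(!empty, "back() on empty ManualArray");
        return _ptr[_length - 1];
    }

    // Explicit teardown: frees the buffer and clears BOTH fields together.
    void reset() nothrow @nogc
    {
        free(_ptr);
        _ptr = null;
        _length = 0;
    }
}
```

The trade-off is that forgetting to call `reset()` leaks the buffer, which is usually acceptable for data that must live until the very end of runtime shutdown.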

-Johan

[1] The dtor of Array calls reset, and reset has a bug in rt.util.container.Array. Note the invariant: `assert(!_ptr == !_length);`, which triggers when `_length` is set to 0, but `_ptr` is not set to `null`. !!!
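
The invariant from the footnote can be checked in isolation. The sketch below uses a hypothetical `InvariantDemo` struct with the same field names. It does not reproduce druntime's `reset`; it only shows the state the footnote describes, where the length is cleared but the pointer is not.

```d
// Hypothetical miniature; field names follow the footnote, the type is made up.
struct InvariantDemo
{
    int* _ptr;
    size_t _length;

    // Same condition as the quoted invariant: the pointer is null
    // exactly when the length is zero.
    bool holds() const nothrow @nogc
    {
        return !_ptr == !_length;
    }
}

// Clears only the length, leaving the pointer set.
bool holdsAfterLengthOnlyReset() nothrow @nogc
{
    int slot;
    auto a = InvariantDemo(&slot, 1); // consistent: non-null pointer, length 1
    a._length = 0;                    // pointer is left non-null
    return a.holds();
}
```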
January 10, 2019
On Tuesday, 8 January 2019 at 14:30:24 UTC, Johan Engelen wrote:
> On Tuesday, 8 January 2019 at 12:54:11 UTC, RazvanN wrote:
>> [...]
>
> Great!
> (I am _extremely_ surprised that dtors are not called for globals.)
>
> [...]

Thanks! This is really helpful!

RazvanN
January 10, 2019
On 1/8/19 7:54 AM, RazvanN wrote:
> [...]
>
> static ~this() { _loadedDSOs.__dtor(); }
>
> [...]

That is a thread-local static destructor. Are any shared static destructors accessing the array?

You might be able to determine this by printf debugging between calling unshared and shared destructors.
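
A minimal sketch of that kind of printf tracing, assuming it is placed in a module of the program under test. `trace` and `traceCount` are hypothetical helpers. As far as I know the runtime runs the main thread's thread-local static destructors before the shared ones, but the sequence numbers are there precisely so that this does not have to be assumed.

```d
import core.stdc.stdio : printf;

__gshared uint traceCount; // process-wide sequence counter (hypothetical helper)

// Prints a numbered marker and returns its sequence number.
uint trace(const(char)* phase) nothrow @nogc
{
    printf("[%u] %s\n", ++traceCount, phase);
    return traceCount;
}

// Thread-local static destructor: runs once per thread, at thread termination.
static ~this()
{
    trace("thread-local static ~this");
}

// Shared static destructor: runs once per process, at runtime shutdown.
shared static ~this()
{
    trace("shared static ~this");
}
```

A crash that appears between the two markers would point at code running between the two destructor phases; a crash after both markers points at later shutdown code.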

-Steve
January 10, 2019
On Thursday, 10 January 2019 at 15:04:25 UTC, Steven Schveighoffer wrote:
> On 1/8/19 7:54 AM, RazvanN wrote:
>> [...]
>
> That is a thread-local static destructor. Are any shared static destructors accessing the array?
>
No, there aren't. Indeed, the problem is as Johan said: `_loadedDSOs` should not be wrapped in an array with a destructor, because it is destroyed manually.
> You might be able to determine this by printf debugging between calling unshared and shared destructors.
>
> -Steve

January 10, 2019
On 1/10/19 5:12 PM, RazvanN wrote:
> On Thursday, 10 January 2019 at 15:04:25 UTC, Steven Schveighoffer wrote:
>> On 1/8/19 7:54 AM, RazvanN wrote:
>>> [...]
>>
>> That is a thread-local static destructor. Are any shared static destructors accessing the array?
>>
> No, there aren't. Indeed, the problem is as Johan said: `_loadedDSOs` should not be wrapped in an array with a destructor, because it is destroyed manually.

Hm... is this a sign of how things will be once the (necessary IMO) change to destroying globals is deployed?

-Steve
January 11, 2019
On Thursday, 10 January 2019 at 23:04:37 UTC, Steven Schveighoffer wrote:
> On 1/10/19 5:12 PM, RazvanN wrote:
>> On Thursday, 10 January 2019 at 15:04:25 UTC, Steven Schveighoffer wrote:
>>> On 1/8/19 7:54 AM, RazvanN wrote:
>>>> [...]
>>>
>>> That is a thread-local static destructor. Are any shared static destructors accessing the array?
>>>
>> No, there aren't. Indeed, the problem is as Johan said: `_loadedDSOs` should not be wrapped in an array with a destructor, because it is destroyed manually.
>
> Hm... is this a sign of how things will be once the (necessary IMO) change to destroying globals is deployed?
>
> -Steve

At least for this specific situation, yes.