Thread overview
[D-runtime] Trouble with thread attaching/detaching on Linux
Jan 20, 2012
Alex
Jan 20, 2012
Sean Kelly
Jan 20, 2012
Martin Nowak
January 20, 2012
Hi,

(Sorry if this is posted in the wrong place.)

Consider the following code:

import core.sys.posix.pthread,
       core.memory,
       core.thread,
       std.stdio;

extern (C) void rt_moduleTlsCtor();
extern (C) void rt_moduleTlsDtor();

__gshared pthread_key_t key;

static this() { writefln("TLS ctor"); }
static ~this() { writefln("TLS dtor"); }

static extern(C) void* threadMain(void *arg)
{
    pthread_setspecific(key, cast(void*)0xbadc0de);

    thread_attachThis();
    rt_moduleTlsCtor();

    writefln("Hello from thread");

    return null;
}

private static extern (C) void threadExit(void *cd)
{
    writefln("Bye from thread");

    GC.disable();
    thread_detachThis();
    rt_moduleTlsDtor();
    GC.enable();

    pthread_setspecific(key, null);
}

int main()
{
    pthread_t thread;
    pthread_key_create(&key, &threadExit);
    pthread_create(&thread, null, &threadMain, null);
    pthread_join(thread, null);

    return 0;
}

Running it gives:

$ rdmd test.d
TLS ctor
TLS ctor
Hello from thread
Bye from thread
Segmentation fault

Valgrind says:

$ valgrind ./test
==3145== Memcheck, a memory error detector
==3145== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==3145== Using Valgrind-3.6.1-Debian and LibVEX; rerun with -h for copyright info
==3145== Command: ./test
==3145==
TLS ctor
TLS ctor
Hello from thread
Bye from thread
==3145== Thread 2:
==3145== Invalid read of size 8
==3145==    at 0x452C72: _D4core6thread6Thread6removeFPS4core6thread6Thread7ContextZv (in /home/zor/test)
==3145==    by 0x449F8A: _D4core6thread6Thread6removeFC4core6thread6ThreadZv (in /home/zor/test)
==3145==    by 0x44A106: thread_detachThis (in /home/zor/test)
==3145==    by 0x44226E: threadExit (in /home/zor/test)
==3145==    by 0x4E36CE2: __nptl_deallocate_tsd (pthread_create.c:155)
==3145==    by 0x4E36F09: start_thread (pthread_create.c:311)
==3145==    by 0x533589C: clone (clone.S:112)
==3145==  Address 0x78 is not stack'd, malloc'd or (recently) free'd
==3145==
==3145==
==3145== Process terminating with default action of signal 11 (SIGSEGV)
==3145==  Access not within mapped region at address 0x78
==3145==    at 0x452C72: _D4core6thread6Thread6removeFPS4core6thread6Thread7ContextZv (in /home/zor/test)
==3145==    by 0x449F8A: _D4core6thread6Thread6removeFC4core6thread6ThreadZv (in /home/zor/test)
==3145==    by 0x44A106: thread_detachThis (in /home/zor/test)
==3145==    by 0x44226E: threadExit (in /home/zor/test)
==3145==    by 0x4E36CE2: __nptl_deallocate_tsd (pthread_create.c:155)
==3145==    by 0x4E36F09: start_thread (pthread_create.c:311)
==3145==    by 0x533589C: clone (clone.S:112)
==3145==  If you believe this happened as a result of a stack
==3145==  overflow in your program's main thread (unlikely but
==3145==  possible), you can try to increase the size of the
==3145==  main thread stack using the --main-stacksize= flag.
==3145==  The main thread stack size used in this run was 8388608.
==3145==
==3145== HEAP SUMMARY:
==3145==     in use at exit: 50,752 bytes in 16 blocks
==3145==   total heap usage: 23 allocs, 7 frees, 52,811 bytes allocated
==3145==
==3145== LEAK SUMMARY:
==3145==    definitely lost: 0 bytes in 0 blocks
==3145==    indirectly lost: 0 bytes in 0 blocks
==3145==      possibly lost: 288 bytes in 1 blocks
==3145==    still reachable: 50,464 bytes in 15 blocks
==3145==         suppressed: 0 bytes in 0 blocks
==3145== Rerun with --leak-check=full to see details of leaked memory
==3145==
==3145== For counts of detected and suppressed errors, rerun with: -v
==3145== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
Killed


Does anyone have a clue what's happening here? Am I doing something very wrong, or might there be a bug in druntime?

Thanks,
Alex
January 20, 2012
This looks roughly correct.  I hadn't thought about adding calls to TlsCtor and TlsDtor, though perhaps I should.  Comments inline.

On Jan 20, 2012, at 6:20 AM, Alex wrote:

> Hi,
> 
> (Sorry if this is posted in the wrong place.)
> 
> Consider the following code:
> 
> import core.sys.posix.pthread,
>       core.memory,
>       core.thread,
>       std.stdio;
> 
> extern (C) void rt_moduleTlsCtor();
> extern (C) void rt_moduleTlsDtor();
> 
> __gshared pthread_key_t key;
> 
> static this() { writefln("TLS ctor"); }
> static ~this() { writefln("TLS dtor"); }
> 
> static extern(C) void* threadMain(void *arg)
> {
>    pthread_setspecific(key, cast(void*)0xbadc0de);
> 
>    thread_attachThis();
>    rt_moduleTlsCtor();
> 
>    writefln("Hello from thread");
> 
>    return null;
> }
> 
> private static extern (C) void threadExit(void *cd)
> {
>    writefln("Bye from thread");
> 
>    GC.disable();
>    thread_detachThis();
>    rt_moduleTlsDtor();

You might want to reverse the detach and TlsDtor calls here, since it's possible that static dtors might want access to the current Thread object.
January 20, 2012
>     thread_detachThis();
>     rt_moduleTlsDtor();
Destruction order should always be the reverse of construction order.
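
For reference, here is a sketch of the test program with the teardown order reversed as both replies suggest, so that rt_moduleTlsDtor() runs while the thread is still attached and thread_detachThis() comes last. It is the original code with only those two calls swapped. Whether this alone makes the segfault in Thread.remove go away is not established anywhere in this thread, so treat it as the suggested ordering rather than a confirmed fix.

```d
import core.sys.posix.pthread,
       core.memory,
       core.thread,
       std.stdio;

// Runtime hooks that run thread-local static ctors/dtors for the calling thread.
extern (C) void rt_moduleTlsCtor();
extern (C) void rt_moduleTlsDtor();

__gshared pthread_key_t key;

static this() { writefln("TLS ctor"); }
static ~this() { writefln("TLS dtor"); }

static extern (C) void* threadMain(void* arg)
{
    // A non-null value makes pthreads call the key destructor at thread exit.
    pthread_setspecific(key, cast(void*) 0xbadc0de);

    // Construction order: attach first, then run the TLS ctors.
    thread_attachThis();
    rt_moduleTlsCtor();

    writefln("Hello from thread");

    return null;
}

private static extern (C) void threadExit(void* cd)
{
    writefln("Bye from thread");

    GC.disable();
    // Destruction order mirrors construction: TLS dtors run first, while the
    // Thread object still exists, and the detach happens last.
    rt_moduleTlsDtor();
    thread_detachThis();
    GC.enable();

    pthread_setspecific(key, null);
}

int main()
{
    pthread_t thread;
    pthread_key_create(&key, &threadExit);
    pthread_create(&thread, null, &threadMain, null);
    pthread_join(thread, null);

    return 0;
}
```

With this ordering a static destructor that calls Thread.getThis() still sees the attached thread, which is the case Sean's comment is concerned about.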