Thread overview
Druntime and non-D threads
Dec 08, 2017
Ali Çehreli
Dec 08, 2017
Kagamin
Dec 11, 2017
Ali Çehreli
Dec 08, 2017
Nemanja Boric
Dec 11, 2017
Ali Çehreli
Dec 11, 2017
Mengu
Dec 12, 2017
Ali Çehreli
Dec 08, 2017
Guillaume Piolat
Dec 11, 2017
Ali Çehreli
Dec 12, 2017
Joakim
Dec 12, 2017
Ali Çehreli
December 08, 2017
I'm trying to use D as a library to be called from a non-D environment, e.g. the Java runtime. If I'm not mistaken, it's quite difficult and perhaps impossible to use the GC in such a scenario. It works as long as attached threads don't go away, either by themselves or through thread_detachThis.

My setup is Linux (Ubuntu-based), dmd 2.077.1, 64-bit build. D is used in a shared library that is called by non-D threads. (Tested with C and Java.)

1) The following newsgroup topic is about calling thread_attachThis() for threads created outside of D:

  http://forum.dlang.org/post/ounui4$171a$1@digitalmars.com

As suggested in that thread, I think I have to call thread_detachThis, but I'm not sure when that can be safely done. One idea was to attach and detach in every API function, something to the effect of:

import core.thread : thread_attachThis, thread_detachThis;

extern(C) void my_api_func() {
    thread_attachThis();
    scope(exit) thread_detachThis();

    // Do work, potentially producing garbage...
}

Does that make sense? Wouldn't garbage produced by that thread be leaked after detaching? However, failing to detach would be bad as well, as the calling thread can terminate without our knowledge. (More on that below.)

2) Obviously, Runtime.initialize() must be called for Druntime to work at all. Question: Is the thread that calls Runtime.initialize() special compared to the other threads? Can this thread disappear and the Druntime still work?

3) An attached non-D thread can exit without any notice (gracefully or otherwise) while it's still attached to D's GC, causing segmentation faults or deadlock.

I failed to find a way for Druntime to be resilient when such threads disappear. For example, the cleanup handler registered in thread.d is called only for threads that are cancelled or that call pthread_exit(), not for the ones that exit simply by returning from their thread functions. (This is according to the cleanup handler spec.)

4) Druntime uses pthread_kill to signal threads to suspend (and resume). However, a successful return from this function does not mean that the thread will respond to that signal. So, we have a couple of bugs in Druntime, as the number of sem_wait() calls we make depends on the unreliable return value of pthread_kill. Perhaps that's the reason for bugs like the following:

  https://issues.dlang.org/show_bug.cgi?id=15939

I don't see a way out of this POSIX limitation. (pthread_key_create may help as a "thread destructor" but I haven't played with it yet. thread.d beat me up pretty bad for more than two days; I'm too tired to do anything else right now. :) )

5) We depend on SIGUSR1 (and SIGUSR2, which may not be necessary but it's a different topic) to suspend non-D threads. Does that work with all threads? What if the calling framework has other uses for those signals? Would we be interfering with them?

So, what are the rules of using D as a library for a non-D framework? I have the following so far but I'm not sure on all points:

- SURE: One thread must make a call to Runtime.initialize()

- SURE: Every D api call must call thread_attachThis

- SURE: Attached threads must *not* terminate gracefully, due to error, or by cancellation. (As there is no way of guaranteeing this in POSIX, I think using D as a library in a framework is best-effort at best.)

- NOT SURE: thread_detachThis must *not* be called as the thread may have uncollected garbage.

- NOT SURE: SIGUSR1 and SIGUSR2 should be available.
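For the first rule, here is a minimal sketch of what an exported init/teardown pair could look like. The names mylib_init and mylib_fini are made up for illustration; Runtime.initialize() and Runtime.terminate() are the core.runtime calls. As far as I can tell they are reference counted, so balanced pairs should be fine even if the runtime is already up.

```d
import core.runtime : Runtime;

// Hypothetical entry point: the host calls this once before any other
// function of the library. Returns 0 on success, -1 on failure.
extern(C) int mylib_init() {
    return Runtime.initialize() ? 0 : -1;
}

// Hypothetical teardown: the host calls this once when it is done with
// the library. Returns 0 on success, -1 on failure.
extern(C) int mylib_fini() {
    return Runtime.terminate() ? 0 : -1;
}
```

The host (C, Java through JNI, etc.) would call mylib_init() before the first API call and mylib_fini() after the last one.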

Ali
December 08, 2017
You can create a D thread and send requests to it.
December 08, 2017
On Friday, 8 December 2017 at 09:33:03 UTC, Ali Çehreli wrote:
> 5) We depend on SIGUSR1 (and SIGUSR2, which may not be necessary but it's a different topic) to suspend non-D threads. Does that work with all threads? What if the calling framework has other uses for those signals? Would we be interfering with them?
>

As signal handlers are set up per process, having the non-D threads set up `SIGUSR1/2` will probably screw the entire GC, not just for these threads. I feel you must ensure that the non-D threads don't try to set up these handlers after `rt_init` (which in turn calls `thread_init`) is called, otherwise you're screwed. This also holds in reverse: you shouldn't use SIGUSR1/2 in non-D threads, since after calling `rt_init` the signal handlers will be replaced with druntime's.
December 08, 2017
On Friday, 8 December 2017 at 09:33:03 UTC, Ali Çehreli wrote:
> One idea was to attach and detach in every api function something to the effect of
>
> extern(C) my_api_func() {
>     thread_attachThis();
>     scope(exit) thread_detachThis();
>
>     // Do work, potentially producing garbage...
> }
>
> Does that make sense? Wouldn't garbage produced by that thread leaked after detaching?

It makes sense.
AFAIK, detaching a thread deregisters its stack (the locals holding GC objects). If the data was referenced only by that thread's stack and this range is removed, why couldn't the data be reclaimed?

I don't know what this implies for TLS roots though.

> 2) Obviously, Runtime.initialize() must be called for Druntime to work at all. Question: Is the thread that calls Runtime.initialize() special compared to the other threads? Can this thread disappear and the Druntime still work?

Don't know for sure, but I believe it's not a special thread and can disappear.


> 3) An attached non-D thread can exit without any notice (gracefully or otherwise) while it's still attached to D's GC, causing segmentation faults or deadlock.

Isn't this an unrecoverable error?
  - either you failed to deregister a thread that got killed outside your dynlib
  - or you killed it while it was attached

> So, what are the rules of using D as a library for a non-D framework? I have the following so far but I'm not sure on all points:
>
> - SURE: One thread must make a call to Runtime.initialize()
>
> - SURE: Every D api call must call thread_attachThis

I advise making an RAII struct that deals with this and putting it in every externally accessible callback. Runtime finalization needs a special place though => harder.

The remaining problem is races; interlocked singleton initialization without the runtime is a mystery to me.
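A minimal sketch of such an RAII struct, under a few assumptions: the name AttachGuard is made up, and checking Thread.getThis() first is my own addition, so that a thread that is already attached (a D thread, or a nested API call) is not detached by accident. It does not address the runtime-finalization or race questions above.

```d
import core.thread : Thread, thread_attachThis, thread_detachThis;

// Hypothetical guard: attaches the calling thread on entry if it is not
// already known to druntime, and detaches on scope exit only in that case.
struct AttachGuard {
    private bool attachedHere;

    @disable this(this); // copying would lead to a double detach

    static AttachGuard enter() {
        AttachGuard g;
        if (Thread.getThis() is null) {
            thread_attachThis();
            g.attachedHere = true;
        }
        return g;
    }

    ~this() {
        if (attachedHere) thread_detachThis();
    }
}

// Example API function; the body just produces some garbage.
extern(C) int my_api_func(int n) {
    auto guard = AttachGuard.enter();
    if (n <= 0) return 0;

    auto squares = new int[](n); // GC allocation
    foreach (i, ref s; squares) s = cast(int)(i * i);

    int sum = 0;
    foreach (s; squares) sum += s;
    return sum;
}
```

Every exported function would start with the same `auto guard = AttachGuard.enter();` line.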

> - SURE: Attached threads must *not* terminate gracefully, due to error, or by cancellation. (As there is no way of guaranteeing this in POSIX, I think using D as a library in a framework is best-effort at best.)

Since you get those threads from the outside it's certainly impolite to terminate them.

> - NOT SURE: thread_detachThis must *not* be called as the thread may have uncollected garbage.

IMHO thread_detachThis *must* be called at entry-point exit.
Detach these threads at scope(exit), and avoid sorrow and call stacks with pthread_kill inside.


December 11, 2017
On 12/08/2017 02:53 AM, Kagamin wrote:
> You can create a D thread an send request to it.

That's a good idea. Thanks.

Ali

December 11, 2017
On 12/08/2017 04:23 AM, Guillaume Piolat wrote:

>> Every D api call must call thread_attachThis

> I advise to make a RAII struct you will put in any accessible callback, which deals with this

Of course. :) That's how I've been trying to use it.

> IMHO thread_detachThis *must* be called at entry-point exit.
> Detach these threads at scope(exit), and avoid sorrow and call stacks
> with pthread_kill inside.

Agreed. My troubles turned out to be due to a druntime bug, which I think I've managed to fix:

  https://issues.dlang.org/show_bug.cgi?id=18063

Ali

December 11, 2017
On 12/08/2017 02:54 AM, Nemanja Boric wrote:
> On Friday, 8 December 2017 at 09:33:03 UTC, Ali Çehreli wrote:
>> 5) We depend on SIGUSR1 (and SIGUSR2, which may not be necessary but it's a different topic) to suspend non-D threads. Does that work with all threads? What if the calling framework has other uses for those signals? Would we be interfering with them?
>>
> 
> As the signal handlers are setup per-process, having the non-D threads setup `SIGUSR1/2` will probably screw the entire GC, not just for these threads. I feel you must ensure that the non-D threads don't try to setup these handlers after the `rt_init` (which in turns calls `thread_init`) is called, otherwise you're screwed. This is also valid in inverse - you shouldn't use SIGUSR1/2 in non-D threads, since after calling `rt_init` the signal handlers will be replaced with druntime's ones.

So, in cases where D is just a portable library, the only sane thing to do seems to be what Kagamin suggested: create a D thread and send requests to it.

That way, we would be in total control of our threads, making entry-attach/exit-detach calls unnecessary. Agreed?

Ali
December 11, 2017
On Monday, 11 December 2017 at 16:25:42 UTC, Ali Çehreli wrote:
> On 12/08/2017 02:54 AM, Nemanja Boric wrote:
>> [...]
>
> So, in cases where D is just a portable library, the only sane thing to do seems to be what Kagamin suggested: create a D thread and send requests to it.
>
> That way, we would be in total control of our threads, making entry-attach/exit-detach calls unnecessary. Agreed?
>
> Ali

care to explain what exactly that means for the rest of us who are n00bs? :-)

December 11, 2017
On 12/11/2017 08:58 AM, Mengu wrote:
> On Monday, 11 December 2017 at 16:25:42 UTC, Ali Çehreli wrote:
>> On 12/08/2017 02:54 AM, Nemanja Boric wrote:
>>> [...]
>>
>> So, in cases where D is just a portable library, the only sane thing
>> to do seems to be what Kagamin suggested: create a D thread and send
>> requests to it.
>>
>> That way, we would be in total control of our threads, making
>> entry-attach/exit-detach calls unnecessary. Agreed?
>>
>> Ali
>
> care to explain what exactly that means for the rest of us who are
> n00bs? :-)

A recent issue made me spend quite a bit of time in core/thread.d, which improved my understanding of that code. As soon as I feel confident, I would like to write a document about my understanding. (Not necessarily thread.d's implementation but how to use D runtime with non-D threads.)

In the case of a D library that will be called by user threads with unknown attributes (e.g. some detachable threads some not; some joinable threads, some not), it's clear that a D function must attach and detach upon entry and exit to the API function:

import core.thread : thread_attachThis, thread_detachThis;

// D library function, called on a non-D thread:
extern(C) void foo() {
    // Both of these calls involve locks:
    thread_attachThis();
    scope(exit) thread_detachThis();

    // Do D work by freely using the GC ...
}

We have to detach because we don't know whether we will ever be called from the same thread again or even whether the thread is about to terminate or not.

What Kagamin recommended is another way: Create threads in the D code, which obviates attach/detach calls and removes all questions about thread lifetimes. So, the API function could be the following:

// enqueue_task() is a placeholder for whatever hands work to a D thread.
extern(C) void foo() @nogc {
    // Dispatch work to one of the D threads without doing
    // any complicated work here:
    enqueue_task();
}
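To make enqueue_task a bit more concrete, here is a rough sketch of one way it could work: a fixed-size queue guarded by a Mutex and a Condition, drained by a single D thread. Everything here (the names start_worker and stop_worker, the int payload, the queue size, processedSum standing in for real work) is an assumption for illustration, not a tested design. The idea is that the foreign caller only touches preallocated storage, so as far as I can tell it does not need to be attached.

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical queue shared between foreign callers and one D worker.
private __gshared Mutex queueLock;
private __gshared Condition queueReady;
private __gshared int[64] queue;
private __gshared size_t head, count;
private __gshared bool stopping;
private __gshared Thread worker;
__gshared long processedSum; // stands in for the result of real work

// Call once, after Runtime.initialize(), before any enqueue_task().
extern(C) void start_worker() {
    queueLock = new Mutex;
    queueReady = new Condition(queueLock);
    stopping = false;
    worker = new Thread(&workerLoop);
    worker.start();
}

// Called by foreign threads; does no GC allocation itself.
// Returns 0 on success, -1 if the queue is full.
extern(C) int enqueue_task(int payload) {
    queueLock.lock();
    scope(exit) queueLock.unlock();
    if (count == queue.length) return -1;
    queue[(head + count) % queue.length] = payload;
    ++count;
    queueReady.notify();
    return 0;
}

// Drains what is left in the queue, then stops and joins the worker.
extern(C) void stop_worker() {
    queueLock.lock();
    stopping = true;
    queueReady.notify();
    queueLock.unlock();
    worker.join();
}

private void workerLoop() {
    for (;;) {
        int payload;
        {
            queueLock.lock();
            scope(exit) queueLock.unlock();
            while (count == 0 && !stopping) queueReady.wait();
            if (count == 0) return; // stopping and nothing left to do
            payload = queue[head];
            head = (head + 1) % queue.length;
            --count;
        }
        // The worker is a real D thread, so it can use the GC freely here.
        processedSum += payload;
    }
}
```

The host would call start_worker() once after initializing the runtime, then enqueue_task() from any of its threads, and stop_worker() before shutting down.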

Ali

December 12, 2017
On Friday, 8 December 2017 at 09:33:03 UTC, Ali Çehreli wrote:
> I'm trying to use D as a library to be called from a non-D environment e.g. Java runtime. If I'm not mistaken, it's quite difficult and perhaps impossible to use GC in such a scenario. It works as long as attached threads don't go away either by themselves or by thread_detachThis.

I've been doing this for some time, by running all the D stdlib tests in a shared library that's called from Android's Java runtime, no problem with the GC or threads, if I set it up right and with a tweak or two:

https://wiki.dlang.org/Build_D_for_Android#Changes_for_Android

However, I go the other way and call Java methods from D, so it does depend on whether the process running the D shared library is long-running or not, as I've had issues when a D function or two are called periodically from a Java app instead.

> My setup is Linux (Ubuntu-based), dmd 2.077.1, 64-bit build. D is used in a shared library that is called by non-D threads. (Tested with C and Java.)
>
> 1) The following newsgroup topic is about calling thread_attachThis() for threads created outside of D:
>
>   http://forum.dlang.org/post/ounui4$171a$1@digitalmars.com
>
> As suggested in that thread, I think I have to call thread_detachThis but I'm not sure when that can be safely done. One idea was to attach and detach in every api function something to the effect of
>
> extern(C) my_api_func() {
>     thread_attachThis();
>     scope(exit) thread_detachThis();
>
>     // Do work, potentially producing garbage...
> }
>
> Does that make sense? Wouldn't garbage produced by that thread leaked after detaching? However, failing to detach would be bad as well as the calling thread can terminate without our knowledge. (More on that below.)
>
> 2) Obviously, Runtime.initialize() must be called for Druntime to work at all. Question: Is the thread that calls Runtime.initialize() special compared to the other threads? Can this thread disappear and the Druntime still work?
>
> 3) An attached non-D thread can exit without any notice (gracefully or otherwise) while it's still attached to D's GC, causing segmentation faults or deadlock.
>
> I failed to find a way for Druntime to be resilient when such threads disappear. For example, the registered cleanup handler in thread.d is called only for cancelled threads, not the ones that exit simply by returning from their thread functions. (This is according to cleanup handler spec.)

I haven't had to try all these thread registration methods, perhaps because the apps I'm testing are much simpler or because I'm going the other way from D to Java most of the time.

> 4) Druntime uses pthread_kill to signal threads to suspend (and resume) threads. However, successful return of this function does not mean that the thread will respond to that signal. So, we have a couple of bugs in Druntime as the number of sem_wait() calls we make depends on the unreliable return value of pthread_kill. Perhaps that's the reason for bugs like the following:
>
>   https://issues.dlang.org/show_bug.cgi?id=15939
>
> I don't see a way out of this POSIX limitation. (pthread_key_create may help as a "thread destructor" but I haven't played with it yet. thread.d beat me up pretty bad for more than two days; I'm too tired to do anything else right now. :) )
>
> 5) We depend on SIGUSR1 (and SIGUSR2, which may not be necessary but it's a different topic) to suspend non-D threads. Does that work with all threads? What if the calling framework has other uses for those signals? Would we be interfering with them?

Those signals are used for D threads and should work fine unless they're being intercepted somewhere, as they are by the Android runtime. However, you can always change the signals used, as I did by swapping them on Android, and as others are trying to do for other reasons:

https://github.com/dlang/druntime/pull/1851#discussion_r123886260
https://github.com/dlang/druntime/pull/1565

> So, what are the rules of using D as a library for a non-D framework? I have the following so far but I'm not sure on all points:
>
> - SURE: One thread must make a call to Runtime.initialize()
>
> - SURE: Every D api call must call thread_attachThis
>
> - SURE: Attached threads must *not* terminate gracefully, due to error, or by cancellation. (As there is no way of guaranteeing this in POSIX, I think using D as a library in a framework is best-effort at best.)
>
> - NOT SURE: thread_detachThis must *not* be called as the thread may have uncollected garbage.
>
> - NOT SURE: SIGUSR1 and SIGUSR2 should be available.

I have tried to avoid all these problems by having the D shared library be the starting point of the app and calling Java functions occasionally instead, so I haven't delved into all this.