Calling D library from other languages on linux using foreign threads
March 22, 2019
I've searched a lot, and it should be working at least on Linux, but apparently it is not, or I'm doing something totally wrong.

Our use case is to call a shared D library from C# (.NET Core) from different threads.

From what I've read, a foreign thread should be registered with `thread_attachThis()`.

But even with this, there is trouble when the GC runs in that thread, which ends with:

```
Thread 3 "main" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7aeb700 (LWP 5850)]
0x000000000045bed7 in _D4core6thread15scanAllTypeImplFNbMDFNbEQBmQBk8ScanTypePvQcZvQgZv ()
(gdb) bt
#0  0x000000000045bed7 in _D4core6thread15scanAllTypeImplFNbMDFNbEQBmQBk8ScanTypePvQcZvQgZv ()
#1  0x000000000045be7f in _D4core6thread18thread_scanAllTypeUNbMDFNbEQBpQBn8ScanTypePvQcZvZ__T9__lambda2TQvZQoMFNbQBeZv ()
#2  0x000000000044dc10 in core.thread.callWithStackShell(scope void(void*) nothrow delegate) ()
#3  0x000000000045be56 in thread_scanAllType ()
#4  0x00000000004599aa in thread_scanAll ()
#5  0x00000000004582a9 in _D2gc4impl12conservativeQw3Gcx__T7markAllS_DQBqQBqQBoQBzQBe16markConservativeMFNbNlPvQcZvZQCfMFNbbZv ()
#6  0x0000000000453f1d in _D2gc4impl12conservativeQw3Gcx11fullcollectMFNbbZm ()
#7  0x000000000045732f in _D2gc4impl12conservativeQw14ConservativeGC__T9runLockedS_DQCeQCeQCcQCnQBs11fullCollectMFNbZ2goFNbPSQDtQDtQDrQEc3GcxZmTQvZQCyMFNbKQBgZm ()
#8  0x000000000045167d in _D2gc4impl12conservativeQw14ConservativeGC11fullCollectMFNbZm ()
#9  0x000000000045165e in _D2gc4impl12conservativeQw14ConservativeGC7collectMFNbZv ()
#10 0x000000000042f2bd in gc_collect ()
#11 0x000000000042ca11 in core.memory.GC.collect() ()
#12 0x000000000042c8a6 in entry_point2 (_param_0=0x0) at worker.d:36
#13 0x000000000042c62e in threadFun (arg=0x42c864 <entry_point2>) at main.d:24
#14 0x00007ffff7f6e58e in start_thread () from /usr/lib64/libpthread.so.0
#15 0x00007ffff7cee6a3 in clone () from /usr/lib64/libc.so.6

```

I then compiled Phobos with debug symbols and ended up with this:

```
Thread 3 "main" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7aeb700 (LWP 11470)]
0x000000000043d7e3 in invariant._d_invariant(Object) (o=0x7ffff6eeb000) at src/rt/invariant.d:27
27	    c = typeid(o);
(gdb) bt
#0  0x000000000043d7e3 in invariant._d_invariant(Object) (o=0x7ffff6eeb000) at src/rt/invariant.d:27
#1  0x000000000043927a in _D4core6thread6Thread6removeFNbNiCQBgQBeQBaZv (t=0x7ffff6eeb000) at src/core/thread.d:1864
#2  0x000000000043947b in thread_detachThis () at src/core/thread.d:2263
#3  0x00000000004378b0 in entry_point2 (_param_0=0x0) at worker.d:39
#4  0x000000000043762e in threadFun (arg=0x437864 <entry_point2>) at main.d:24
#5  0x00007ffff7f6e58e in start_thread () from /usr/lib64/libpthread.so.0
#6  0x00007ffff7cee6a3 in clone () from /usr/lib64/libc.so.6
```

So the crash actually happens after the manual `GC.collect()` call in the test method, inside `thread_detachThis()`, which is really strange.

I've tested this with dmd-2.085.0 because of this fix: https://github.com/dlang/druntime/commit/f60eb358ccbc14a1a5fc1774eab505ed0132e999 which seemed to be exactly what is needed for this to work.

I've even created a github repo with my tests using 4 variations:

* static lib from D
* dynamic lib from D
* static lib from C
* dynamic lib from C

all using pthread as a foreign thread on linux.

You can see it here: https://github.com/tchaloupka/dlangsharedlib

All tests fail with the same result.
The C tests differ only in that they explicitly call `rt_init()` to initialize DRuntime. They should be the same otherwise.

I wonder where the problem is.
Is the GC supposed to work with foreign threads once they are registered?
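
For reference, the explicit runtime initialization that the C tests perform could also be wrapped on the D side, so that the host only sees two plain C functions. This is a minimal sketch, not code from the linked repository: the names `mylib_init` and `mylib_term` are hypothetical, and it only covers starting and stopping DRuntime, not the foreign-thread problem itself.

```D
import core.runtime : Runtime;

// Hypothetical export: the host calls this once before any other library function.
extern (C) int mylib_init()
{
	// Runtime.initialize() returns true when DRuntime is up
	return Runtime.initialize() ? 0 : -1;
}

// Hypothetical export: the host calls this once when it is done with the library.
extern (C) int mylib_term()
{
	return Runtime.terminate() ? 0 : -1;
}
```

A host would call `mylib_init()` before the first call into the library and `mylib_term()` after the last one, checking for a zero return value in both cases.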
March 23, 2019
On Friday, 22 March 2019 at 19:34:14 UTC, tchaloupka wrote:
> I've searched a lot at it should be working at least on linux, but apparently is not or I'm doing something totally wrong..
>
> [...]

I just noticed another issue which might be related, though I doubt it: https://issues.dlang.org/show_bug.cgi?id=18815

Just to make sure, could you test it with dmd 2.078?

Two years ago, I think, I got calling a D DLL from C# working (on Windows). It should work in general.

Kind regards
Andre
March 23, 2019
On Saturday, 23 March 2019 at 09:47:55 UTC, Andre Pany wrote:
> On Friday, 22 March 2019 at 19:34:14 UTC, tchaloupka wrote:
> Just to make sure, could you test it with dmd 2.078?

Actually, when I remove the explicit GC call within the unregistered thread (which is fixed in https://github.com/dlang/druntime/commit/42b4e0a9614ac794d4549ed5b2455fd0f805e123), it works with dmd-2.078.1 but not with 2.079.1.

Well, almost: sometimes it just hangs for me on:

```
#0  0x00007fd5887d48ee in sigsuspend () from /usr/lib64/libc.so.6
#1  0x00000000004526ec in core.thread.thread_suspendHandler(int).op(void*) ()
#2  0x000000000045274c in core.thread.callWithStackShell(scope void(void*) nothrow delegate) ()
#3  0x0000000000452679 in thread_suspendHandler ()
#4  <signal handler called>
#5  0x00007fd588b1aacb in __pthread_timedjoin_ex () from /usr/lib64/libpthread.so.0
#6  0x00000000004277ad in D main () at main.d:89
```

So the thread is unregistered with:
```
rt_moduleTlsDtor();
thread_detachThis();
```

But it hangs on `pthread_join`. Isn't it being suspended by the GC?

Manu's https://issues.dlang.org/show_bug.cgi?id=18815 seems to be related, but for me it passes `thread_attachThis()` fine and then crashes in the GC during the work.
March 23, 2019
On 03/22/2019 12:34 PM, tchaloupka wrote:
> I've searched a lot at it should be working at least on linux, but
> apparently is not or I'm doing something totally wrong..
>
> Our use case is to call shared D library from C# (.Net Core) and from
> different threads.

We needed to do the same from Java. I opened this discussion:

  https://forum.dlang.org/thread/p0dm8f$ij5$1@digitalmars.com

and abandoned this pull request:

  https://github.com/dlang/druntime/pull/1989

We didn't need to pursue this further because the person who was pushing for D was leaving the company, so he rewrote the library in C++ before doing so.

I don't think it's possible to call into D's GC from another thread in a safe way. If I remember correctly, there is no absolute way in POSIX (or just Linux?) of knowing that a foreign thread has died. D's GC would be holding on to an already dead thread.

Ali

March 23, 2019
On Saturday, 23 March 2019 at 15:28:34 UTC, Ali Çehreli wrote:
> On 03/22/2019 12:34 PM, tchaloupka wrote:
> > I've searched a lot at it should be working at least on linux, but
> > apparently is not or I'm doing something totally wrong..
> >
> > Our use case is to call shared D library from C# (.Net Core) and from
> > different threads.
>
> We needed to do the same from Java. I opened this discussion:
>
>   https://forum.dlang.org/thread/p0dm8f$ij5$1@digitalmars.com
>
> and abandoned this pull request:
>
>   https://github.com/dlang/druntime/pull/1989
>
> We didn't need to pursue this further because the person who was pushing for D was leaving the company, so he rewrote the library in C++ before doing so.
>
> I don't think it's possible to call into D's GC from another thread in a safe way. If I remember correctly, there is no absolute way in POSIX (or just Linux?) of knowing that a foreign thread has died. D's GC would be holding on to an already dead thread.
>
> Ali

That's pretty unfortunate.
I know about your thread and PR; that's why I've tried to solve this by calling `thread_attachThis()`/`thread_detachThis()` in every worker function call.

But that doesn't work with compilers newer than dmd-2.078.1.

Actually after fiddling with this some more, I've discovered that when I call this method:

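```D
import core.memory : GC;
import core.stdc.stdio : printf;
import core.thread : thread_attachThis, thread_detachThis, thread_isMainThread;

// druntime's module TLS hooks, declared by hand so the snippet is self-contained
extern (C) void rt_moduleTlsCtor();
extern (C) void rt_moduleTlsDtor();

void* entry_point2(void*)
{
	printf("+entry_point2\n");
	scope (exit) printf("-entry_point2\n");

	// This thread gets registered in druntime, does some work and gets
	// unregistered to be cleaned up manually
	if (!thread_isMainThread()) // thread_attachThis will hang otherwise
	{
		printf("+entry_point2 - thread_attachThis()\n");
		thread_attachThis();
		rt_moduleTlsCtor();
	}

	// simulate GC work
	auto x = new int[100];
	GC.collect();

	if (!thread_isMainThread())
	{
		printf("+entry_point2 - thread_detachThis()\n");
		rt_moduleTlsDtor();
		thread_detachThis();
	}
	return null;
}
```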

from the main thread, it works too. I've observed that:

* `auto x = new int[100];` needs to be there, as it doesn't work otherwise
* but it works only with some array lengths (I guess due to some GC decision about when to kick in), so it is unreliable
* it still hangs sometimes on `thread_detachThis()`

I have no idea what should be done with C's main thread, as it can't be attached because it'll hang.
In one of the forum threads I've also read the idea of not using foreign threads with the GC at all, but somehow delegating their work to a D thread.
Can this work somehow? `new Thread()` would still have to be called from the C side, or from `static this()`, but that still runs on C's thread.

March 25, 2019
On Saturday, 23 March 2019 at 17:33:31 UTC, tchaloupka wrote:
> I've no idea what should be done with C's main thread as it can't be attached because it'll hang.
> In one of forum threads I've also read the idea to not using foreign threads with GC but somehow delegate their work to D's thread.
> Can this work somehow? But `new Thread()` would still be needed to call from C side or static this() but from C's thread.

I've tried this approach here: https://github.com/tchaloupka/dlangsharedlib/tree/master/workaround.

I must say, it's an ugly and error-prone hack. :(

It seems to be working, but questions remain:

* is it OK to initialize the D runtime and then use the GC in the same C thread (to create the worker Thread)?
* is this reliable?
* should foreign threads work with D's runtime at all?

In the current state, D libraries seem pretty useless for use from other languages if the GC is needed.
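
For readers who want to see the shape of the delegation idea, here is a minimal sketch. It is not the code from the linked workaround directory. All names (`worker_start`, `worker_call`, `worker_stop`, the single-slot mailbox) are hypothetical, the job itself is a placeholder, and it assumes `worker_start()` is called on the thread that initialized the runtime, which is exactly the open question above. Foreign threads only lock a mutex and copy integers; every GC allocation happens on the thread created by druntime.

```D
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// All names are hypothetical; __gshared keeps the state out of TLS.
__gshared Mutex guard;
__gshared Condition changed;
__gshared Thread worker;
__gshared int request, response;
__gshared int state;      // 0 = idle, 1 = request pending, 2 = response ready
__gshared bool stopping;

// Runs on a thread created by druntime, so GC allocation is fine here.
void workerLoop()
{
	for (;;)
	{
		guard.lock();
		while (state != 1 && !stopping)
			changed.wait();
		if (stopping)
		{
			guard.unlock();
			return;
		}
		immutable n = request;
		guard.unlock();

		// Placeholder job: allocate with the GC and sum 0 .. n-1 (n must be >= 0).
		auto buf = new int[n];
		int sum = 0;
		foreach (i, ref v; buf)
		{
			v = cast(int) i;
			sum += v;
		}

		guard.lock();
		response = sum;
		state = 2;
		changed.notifyAll();
		guard.unlock();
	}
}

// Assumed to be called once, on the thread that initialized the runtime.
extern (C) void worker_start()
{
	guard = new Mutex;
	changed = new Condition(guard);
	worker = new Thread(&workerLoop);
	worker.start();
}

// Callable from any foreign thread: no attach, no GC allocation on this path.
extern (C) int worker_call(int input)
{
	guard.lock();
	scope (exit) guard.unlock();
	while (state != 0)        // wait until the mailbox is free
		changed.wait();
	request = input;
	state = 1;
	changed.notifyAll();
	while (state != 2)        // wait for the worker's answer
		changed.wait();
	immutable result = response;
	state = 0;
	changed.notifyAll();
	return result;
}

// Shutdown while callers are still waiting is not handled in this sketch.
extern (C) void worker_stop()
{
	guard.lock();
	stopping = true;
	changed.notifyAll();
	guard.unlock();
	worker.join();
}
```

The single mailbox serializes all callers, which keeps the sketch short; a real library would more likely use a queue and pass results back through caller-owned memory rather than GC memory.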