Leave GC collection to the user of the D library?
May 08, 2021
tl;dr I am scared of non-D programs calling my D library functions from foreign threads. So, I am planning on asking the user to trigger collection themselves by calling a collection function of my library. Crazy?

I've had serious issues bringing up a D library in a foreign environment: Python modules loading a C++ library, which in turn uses our D library. There were segmentation failures when loading this .so.

One workaround was to start the GC in disabled state with the following global variable defined in the library.

  extern(C) __gshared string[] rt_options = [ "gcopt=disable:1" ];

After that, the GC is enabled inside the library's initialization function with the following command.

    GC.enable();

That workaround seemed to be sufficient to load the library successfully. Unfortunately, that was not enough to weed out all issues related to libraries because this library itself loads other D libraries. All of this caused sporadic issues. (My brain is too fried to even remember what was a cause, what was a usable workaround, etc. Sometimes I wasted days chasing a solution while using a test, which had nothing to do with the solution. I would change the code, test, no go; repeat, no go. It turns out, my test was unrelated. Argh!)
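The workaround described above can be put together roughly as follows. This is only a sketch: `mylib_init` and the `gcEnabledByInit` guard are hypothetical names, and it assumes the D runtime has already been initialized and that the initialization function is called from a single thread.

```d
import core.memory : GC;

// Start the GC in the disabled state, as described in the post.
extern(C) __gshared string[] rt_options = [ "gcopt=disable:1" ];

// Hypothetical guard: GC.enable() is meant to be paired with one earlier
// disable, so it is called at most once here. Not thread-safe.
__gshared bool gcEnabledByInit = false;

/// Hypothetical name for the library's explicit initialization function.
extern(C) void mylib_init()
{
    if (!gcEnabledByInit)
    {
        GC.enable();            // undo the "disable:1" from rt_options
        gcEnabledByInit = true;
    }
}
```

The guard is there because a second call to the initialization function would otherwise call GC.enable() without a matching disable.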

So, we came up with a drastic solution: Since all this code works just fine in a pure D environment, make the library as thin as possible; the library starts a daemon that is written in D with all the functionality. The library merely dispatches the requests to that daemon.

The library starts the daemon with pipeProcess(); pipes are used for dispatching requests and shared memory is used for large data. This idea "worked like a charm." Phew!

However, dispatching of the requests to the daemon is performed by a single library thread in a blocked manner: When a request is written to the pipe, the response is read back (blocked) and the result is returned to the user of the library function.

So now we want to use this functionality from multiple threads. Yikes! I had so much trouble with foreign threads calling D libraries in the past that I get scared. (In one case it was Java threads.) There are so many dimensions to play with that hypothesizing a correct solution has been exhausting. I was never sure whether the issues were, e.g., with thread_attachThis() or with my misuse of it.

Ok... How about this idea that would allow this library to be used from multiple threads: Leave the GC disabled with that 'rt_options' variable above and don't enable it in the library initialization function (this is not init(); rather, a function that the user calls explicitly). Instead, add yet another library API function for collecting garbage. I can document that no other thread is allowed to call any other function of the library when this collection function is called. They can do this either at strategic points that they know no other thread is using the library or they can use a mutex.

Another trivial function that I add can relay GC stats to the user so that they can decide to call the GC if the allocations have been high enough.
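A minimal sketch of the two API functions proposed above, assuming the GC was left disabled through 'rt_options'. The names `mylib_gc_used_bytes` and `mylib_gc_collect` are hypothetical. As far as I know, an explicit GC.collect() still runs a collection while automatic collections are disabled, which is what this idea relies on.

```d
import core.memory : GC;

/// Hypothetical: reports the bytes currently in use on the GC heap so that
/// the caller can decide whether a collection is worth triggering.
extern(C) size_t mylib_gc_used_bytes() nothrow
{
    return GC.stats().usedSize;
}

/// Hypothetical: runs a full collection on request.
/// Contract from the post: no other thread may be inside any other
/// library function while this one is running.
extern(C) void mylib_gc_collect() nothrow
{
    GC.collect();
}
```

The caller would poll `mylib_gc_used_bytes()` at a point where it knows no other thread is inside the library (or while holding its own mutex) and call `mylib_gc_collect()` once the number is high enough.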

This would allow the user to start as many foreign threads as they like. Right? Is this sane? Is collection the only issue here? Do foreign threads still need to call thread_attachThis()? What happens if they don't?

I feel so hopeless that in the past, I even thought about and experimented with banning the user from starting threads on their own. Rather, they would call my library on a posix compatible thread API and create their threads through me, which happens to be a D thread, so no thread would be a "foreign thread" and everything would work just fine. I haven't deployed this crazy idea (yet).

Ali
May 09, 2021
On Sunday, 9 May 2021 at 03:25:06 UTC, Ali Çehreli wrote:
> The library starts the daemon with pipeProcess(); pipes are used for dispatching requests and shared memory is used for large data. This idea "worked like a charm." Phew!

Why don't you do this in a manner that works with multiple threads? Block calling threads with semaphores, wake them up when results are ready.

(Sidenote, if gc was task limited then this would not have been an issue...)


May 09, 2021
On Sunday, 9 May 2021 at 03:25:06 UTC, Ali Çehreli wrote:
> tl;dr I am scared of non-D programs calling my D library functions from foreign threads. So, I am planning on asking the user to trigger collection themselves by calling a collection function of my library. Crazy?
>

Since this is a "thin" library, feels like the sane solution is to make it 100% nogc and keep the GC only in the server.

But that might not be possible if the design of those other libraries is all-in on GC:
"this library itself loads other D libraries"

May 09, 2021
On 5/9/21 1:09 AM, Daniel N wrote:

> On Sunday, 9 May 2021 at 03:25:06 UTC, Ali Çehreli wrote:
>> tl;dr I am scared of non-D programs calling my D library functions
>> from foreign threads. So, I am planning on asking the user to trigger
>> collection themselves by calling a collection function of my library.
>> Crazy?
>>
>
> Since this is a "thin" library, feels like the sane solution is to make
> it 100% nogc and keep the GC only in the server.

May not be possible because the library creates MmFile objects as needed (the data length to be placed on shared memory is not known in advance). Being a class, MmFile objects are naturally created with 'new' but perhaps placement new could work. I haven't investigated that.

> But might not be possible if the design of those other libraries are
> all-in on GC
> "this library itself loads other D libraries"

I wasn't clear on that part: Now that the library is thin, loading other D libraries has already been pushed to the backend daemon.

So, my worry is based on my lack of confidence in managing foreign threads with thread_attachThis and thread_detachThis. Should I use a thread-local Boolean to keep track? Should I call those inside 'static this' and 'static ~this' blocks? What if the thread dies? I see that even the name of the functions gained a "_tpl":


https://dlang.org/phobos/core_thread_threadbase.html#.thread_attachThis_tpl

I wonder what "_tpl" means. It mentions rt_moduleTlsCtor() there.

Failing to find my way through all of that is the reason why I am hoping that I can push a GC collection cycle to the caller. Although embarrassing, it seems to be a more reliable design because the user is in a better position to know they are not executing any of the D library functions when they call the collection function.

Ali


May 09, 2021
On 5/9/21 12:59 AM, Ola Fosheim Grostad wrote:

> On Sunday, 9 May 2021 at 03:25:06 UTC, Ali Çehreli wrote:
>> The library starts the daemon with pipeProcess(); pipes are used for
>> dispatching requests and shared memory is used for large data. This
>> idea "worked like a charm." Phew!
>
> Why don't you do this in a manner that works with multiple threads?

That's what I want to do but those threads are created by the user, unknown to the D GC. Although there are ways of dealing with that case[1], I am questioning whether the library can disable the GC and let the user manage GC collections explicitly (cooperatively?).

Ali

[1] https://dlang.org/phobos/core_memory.html


May 09, 2021
On Sunday, 9 May 2021 at 10:27:04 UTC, Ali Çehreli wrote:
> That's what I want to do but those threads are created by the user, unknown to the D GC. Although there are ways of dealing with that case[1], I am questioning whether the library can disable the GC and let the user manage GC collections explicitly (cooperatively?).

Wouldn't this be annoying for the user of the library? (Or maybe you only have a handful of users?)

I have to admit I haven't read about IPC in a decade or so, but it seems to me that there are many options? Like, if you have N cores, create N worker threads in the daemon and create N pipes? Then set the count of the semaphore to N, so when the semaphore hits 0 all pipes are in use, and the calling thread will wait in the API stub (wrapper function) until a worker thread is available? Kinda clunky, but fits your model... I guess.

But maybe you want to get rid of the daemon? In that case I probably would just use mailboxes and wrap the API in futures if the work isn't fine-grained, but I don't know what kind of library you are making...


May 09, 2021
On Sunday, 9 May 2021 at 10:42:19 UTC, Ola Fosheim Grøstad wrote:
> semaphore hits 0 all pipes are in use, and the calling thread will wait in the API stub (wrapper function) until a worker thread is available? Kinda clunky, but fits your model... I guess.

So I imagine you could structure your API wrapper something like this:

semaphore.init(N)

wrapped_api_call(){
  semaphore.down(1)

  dostuff()
  api_call()

  semaphore.up(1)

  if (memory pressure){
    semaphore.down(N)
    gc_collect()
    semaphore.up(N)
  }
}

I dunno.
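The pseudocode above could look roughly like this in D. It is a sketch under a few assumptions: `maxCallers` (the "N" above), `pressureLimit` and `apiCall` are stand-ins, and the calling threads are known to the D runtime. D's core.sync.semaphore has no atomic "down(N)", so an extra mutex is used here to make sure only one thread at a time drains the slots; without it, two threads collecting at once could each hold some of the slots and wait forever for the rest.

```d
import core.memory : GC;
import core.sync.mutex : Mutex;
import core.sync.semaphore : Semaphore;

enum uint maxCallers = 4;                      // "N"; the value is an assumption
enum size_t pressureLimit = 64 * 1024 * 1024;  // hypothetical threshold in bytes

__gshared Semaphore slots;      // counts the free call slots
__gshared Mutex collectorLock;  // only one thread may drain the slots at a time

shared static this()
{
    slots = new Semaphore(maxCallers);
    collectorLock = new Mutex;
}

// Stand-in for the real library call.
int apiCall(int request) { return request * 2; }

void collectExclusively()
{
    collectorLock.lock();
    scope (exit) collectorLock.unlock();

    // Take every slot so that no caller is inside apiCall during the collection.
    foreach (i; 0 .. maxCallers) slots.wait();
    scope (exit) { foreach (i; 0 .. maxCallers) slots.notify(); }

    GC.collect();
}

int wrappedApiCall(int request)
{
    int result;
    {
        slots.wait();                 // semaphore.down(1)
        scope (exit) slots.notify();  // semaphore.up(1)
        result = apiCall(request);
    }
    // The caller's own slot is released before collecting, as in the pseudocode.
    if (GC.stats().usedSize > pressureLimit)
        collectExclusively();
    return result;
}
```

A reader-writer lock would express the same idea (many callers, one exclusive collector); the semaphore is kept here only to stay close to the pseudocode.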

May 09, 2021
On Sunday, 9 May 2021 at 03:25:06 UTC, Ali Çehreli wrote:
>
> That workaround seemed to be sufficient to load the library successfully. Unfortunately, that was not enough to weed out all issues related to libraries because this library itself loads other D libraries.

Can you statically link the library so that no other D library is needed? Might duplicate code but whatever.

However, disabling the GC in D is difficult: as soon as some library function uses an array, for example, you need the GC.

>
> This would allow the user to start as many foreign threads as they like. Right? Is this sane? Is collection the only issue here? Do foreign threads still need to call thread_attachThis()? What happens if they don't?
>
> I feel so hopeless that in the past, I even thought about and experimented with banning the user from starting threads on their own. Rather, they would call my library on a posix compatible thread API and create their threads through me, which happens to be a D thread, so no thread would be a "foreign thread" and everything would work just fine. I haven't deployed this crazy idea (yet).
>
> Ali

This is why I'm very opposed to any thread-specific GC optimization. Memory allocated with the GC or with malloc/free must always be global and must be usable from any thread. Also, TLS must be removed from phobos/druntime so that there is basically no need for tracking the threads from a memory point of view; at least it shouldn't crash if D doesn't know about the thread, unless thread primitives are used.

Highest priority should be removing TLS totally. Second, make sure there is no connection between memory management and threads.

May 09, 2021

On Sunday, 9 May 2021 at 03:25:06 UTC, Ali Çehreli wrote:

> That workaround seemed to be sufficient to load the library successfully. Unfortunately, that was not enough to weed out all issues related to libraries because this library itself loads other D libraries. All of this caused sporadic issues. (My brain is too fried to even remember what was a cause, what was a usable workaround, etc. Sometimes I wasted days chasing a solution while using a test, which had nothing to do with the solution. I would change the code, test, no go; repeat, no go. It turns out, my test was unrelated. Argh!)

In my experience, calling D from C/C++ works fine as long as 1) the D runtime is allowed to initialize, and 2) all threads which execute D code are registered with the D runtime.
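One possible shape for the second rule, as a sketch only: every exported function would call a pair of hypothetical helpers, `mylib_thread_enter` and `mylib_thread_leave`, which register the calling thread if the runtime does not know it yet. A thread-local flag records whether the helper did the attaching. The sketch does not cover a thread that dies without calling the leave function, and it does not run thread-local static constructors for the attached thread (the rt_moduleTlsCtor() question raised earlier in the thread).

```d
import core.thread : Thread, thread_attachThis, thread_detachThis;

// Module-level variables are thread-local by default in D, so each thread
// has its own flag recording whether this wrapper attached it.
bool attachedByWrapper;

/// Hypothetical helper: call on entry to every exported library function.
extern(C) void mylib_thread_enter()
{
    // Thread.getThis() returns null for a thread the runtime does not know.
    if (Thread.getThis() is null)
    {
        thread_attachThis();
        attachedByWrapper = true;
    }
}

/// Hypothetical helper: call before a foreign thread stops using the library.
extern(C) void mylib_thread_leave()
{
    if (attachedByWrapper)
    {
        thread_detachThis();
        attachedByWrapper = false;
    }
}
```

For a thread that the runtime already knows (a D thread, or one that was attached earlier), both helpers do nothing.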

If C/C++ code is allowed to hold the only reference to an object in the D GC heap, then the second rule needs to be extended to all threads which may hold a reference to said objects, but it may be practical to copy D objects at the C/D barrier either to caller-owned memory, or malloc-allocated memory that the caller can free by calling the standard C free function.
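Copying at the C/D barrier might look like the sketch below. `mylib_describe` is a hypothetical exported function; the result is built on the GC heap inside D and then copied into malloc-allocated memory, so the caller never holds a pointer into the GC heap and can release the copy with the standard C free function.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy, strcmp;
import std.conv : to;

/// Hypothetical exported function: returns a NUL-terminated string that the
/// caller owns and must release with free(). Returns null if malloc fails.
extern(C) char* mylib_describe(int value)
{
    string text = "value=" ~ value.to!string;  // lives on the GC heap

    auto copy = cast(char*) malloc(text.length + 1);
    if (copy is null)
        return null;

    memcpy(copy, text.ptr, text.length);
    copy[text.length] = '\0';                  // C callers expect a terminator
    return copy;                               // no GC pointer escapes to C
}
```

On the C or C++ side the caller would use the returned pointer and then pass it to free().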

> So, we came up with a drastic solution: Since all this code works just fine in a pure D environment, make the library as thin as possible; the library starts a daemon that is written in D with all the functionality. The library merely dispatches the requests to that daemon.
>
> The library starts the daemon with pipeProcess(); pipes are used for dispatching requests and shared memory is used for large data. This idea "worked like a charm." Phew!
>
> However, dispatching of the requests to the daemon is performed by a single library thread in a blocked manner: When a request is written to the pipe, the response is read back (blocked) and the result is returned to the user of the library function.

If the threads don't need to share state, you could just as well spawn one subprocess per thread, and let it do its own data processing.

Another approach would be to listen on a UNIX socket instead of using a pipe, which allows using accept to open new communication channels on-demand.

> I feel so hopeless that in the past, I even thought about and experimented with banning the user from starting threads on their own. Rather, they would call my library on a posix compatible thread API and create their threads through me, which happens to be a D thread, so no thread would be a "foreign thread" and everything would work just fine. I haven't deployed this crazy idea (yet).

Perhaps it would be simpler to just write the library part in C / C++ / -betterC D. std.mmfile has many lines, but the work it has to do is actually quite simple. The same is true about std.socket. This will completely avoid your headache with getting the D runtime / GC to play well with the host process's threading model.

May 09, 2021
On Sunday, 9 May 2021 at 13:42:48 UTC, IGotD- wrote:
> Also, TLS must be removed from phobos/druntime so that there is basically no need for tracking the threads from a memory point of view, at least it shouldn't crash if D doesn't know about the thread unless using thread primitives.

I don't think that would be enough. The garbage collector needs to know of all threads running D code, so that it can scan their stack and registers, so that it can know about objects the pointer to which exists only in that thread's stack or registers.
