January 23, 2021 Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
tl;dr I know enough to sense there is important stuff that I don't know. Even though I sometimes act[1] like someone who knows stuff, there are many fuzzy areas for me, especially in the runtime.

Things work great when D code is inside a D program. The runtime and module states are magically initialized and everything works. It is not clear when it comes to writing a D library and especially when that library may be used by other language runtimes, necessarily on foreign threads.

Here are the essential points that I do and don't understand.

- Initialize the runtime: This is automatically done for a D program as described on the wiki[2]. This must be done by calling rt_init[3] for a D shared library. I handle this by calling rt_init from a pragma(crt_constructor) function[4]. Luckily, this is easy and works for all cases that I have.

- Execute module constructors ("ctor" for short, i.e. 'shared static this' blocks). This is done automatically for a D program and when the D library is loaded by other language code like C++ and Python. However, I've encountered a case[5] where module ctors were not being called. This could be due to runtime bugs or something that I don't understand with loading shared libraries. (My workaround is very involved: I grep the output of 'nm' to determine the symbol for the module ctor, call it after dlsym'ing, and because 'nm | grep' is a slow process, I cache this information in a file along with the ~2K libraries that I may load conditionally.)

- Loading D libraries from D code: I call loadLibrary[6] to load a D library so that "[its] D runtime [...] will be integrated with the current runtime". Sounds promising; assuming that rt_init is already called for the calling library, I assume loadLibrary will handle everything, and all code will use a single runtime and things will work fine. This works flawlessly for my D and C++ programs that load my D library that loads the other D libraries.

- Attaching foreign threads: The D runtime needs to know about all threads that are running D code so that it will know which threads constitute "the world" for it to "stop the world" when performing garbage collection. The function to do this is thread_attachThis[7]. One question I have is, does rt_init already do thread_attachThis? I ask because I have a library that is loaded by Python and things work even *without* calling thread_attachThis.

- Execute thread local storage (TLS) ctors: Again, this happens automatically for most cases. However, thread_attachThis says "[if] full functionality as a D thread is desired, [rt_moduleTlsCtor] must be called after thread_attachThis". Ok. When would I not want "full functionality" anyway? Another question: Are TLS ctors executed when I do loadLibrary? And when they are executed, which modules are involved? The module that is calling rt_moduleTlsCtor or all modules? What are "all modules"?

- Detaching foreign threads: Probably even more important than thread_attachThis is thread_detachThis[8]. As its documentation says, one should call rt_moduleTlsDtor as well for "full functionality". This is very important because when a GC collection kicks in, it will stop all threads that make up its world. If one of those threads has already been terminated, we will crash. (Related, I have an abandoned PR[9] that tried to fix issues with thread_detachThis, which stalled due to failing unit tests for the 32-bit Apple operating system, which D has since stopped supporting.) (And I stopped working on that issue mostly because the company I used to work for stopped using D and rewrote their library in C++.)

I have questions regarding thread_attachThis and thread_detachThis: When should they be called? Should the library expose a function that the users must call from *each thread* that they will be using? This may not be easy because a user may not know what thread they are running on. For example, the user of our library may be on a framework where threads may come and go, where the user may not have an opportunity to call thread_detachThis when a thread goes away. For example, the user may provide callback functions (which call us) to a framework that is running on a thread pool.

For that reason, my belief has been to call thread_attachThis upon entering an API function and to call thread_detachThis upon leaving it because I may not know whether this thread will survive or die soon. (thread_detachThis is so important because the next GC cycle will try to stop this thread and may crash.)

More questions: Can I thread_detachThis the thread that called rt_init? Can I call rt_moduleTlsCtor more than once? I guess it depends on each module. It will be troubling if a TLS ctor reinitializes a module state. :/

While trying to sort all of these out, I am facing a bug[10], which will force me to move away from std.parallelism and perhaps use std.concurrency. Even though that bug is reported for OS X, I think both that case and my "called from Python" case are related to an undefined behavior in the runtime's thread management, which is exposed by std.parallelism. (?)

As you can see, even though I can list many references to act like I know stuff, I really don't and have many questions. :) The trouble is, when there are so many dimensions to test to be sure, it is extremely difficult to learn when a seg-fault bug is intermixed with all this, which hits sporadically. :(

I want to learn.

Thank you,
Ali

[1] https://www.youtube.com/watch?v=FNL-CPX4EuM
[2] https://wiki.dlang.org/Runtime_internals
[3] https://dlang.org/library/core/runtime/rt_init.html
[4] https://dlang.org/spec/pragma.html#crtctor
[5] https://forum.dlang.org/thread/rucm30$1lgk$1@digitalmars.com
[6] https://dlang.org/library/core/runtime/runtime.load_library.html
[7] https://dlang.org/library/core/thread/osthread/thread_attach_this.html
[8] https://dlang.org/library/core/thread/threadbase/thread_detach_this.html
[9] https://github.com/dlang/druntime/pull/1989
[10] https://issues.dlang.org/show_bug.cgi?id=11736
|
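A minimal sketch of the first point in the post above: a D shared library initializing the runtime from a C-runtime constructor. The function names `mylib_crt_init` and `mylib_crt_term` are hypothetical. `rt_init` and `rt_term` are the `core.runtime` functions referenced in the post; pairing `rt_init` with `rt_term` in a `pragma(crt_destructor)` function is an assumption of this sketch, not something the post describes.

```d
import core.runtime : rt_init, rt_term;

// Runs when the shared library is loaded, before foreign code can call into it.
// Hypothetical name; any extern(C) function without parameters works here.
pragma(crt_constructor)
extern(C) void mylib_crt_init()
{
    rt_init(); // reference counted: repeated calls are balanced by rt_term
}

// Assumption of this sketch: undo the initialization when the library goes away.
pragma(crt_destructor)
extern(C) void mylib_crt_term()
{
    rt_term();
}
```

Because `rt_init` is reference counted, the same library can also be linked into a D program, where the runtime initializes itself again before `main`.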
January 24, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ali Çehreli | On Sunday, 24 January 2021 at 00:24:55 UTC, Ali Çehreli wrote:
>
> One question I have is, does rt_init already do thread_attachThis? I ask because I have a library that is loaded by Python and things work even *without* calling thread_attachThis.
>

During rt_init in the main thread, thread_attachThis is performed, from what I have seen.

>
> Another question: Are TLS ctors executed when I do loadLibrary?
>
> And when they are executed, which modules are involved? The module that is calling rt_moduleTlsCtor or all modules? What are "all modules"?

The TLS standard (at least the ELF standard) does not have ctors. Only simple initialization is allowed, meaning the initial data is stored as .tdata, which is copied to the specific memory area for each thread. There is also a .tbss, which is zeroed memory just like the .bss section. Actual ctor code that runs for each TLS thread is language specific and not part of the ELF standard; therefore no such TLS ctor code is being run in the lower level API. The initialization (only copying and zeroing) of TLS data is being done when each thread starts. This can even be done in a lazy manner when the first TLS variable is being accessed.

>
> I have questions regarding thread_attachThis and thread_detachThis: When should they be called? Should the library expose a function that the users must call from *each thread* that they will be using? This may not be easy because a user may not know what thread they are running on. For example, the user of our library may be on a framework where threads may come and go, where the user may not have an opportunity to call thread_detachThis when a thread goes away. For example, the user may provide callback functions (which call us) to a framework that is running on a thread pool.
>

I call thread_attachThis as soon as the thread is supposed to call a D function, for example a callback from a thread in a thread pool. This usually happens when there is a function or delegate involved, as any jump to D code would use them. I have to make a generic API and then a D API on top of that. In practice this means there is a trampoline function involved where thread_attachThis and thread_detachThis are being called. Also this is where I call TLS ctors/dtors. It is an effect of delegates being language specific, and it falls out naturally that way. Avoid extern(C) calls directly into D code.

In practice you can do this for any thread even if there are several delegates during the thread lifetime. You can simply have a TLS bool variable telling if the thread_attachThis and rt_moduleTlsCtor have already been run.

>
> More questions: Can I thread_detachThis the thread that called rt_init? Can I call rt_moduleTlsCtor more than once? I guess it depends on each module. It will be troubling if a TLS ctor reinitializes a module state. :/
>

I have brought up this question before because, as it is right now, I haven't seen any "rt_uninit" or "rt_close" function. This is a bit limiting for me as the main thread can exit while the process lives on. In general the main thread that goes into main must also be the last one returning the entire line of functions that was called during entry of the process. What will happen is that you possibly do a thread_detachThis twice.

Short answer is just park the main thread while the bulk is being done by other threads. Unfortunately that's how many libraries work today.
|
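The trampoline with a TLS bool flag described in the reply above could look roughly like the following sketch. All `mylib_*` names, `doWork`, `ensureAttached`, and `attachedHere` are hypothetical. `rt_moduleTlsCtor` and `rt_moduleTlsDtor` are druntime symbols that, as far as I know, have to be declared by hand because no public module exports them. Whether attaching once and detaching only in an explicit exit call is safe depends on the host giving that opportunity, which is exactly the open question in this thread.

```d
import core.thread : Thread, thread_attachThis, thread_detachThis;

// Implemented by druntime; declared here by hand (see the note above).
extern(C) void rt_moduleTlsCtor();
extern(C) void rt_moduleTlsDtor();

// Module-level variables are thread-local by default in D, so every thread
// has its own copy of this flag, zero-initialized before any D code runs.
private bool attachedHere;

// Hypothetical D implementation behind the C API.
int doWork(int x)
{
    return x + 1;
}

void ensureAttached()
{
    // Thread.getThis() returns null on a thread the runtime does not know.
    if (attachedHere || Thread.getThis() !is null)
        return;
    thread_attachThis();
    rt_moduleTlsCtor();
    attachedHere = true;
}

// Trampoline: the entry point that foreign code calls.
extern(C) int mylib_do_work(int x) nothrow
{
    try
    {
        ensureAttached();
        return doWork(x);
    }
    catch (Exception)
    {
        return -1;
    }
}

// To be called on the same thread before it goes away, if the host allows it.
extern(C) void mylib_thread_exit() nothrow
{
    if (!attachedHere)
        return;
    try
    {
        rt_moduleTlsDtor();
        thread_detachThis();
    }
    catch (Exception)
    {
    }
    attachedHere = false;
}
```

On a thread that the runtime already knows (for example the one that called rt_init), `ensureAttached` does nothing, so `mylib_thread_exit` never detaches a thread it did not attach itself.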
January 23, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to IGotD- | Thank you very much for your answers. I think I've been on the right track and the following bug that I've mentioned has been messing things up by hitting me randomly:

https://issues.dlang.org/show_bug.cgi?id=11736

On 1/23/21 5:18 PM, IGotD- wrote:

> During rt_init in the main thread, thread_attachThis is performed, from what I have seen.

That explains why everything just works in most cases.

> Actual ctor code that runs for each TLS thread is language specific and not part of the ELF standard; therefore no such TLS ctor code is being run in the lower level API. The initialization (only copying and zeroing) of TLS data is being done when each thread starts.

That must be the case for threads started by D runtime, right? It sounds like I must call rt_moduleTlsCtor explicitly for foreign threads. It's still not clear to me which modules' TLS variables are initialized (copied over). Only this module's or all modules that are in the program? I don't know whether it's possible to initialize one module; rt_moduleTlsCtor does not take any parameter.

> This can even be done in a lazy manner when the first TLS variable is being accessed.

I hope that's the case.

> I have to make a generic API and then a D API on top of that.

Did you mean a generic API, which makes calls to D? That's how I have it: an extern(C) API function calling proper D code.

> In practice this means there is a trampoline function involved where thread_attachThis and thread_detachThis are being called. Also this is where I call TLS ctors/dtors.

That's what I will be doing.

> It is an effect of delegates being language specific, and it falls out naturally that way. Avoid extern(C) calls directly into D code.

I hope I am misunderstanding you there. All I have are extern(C) functions on the library API.

> In practice you can do this for any thread even if there are several delegates during the thread lifetime. You can simply have a TLS bool variable telling if the thread_attachThis and rt_moduleTlsCtor have already been run.

I've already experimented with it but it didn't work, likely because of the bug mentioned above.

> In general the main thread that goes into main must also be the last one returning the entire line of functions that was called during entry of the process.

Main entry belongs to another language, so I have to document that this library can only work in such "well behaved" cases.

> What will happen is that you possibly do a thread_detachThis twice.

Sounds like I can track that with a bool variable as well, no?

> Short answer is just park the main thread while the bulk is being done by other threads. Unfortunately that's how many libraries work today.

Agreed. That's for me to specify in the library documentation.

I should revive my old PR and see whether it is needed at all:

https://github.com/dlang/druntime/pull/1989

I am surprised how much I had learned at that time and how much I've already forgotten. :/ For example, my PR involves thread_setThis, which seems to be history now:

https://docarchives.dlang.io/v2.068.0/phobos/core_thread.html#.thread_setThis

And thread_detachThis seems to be missing now:

https://dlang.org/phobos/core_thread.html

https://dlang.org/phobos/core_thread_osthread.html

Documentation issue or is it not needed anymore?

Ali
|
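The difference discussed above between plain TLS initialization (copying and zeroing) and D's TLS ctors can be observed from inside a D program. The sketch below uses hypothetical names (`tlsCtorRuns`, `fromInitializer`, `fromTlsCtor`, `observeInNewThread`) and only covers threads started by the D runtime; on a foreign thread the `static this()` part would presumably not run until rt_moduleTlsCtor is called.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

shared int tlsCtorRuns;    // process-wide count of TLS ctor executions
int fromInitializer = 7;   // thread-local; set by a static initializer only
int fromTlsCtor;           // thread-local; set by the TLS ctor below

// A TLS ctor: druntime runs it once for every thread it starts or
// initializes, unlike 'shared static this', which runs once per process.
static this()
{
    fromTlsCtor = 1;
    atomicOp!"+="(tlsCtorRuns, 1);
}

/// Starts a D thread and reports the values it saw in its own copies
/// of the two thread-local variables.
int[2] observeInNewThread()
{
    int[2] seen;
    auto t = new Thread({ seen = [fromInitializer, fromTlsCtor]; });
    t.start();
    t.join();
    return seen;
}
```

Each call to `observeInNewThread` raises `tlsCtorRuns` by one, which also illustrates the concern about a TLS ctor running again: it runs per thread, for all modules, with no per-module selection.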
January 24, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ali Çehreli | On Sunday, 24 January 2021 at 03:59:26 UTC, Ali Çehreli wrote:
>
> That must be the case for threads started by D runtime, right? It sounds like I must call rt_moduleTlsCtor explicitly for foreign threads. It's still not clear to me which modules' TLS variables are initialized (copied over). Only this module's or all modules that are in the program? I don't know whether it's possible to initialize one module; rt_moduleTlsCtor does not take any parameter.
>

Any thread started by druntime has proper initialization, of course. Any thread started by any module written in another language will not do the D thread initialization.

All TLS variables in all loaded modules are being initialized (only copying and zeroing) by the OS system code for each thread that the OS knows about. After that it is up to each library for each language to do further initialization. The next time __tls_get_addr is called after loading a library, the TLS variables of any new module will be found and initialized.

It is a mystery to me why the TLS standard never included a ctor/dtor vector for TLS variables. It is possible in practice, but they didn't do it. The whole TLS design is like Swiss cheese.

>
> Did you mean a generic API, which makes calls to D? That's how I have it: an extern(C) API function calling proper D code.
>

I have a lot of system code written in C++, which also includes callbacks from that code. In order to support D, a layer is necessary to catch all callbacks in a trampoline and invoke D delegates. Calling D code directly with extern(C) should be avoided because

1. D delegates are so much more versatile.
2. You must use a trampoline in order to do D specific thread initialization anyway.

Since std::function cannot be used in a generic interface I actually use something like this, http://blog.coldflake.com/posts/C++-delegates-on-steroids/.

It is more versatile than plain extern(C) but simple enough that it can be used by any language. In the case of D, the "this pointer" can be used as a pointer to a D delegate.

Creating language agnostic interfaces requires more attention than usual, as I have experienced. Strings, for example, complicate things further as they are different for every language.
|
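One way to read the "this pointer as a pointer to a D delegate" idea above is a C-compatible record holding an opaque context pointer and a plain function pointer, with a trampoline that turns the context back into a delegate. This is only a sketch under that reading; `Handler`, `InvokeFn`, `CCallback`, `invokeHandler`, and `makeCallback` are hypothetical names, and the thread attach step from earlier in the thread is left out to keep it short.

```d
// The D-side callable that the foreign code should end up invoking.
alias Handler = int delegate(int);

// Plain C function pointer type used in the language-agnostic interface.
extern(C) alias InvokeFn = int function(void* context, int arg) nothrow;

// What the foreign side stores: an opaque pointer and a function to call.
struct CCallback
{
    void* context;   // points at a Handler owned by the D side
    InvokeFn invoke; // trampoline that knows how to use the context
}

// Trampoline: recovers the delegate from the context and calls it.
extern(C) int invokeHandler(void* context, int arg) nothrow
{
    auto dg = cast(Handler*) context;
    try
    {
        return (*dg)(arg);
    }
    catch (Exception)
    {
        return -1; // exceptions must not cross the C boundary
    }
}

// Builds the record for a delegate that outlives the callback registration.
CCallback makeCallback(Handler* dg)
{
    return CCallback(dg, &invokeHandler);
}
```

The delegate and anything it captures must stay alive for as long as the foreign code holds the record; if only foreign code references them, the GC needs to be told, for example with `GC.addRoot` from `core.memory`.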
January 28, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ali Çehreli | On Sunday, 24 January 2021 at 03:59:26 UTC, Ali Çehreli wrote:
> I am surprised how much I had learned at that time and how much I've already forgotten. :/ For example, my PR involves thread_setThis, which seems to be history now:
>
> https://docarchives.dlang.io/v2.068.0/phobos/core_thread.html#.thread_setThis
>
> And thread_detachThis seems to be missing now:
>
> https://dlang.org/phobos/core_thread.html
>
> https://dlang.org/phobos/core_thread_osthread.html
>
> Documentation issue or is it not needed anymore?

The documentation build on dlang.org is broken. Check the source code or Adam D. Ruppe's dpldocs.info for the complete documentation:
http://dpldocs.info/experimental-docs/core.thread.osthread.html

You'll find thread_setThis and thread_detachThis are still there.
|
January 28, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to tsbockman | On 28/01/2021 1:16 PM, tsbockman wrote:
> The documentation build on dlang.org is broken. Check the source code or Adam D. Ruppe's dpldocs.info for the complete documentation:
> http://dpldocs.info/experimental-docs/core.thread.osthread.html

Fixed: https://issues.dlang.org/show_bug.cgi?id=21309
|
January 28, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to rikki cattermole | On Thursday, 28 January 2021 at 00:58:17 UTC, rikki cattermole wrote:
> On 28/01/2021 1:16 PM, tsbockman wrote:
>> The documentation build on dlang.org is broken. Check the source code or Adam D. Ruppe's dpldocs.info for the complete documentation:
>> http://dpldocs.info/experimental-docs/core.thread.osthread.html
>
> Fixed: https://issues.dlang.org/show_bug.cgi?id=21309
I still don't see thread_setThis and thread_detachThis anywhere on the dlang.org copy.
|
January 29, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to IGotD- | On 1/24/21 2:28 AM, IGotD- wrote:

> Any thread started by druntime has proper initialization, of course. Any thread started by any module written in another language will not do the D thread initialization.

And that of course has been what I've been trying to deal with. Bugs in the uses of thread_attachThis and thread_detachThis, and most importantly, not having a guaranteed opportunity to call thread_detachThis (think a foreign thread dies on its own without calling us and the runtime crashes attempting to stop a nonexistent thread during a GC cycle) finally made me realize that D shared library functions cannot be called on foreign threads. At least not today... Or, they can only be used under unusual conventions like the rule below.

So, that's the golden rule: If you want to call functions of a D shared library (I guess static library as well) you must create your thread by our library's create_thread() function and join that thread by our library's join_thread() function. Works like a charm!

Luckily, it is trivial to determine whether we are being called on a foreign thread or a D thread through a module scoped bool variable...

> Since std::function cannot be used in a generic interface I actually use something like this, http://blog.coldflake.com/posts/C++-delegates-on-steroids/.

If I understand that article correctly, and by pure coincidence, the very shared libraries that are the subject of this discussion, which I load at run time, happen to register themselves by providing function pointers. Like in the article, those function pointers are of template instances, each of which knows exactly what to do for its particular type, but the registry keeps opaque functions. Pseudo code:

```
// Shared library:

struct A {
  // ...
}

struct B {
  // ...
}

shared static this() {
  register("some key",
           &serializer!A,    // <-- Takes e.g. void* but knows about A
           &deserializer!B); // ditto for B
}
```

And one of my issues has been module constructors not being called when the library is loaded as a dependency of a C++ library, which is loaded by a Python module, which is imported by another Python module. :) As I said earlier, I solved that issue by parsing and persisting the output of 'nm mylib.so' to identify the module ctor and to call it after dlsym'ing. Pretty hacky but works...

Getting back to my main issue: I am about to write a mixin template where any library's interface .d file will do the following and get the create_thread and join_thread functions automatically:

```
// mylib's extern(C) functions:

// This one provides mylib_create_thread() and mylib_join_thread():
mixin LibAPI!"mylib";

// Other extern(C) functions of the library:

extern(C)
nothrow
int foo(int) {
  // ...
}
```

The .h file must still be maintained by hand.

Ali
|
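The "golden rule" above (threads that call into the library are created and joined by the library) might be sketched as follows. This is not the LibAPI mixin template from the post, which is not shown in the thread: the sketch swaps in a string mixin built by a hypothetical `libAPI` function, to keep the generated C symbol names explicit. `ThreadFunc`, `createThreadImpl`, `joinThreadImpl`, and `storeAnswer` are hypothetical names as well.

```d
import core.memory : GC;
import core.thread : Thread;

// C-compatible signature for the code a host wants to run on a D thread.
extern(C) alias ThreadFunc = void function(void* arg) nothrow;

void* createThreadImpl(ThreadFunc fn, void* arg) nothrow
{
    try
    {
        // A druntime thread: registered with the GC, TLS ctors/dtors are run.
        auto t = new Thread({ fn(arg); });
        // Only foreign code will hold the handle, so pin the object.
        GC.addRoot(cast(void*) t);
        t.start();
        return cast(void*) t;
    }
    catch (Exception)
    {
        return null;
    }
}

int joinThreadImpl(void* handle) nothrow
{
    if (handle is null)
        return -1;
    auto t = cast(Thread) handle;
    try
    {
        t.join();
        return 0;
    }
    catch (Exception)
    {
        return -1;
    }
    finally
    {
        GC.removeRoot(handle); // the handle must not be used after this
    }
}

// Generates the C API for one library; evaluated at compile time.
string libAPI(string name)
{
    return "extern(C) void* " ~ name ~ "_create_thread(ThreadFunc fn, void* arg) nothrow"
        ~ " { return createThreadImpl(fn, arg); }\n"
        ~ "extern(C) int " ~ name ~ "_join_thread(void* handle) nothrow"
        ~ " { return joinThreadImpl(handle); }\n";
}

// Provides mylib_create_thread() and mylib_join_thread().
mixin(libAPI("mylib"));

// Hypothetical worker, standing in for what a host would pass from C or C++.
extern(C) void storeAnswer(void* p) nothrow
{
    *cast(int*) p = 42;
}
```

A host would call `mylib_create_thread(&storeAnswer, &result)`, keep the returned handle, and later pass it to `mylib_join_thread`; reading `result` is only safe after the join returns.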
January 30, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ali Çehreli | On Saturday, 30 January 2021 at 05:44:37 UTC, Ali Çehreli wrote:
> On 1/24/21 2:28 AM, IGotD- wrote:
>
>> [...]
>
> [...]
Hmm, interesting, or what you should call it 😅
With this knowledge we have now, what changes could and/or should be made to make this process easier? 🤔
(Btw, I just "forced" my boss to buy your and Adam's book for me. I'm trying to sneak in D @thecompany)
|
January 30, 2021 Re: Initializing D runtime and executing module and TLS ctors for D libraries | ||||
---|---|---|---|---|
| ||||
Posted in reply to Imperatorn | On 1/30/21 1:34 AM, Imperatorn wrote:

> With this knowledge we have now, what changes could and/or should be made to make this process easier? 🤔

I wonder whether doing something in the runtime is possible. For example, it may be more resilient and not crash when suspending a thread fails because the thread may be dead already.

However, studying the runtime code around thread_detachThis three years ago, I had realized that like many things in computing, the whole stop-the-world is wishful thinking because there is no guarantee that your "please suspend this thread" request to the OS has succeeded. You get a success return code back but it means your request succeeded, not that the thread was or will be suspended. (I may be misremembering this point but I know that the runtime requests things that the OS does not give a full guarantee for.)

(Going off-topic, even clicking on a user interface is wishful thinking because a few times a year I attempt to click on something but another window element pops under my mouse pointer and I unintentionally click something else, commonly on web pages as they are being rendered by a browser: links move around on the page. This used to bother me but not anymore. Life is not perfect and I appreciate it. :) )

> (Btw, I just "forced" my boss to buy your and Adam's book for me

Cool! :) It makes me a little sad that my online version is ahead of the paper version by a couple of years now. I want to update the paper as well but I want to work on work stuff like the topic of this discussion. :)

(Related note: the ebook versions on the web page are more up-to-date than the ones that you can buy, especially because the versions on my web site include a table of contents section. Consider updating your ebook here:

http://ddili.org/ders/d.en/index.html

)

> I'm trying to sneak in D @thecompany)

I still think D is a great tool but some use cases can be tough and sometimes embarrassing. :/

Ali
|
Copyright © 1999-2021 by the D Language Foundation