D & threads
Thread overview (Dec 07, 2004): gbatyan, Stewart Gordon, Sean Kelly, pragma
December 07, 2004
Greetings to D community!

As far as I understand, the threading interface is one of the most important aspects of a garbage-collected system. I have been lurking around D for a while now, and I am confused that so far I haven't seen much information about working with threads in D. Somewhere on the site there is a sentence like "garbage collector will only search for roots on stacks of threads _created with D threading interface_". Neither I have seen any info here on the site about the _D threading interface_ nor a word about threads in general.

Here are the questions / thoughts, I'd appreciate to get corrected where I'm wrong in thinking:

- every garbage collector must have some sort of accounting for threads it is "aware of", for "stop the world" must be implemented somehow.

- Does D use some standard threading library? or maybe there is a preferred library?

- is there some generic possibility to explicitly inform the D runtime
about thread creation / destruction from one side, and make D runtime
be able to "ask" the threading library to stop this or that thread on
the other side. (this way one would be free to use virtually any
threading library without necessarily being a specialist in D internals).

- One should be able to choose upon thread creation if the thread should be a subject to GC or a "non-collected" thread if there is a need for a thread guaranteed to never freeze because of GC.

- I'd appreciate any info on how "stop the world" is / must / should / may be implemented in D. I think there are 2 ways to make such a thing (this is purely my suggestion):
  - non-intrusively, where the GC tries to stop the threads and get the contents of registers using some OS tricks (like the Boehm GC library for C/C++). I personally perceive this way, in theory, to be ugly, non-portable, maybe inefficient and probably insecure.
  - intrusively, where some code is emitted throughout the whole program and provides means for the GC to lock the threads (may surely be inefficient as well).

- Threads and DLLs. Does / must / should / may one GC be shared by the program and all DLLs it uses, or is that rather nonsense and every DLL MUST take care of its own garbage? Are there some hidden caveats here?
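
The "intrusive" variant from the list above can be sketched in C with pthreads. This is only an illustration under stated assumptions, not how the D runtime works: every name here (`safepoint`, `stop_the_world`, `resume_the_world`, `mutator_enter`, `mutator_exit`, `safepoint_demo`) is hypothetical, and the explicit `safepoint()` call stands in for code a compiler would emit automatically.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t world_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t world_cond = PTHREAD_COND_INITIALIZER;
static atomic_bool stop_flag;   /* set while the collector wants the world stopped */
static atomic_bool done_flag;   /* tells the demo mutators to exit */
static atomic_long work_done;   /* stands in for mutator activity */
static int running;             /* mutators not parked; guarded by world_lock */

/* A thread announces itself before touching collected memory. */
static void mutator_enter(void) {
    pthread_mutex_lock(&world_lock);
    while (atomic_load(&stop_flag))
        pthread_cond_wait(&world_cond, &world_lock);
    running++;
    pthread_mutex_unlock(&world_lock);
}

static void mutator_exit(void) {
    pthread_mutex_lock(&world_lock);
    running--;
    pthread_cond_broadcast(&world_cond);
    pthread_mutex_unlock(&world_lock);
}

/* The check an "intrusive" compiler would emit at loop back-edges and calls. */
static void safepoint(void) {
    if (!atomic_load(&stop_flag))
        return;                                  /* fast path: one flag read */
    pthread_mutex_lock(&world_lock);
    if (atomic_load(&stop_flag)) {
        running--;
        pthread_cond_broadcast(&world_cond);     /* tell the collector we parked */
        while (atomic_load(&stop_flag))
            pthread_cond_wait(&world_cond, &world_lock);
        running++;
    }
    pthread_mutex_unlock(&world_lock);
}

/* Returns with world_lock held and every entered mutator parked. */
static void stop_the_world(void) {
    pthread_mutex_lock(&world_lock);
    atomic_store(&stop_flag, true);
    while (running > 0)
        pthread_cond_wait(&world_cond, &world_lock);
}

static void resume_the_world(void) {
    atomic_store(&stop_flag, false);
    pthread_cond_broadcast(&world_cond);
    pthread_mutex_unlock(&world_lock);
}

static void *mutator(void *arg) {
    (void)arg;
    mutator_enter();
    while (!atomic_load(&done_flag)) {
        atomic_fetch_add(&work_done, 1);
        safepoint();
    }
    mutator_exit();
    return NULL;
}

/* Returns 1 if no mutator made progress while the world was stopped. */
static int safepoint_demo(void) {
    pthread_t t[2];
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, mutator, NULL);
        assert(rc == 0);
    }
    while (atomic_load(&work_done) == 0)
        sched_yield();                           /* wait until mutators run */

    stop_the_world();
    long before = atomic_load(&work_done);
    for (int i = 0; i < 1000; i++)
        sched_yield();                           /* a "collection" would go here */
    long after = atomic_load(&work_done);
    resume_the_world();

    atomic_store(&done_flag, true);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return before > 0 && before == after;
}
```

The cost mentioned in the list shows up as the flag read on every `safepoint()` call. The non-intrusive variant avoids that read but needs OS-specific ways to suspend threads and read their registers.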








December 07, 2004
gbatyan@yahoo.com wrote:
<snip>
> Somewhere on the site there is a sentence like "garbage collector will only search for roots on stacks of threads _created with D threading interface_". Neither I have seen any info here on the site about the _D threading interface_ nor a word about threads in general.

AIUI "D threading interface" simply means the std.thread.Thread class.

<snip>
> - Does D use some standard threading library? or maybe there is a preferred library?

It is part of Phobos.

> - is there some generic possibility to explicitly inform the D runtime about thread creation / destruction from one side,

AIUI this is all automatically handled by std.thread.

> and make D runtime be able to "ask" the threading library to stop this or that thread on the other side. (this way one would be free to use virtually any threading library without necessarily being a specialist in D internals).

The only threading library one could use with D is std.thread itself or one that is a layer on top of this.

> - One should be able to choose upon thread creation if the thread should be a subject to GC or a "non-collected" thread if there is a need for a thread guaranteed to never freeze because of GC.
<snip>

Presumably all threads share one memory pool, so there's no such thing as a piece of memory belonging to one thread or another (apart from the thread's stack, of course).  Maybe someone can confirm this....

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on
the 'group where everyone may benefit.
December 07, 2004
Oops, missed std.thread package :-)

But still, it might be useful to be able to use the Phobos thread package to create a thread that is not checked by the GC.



December 07, 2004
In article <cp3oma$1kj$1@digitaldaemon.com>, gbatyan@yahoo.com says...
>
>Greetings to D community!

Greets!

>- Threads and DLLs. Does / must / should / may one GC be shared by the program and all DLLs it uses, or is that rather nonsense and every DLL MUST take care of its own garbage? Are there some hidden caveats here?

Alas, D does not provide anything special to deal with this issue.

Each DLL gets its own GC just as every application does.  As you say, each DLL "must take care of its own garbage".  At present, there is no (implicit) way to share a GC between DLLs and applications.

- pragma [ ericanderton at yahoo ]
December 07, 2004
In article <cp417h$g6i$1@digitaldaemon.com>, Stewart Gordon says...
>
>gbatyan@yahoo.com wrote:
><snip>
>> Somewhere on the site there is a sentence like "garbage collector will only search for roots on stacks of threads _created with D threading interface_". Neither I have seen any info here on the site about the _D threading interface_ nor a word about threads in general.
>
>AIUI "D threading interface" simply means the std.thread.Thread class.

Yes.

>> - is there some generic possibility to explicitly inform the D
>> runtime about thread creation / destruction from one side,
>
>AIUI this is all automatically handled by std.thread.

Yup.  Right now the thread module maintains a list of running threads.

>> - One should be able to choose upon thread creation if the thread should be a subject to GC or a "non-collected" thread if there is a need for a thread guaranteed to never freeze because of GC.
><snip>
>
>Presumably all threads share one memory pool, so there's no such thing as a piece of memory belonging to one thread or another (apart from the thread's stack, of course).  Maybe someone can confirm this....

Assuming the memory was all allocated via 'new' then this is true, though the garbage collector can really do whatever it wants so long as it can provide and collect memory as needed.  If you will never call 'new' or reference garbage collected memory from a thread, then you are free to create your own threading model.  Using the Thread class is necessary because the gc needs to be able to pause threads and reference call stacks during its collection cycle.
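
The "list of running threads" idea can be sketched in C with pthreads. This is a minimal illustration, not the actual std.thread code: `managed_create`, `managed_join`, `registry_count` and `registry_demo` are hypothetical names, and the fixed-size table is an assumption made to keep the example short. The point is that a collector consulting only this table never learns about a thread created with a bare `pthread_create`.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

/* Hypothetical registry: the only threads a collector would pause and scan. */
typedef struct { pthread_t tid; int in_use; } slot_t;
static slot_t registry[MAX_THREADS];
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

/* Create a thread and record it; returns a slot index or -1. The lock is held
   across both steps so a collector that takes registry_lock never sees a
   started-but-unlisted managed thread. */
static int managed_create(void *(*fn)(void *), void *arg) {
    int slot = -1;
    pthread_mutex_lock(&registry_lock);
    for (int i = 0; i < MAX_THREADS; i++) {
        if (!registry[i].in_use) {
            if (pthread_create(&registry[i].tid, NULL, fn, arg) == 0) {
                registry[i].in_use = 1;
                slot = i;
            }
            break;
        }
    }
    pthread_mutex_unlock(&registry_lock);
    return slot;
}

/* Join a managed thread, then drop it from the registry. */
static void managed_join(int slot) {
    pthread_mutex_lock(&registry_lock);
    pthread_t tid = registry[slot].tid;
    pthread_mutex_unlock(&registry_lock);
    pthread_join(tid, NULL);
    pthread_mutex_lock(&registry_lock);
    registry[slot].in_use = 0;
    pthread_mutex_unlock(&registry_lock);
}

/* What a collector would iterate over before pausing threads. */
static int registry_count(void) {
    int n = 0;
    pthread_mutex_lock(&registry_lock);
    for (int i = 0; i < MAX_THREADS; i++)
        n += registry[i].in_use;
    pthread_mutex_unlock(&registry_lock);
    return n;
}

static void *worker(void *arg) { return arg; }

/* Two managed threads and one raw pthread: only the managed ones are visible. */
static int registry_demo(void) {
    int a = managed_create(worker, NULL);
    int b = managed_create(worker, NULL);
    pthread_t raw;
    int rc = pthread_create(&raw, NULL, worker, NULL);
    assert(a >= 0 && b >= 0 && rc == 0);
    int visible = registry_count();
    managed_join(a);
    managed_join(b);
    pthread_join(raw, NULL);
    return visible;
}
```

In this sketch `registry_demo()` reports two visible threads even though three were started, which is why a thread created outside the registry must stay away from collected memory.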

FWIW, I'm working on a slightly modified thread implementation as part of Ares. If you have any suggestions, please post them in the appropriate DSource forum.


Sean


December 07, 2004
In article <cp4qg2$1qd2$1@digitaldaemon.com>, Sean Kelly says...
>
>In article <cp417h$g6i$1@digitaldaemon.com>, Stewart Gordon says...
>>
>>gbatyan@yahoo.com wrote:
>><snip>
>>> Somewhere on the site there is a sentence like "garbage collector will only search for roots on stacks of threads _created with D threading interface_". Neither I have seen any info here on the site about the _D threading interface_ nor a word about threads in general.
>>
>>AIUI "D threading interface" simply means the std.thread.Thread class.
>
>Yes.
>
>>> - is there some generic possibility to explicitly inform the D
>>> runtime about thread creation / destruction from one side,
>>
>>AIUI this is all automatically handled by std.thread.
>
>Yup.  Right now the thread module maintains a list of running threads.
>
>>> - One should be able to choose upon thread creation if the thread should be a subject to GC or a "non-collected" thread if there is a need for a thread guaranteed to never freeze because of GC.
>><snip>
>>
>>Presumably all threads share one memory pool, so there's no such thing as a piece of memory belonging to one thread or another (apart from the thread's stack, of course).  Maybe someone can confirm this....
>
>Assuming the memory was all allocated via 'new' then this is true, though the garbage collector can really do whatever it wants so long as it can provide and collect memory as needed.  If you will never call 'new' or reference garbage collected memory from a thread, then you are free to create your own threading model.  Using the Thread class is necessary because the gc needs to be able to pause threads and reference call stacks during its collection cycle.

Dealing with low-level thread stuff (being aware of portability issues, etc.) is IMHO too high a price for just the small option of creating a thread guaranteed to be of no interest to the GC.


>FWIW, I'm working on a slightly modified thread implementation as part of Ares. If you have any suggestions, please post them in the appropriate DSource forum.

hmm, definitely interesting...

>
>Sean
>
>


December 07, 2004
In article <cp4qg2$1qd2$1@digitaldaemon.com>, Sean Kelly says...
>
>In article <cp417h$g6i$1@digitaldaemon.com>, Stewart Gordon says...
>>
>>gbatyan@yahoo.com wrote:
>><snip>
>>> Somewhere on the site there is a sentence like "garbage collector will only search for roots on stacks of threads _created with D threading interface_". Neither I have seen any info here on the site about the _D threading interface_ nor a word about threads in general.
>>
>>AIUI "D threading interface" simply means the std.thread.Thread class.
>
>Yes.
>
>>> - is there some generic possibility to explicitly inform the D
>>> runtime about thread creation / destruction from one side,
>>
>>AIUI this is all automatically handled by std.thread.
>
>Yup.  Right now the thread module maintains a list of running threads.
>
>>> - One should be able to choose upon thread creation if the thread should be a subject to GC or a "non-collected" thread if there is a need for a thread guaranteed to never freeze because of GC.
>><snip>
>>
>>Presumably all threads share one memory pool, so there's no such thing as a piece of memory belonging to one thread or another (apart from the thread's stack, of course).  Maybe someone can confirm this....
>
>Assuming the memory was all allocated via 'new' then this is true, though the garbage collector can really do whatever it wants so long as it can provide and collect memory as needed.  If you will never call 'new' or reference garbage collected memory from a thread, then you are free to create your own threading model.  Using the Thread class is necessary because the gc needs to be able to pause threads and reference call stacks during its collection cycle.

What if I want to use some C SDK that relies extensively on, say, pthreads from my D program? If the SDK creates a thread, will it be 'caught' by D?

How can this be done? Would it be enough to guarantee that SDK threads never access GC-tracked data?

any suggestions?




December 07, 2004
In article <cp4ugj$21vk$1@digitaldaemon.com>, gbatyan@yahoo.com says...
>
>In article <cp4qg2$1qd2$1@digitaldaemon.com>, Sean Kelly says...
>>
>>Assuming the memory was all allocated via 'new' then this is true, though the garbage collector can really do whatever it wants so long as it can provide and collect memory as needed.  If you will never call 'new' or reference garbage collected memory from a thread, then you are free to create your own threading model.  Using the Thread class is necessary because the gc needs to be able to pause threads and reference call stacks during its collection cycle.
>
>What if I want to use some C SDK that relies extensively on, say, pthreads from my D program? If the SDK creates a thread, will it be 'caught' by D?

No.  D does not attempt to query the OS or anything like that when it looks for threads.  It currently relies on an internal list that is maintained by the Thread class.

>How can this be done? Would it be enough to guarantee that SDK threads never access GC-tracked data?

Yes, that would be sufficient.  The more complicated option would be to maintain an array of pointers that the SDK thread is referencing.  Pin them and add them to this array before passing them to the SDK, and unpin/remove them when you're sure the SDK is done with them--so long as the array lives in GC memory then you're set.  The current GC doesn't support pinning, but it's not a compacting GC so that doesn't matter.
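
As a rough sketch of that pointer array, written in C with hypothetical names (`pin_add`, `pin_remove`, `pin_is_rooted`, `sdk_sum`, `pin_demo`): a static table stands in for the array that would live in GC memory, `malloc` stands in for a GC allocation, and `pin_is_rooted` shows the lookup an imagined collector would do. None of this is an existing D or Phobos API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_PINS 32

/* Pointers currently lent to foreign (SDK) threads. A collector that scans
   this table treats each entry as a root, so the memory is not freed. */
static void *pins[MAX_PINS];
static size_t pin_len;
static pthread_mutex_t pin_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record p before handing it to the SDK; returns 0 if the table is full. */
static int pin_add(void *p) {
    int ok = 0;
    pthread_mutex_lock(&pin_lock);
    if (pin_len < MAX_PINS) {
        pins[pin_len++] = p;
        ok = 1;
    }
    pthread_mutex_unlock(&pin_lock);
    return ok;
}

/* Forget p once the SDK is done with it; returns 1 if it was listed. */
static int pin_remove(void *p) {
    int found = 0;
    pthread_mutex_lock(&pin_lock);
    for (size_t i = 0; i < pin_len; i++) {
        if (pins[i] == p) {
            pins[i] = pins[--pin_len];   /* swap-remove, order is irrelevant */
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&pin_lock);
    return found;
}

/* The question a collector would ask while marking. */
static int pin_is_rooted(const void *p) {
    int rooted = 0;
    pthread_mutex_lock(&pin_lock);
    for (size_t i = 0; i < pin_len; i++)
        if (pins[i] == p)
            rooted = 1;
    pthread_mutex_unlock(&pin_lock);
    return rooted;
}

/* Stand-in for an SDK routine that uses the buffer from its own thread. */
static int sdk_sum(const int *data, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* Pin, lend, unpin; returns 1 if the buffer was rooted only while lent. */
static int pin_demo(void) {
    int *buf = malloc(4 * sizeof *buf);
    assert(buf != NULL);
    for (int i = 0; i < 4; i++)
        buf[i] = i + 1;
    if (!pin_add(buf)) {
        free(buf);
        return 0;
    }
    int rooted_during = pin_is_rooted(buf);
    int sum = sdk_sum(buf, 4);
    int removed = pin_remove(buf);
    int rooted_after = pin_is_rooted(buf);
    free(buf);
    return rooted_during && removed && !rooted_after && sum == 10;
}
```

The table only keeps the memory alive; because the collector described here does not move objects, the addresses handed to the SDK stay valid without any extra pinning step.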

This is actually something that's probably worth building into the GC itself. Allow regions to be marked as external, which would effectively pin them and set them as uncollectable.  To get really fancy you could write a free() function that SDKs could use which would clean up this memory correctly.  In the presence of multiple GCs, however (as in the case of D DLLs), things will get a bit more complicated.


Sean