A separate GC idea - multiple D GCs
Jan 21, 2022
Chris Katko
January 21, 2022

So I was going to ask this related question in my other thread but I now see it's hit over 30 replies (!) and the discussion is interesting but a separate topic.

So a related question: Has anyone ever thought about "thread-local garbage collection" or some sort of "multiple pool [same process/thread] garbage collection"? The idea here is, each thread [or thread collection] would have its own garbage collector, and, be the only thread that pauses during a collection event.

Basically, you cordon off your memory use like you would with say a memory pool, so that when a GC freezes it only freezes that section. Smaller, independent sections means less freeze time and the rest of the threads keep running on multiple core CPUs.

There are obviously some issues:

1 - Is the D GC designed to allow something like this? Can it be prevented from walking into areas it shouldn't? I know you can malloc memory outside the GC heap, which it "shouldn't" touch as long as you never make pointers to it?

2 - It will definitely be more complex, and D can be too complex and finicky already. (SEGFAULT TIME.) So patterns to prevent that will be needed. "Am I reading in my thread, or thread 37?"

3 - Any boundary where threads "touch" will be either a nightmare or at least need some sort of mechanism to cross and synchronize across said boundary. And a frozen thread, synchronized, will freeze any reads by non-frozen threads. [Though one might be able to schedule garbage collections only when massive reads/writes aren't currently needed.]
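
On point 1, a minimal sketch of how the current GC treats memory it does not own. Memory obtained from C's `malloc` is neither collected nor scanned unless it is registered with `GC.addRange`; the names `Node` and `demo` are made up for illustration.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

struct Node { int value; Node* next; }

int demo()
{
    // C-heap block: the GC neither owns nor scans it by default.
    auto raw = cast(Node*) malloc(Node.sizeof);
    assert(raw !is null, "malloc failed");
    scope (exit) free(raw);
    *raw = Node(1, null);

    // Opt this block in to scanning, because it is about to hold a
    // pointer into the GC heap. Without the registration the GC could
    // not see raw.next and would be free to collect the node.
    GC.addRange(raw, Node.sizeof);
    scope (exit) GC.removeRange(raw);

    raw.next = new Node(2, null); // GC-allocated, referenced from malloc memory

    GC.collect();
    return raw.value + raw.next.value;
}

void main()
{
    assert(demo() == 3);
}
```

This only shows that scanning can be fenced off per block; it does nothing to stop other threads from pausing during a collection.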

I think the simple, easiest way to try this would be to just spawn multiple processes [each having their own D GC collector] and somehow share memory between them. But I have no idea if it's easy to work around the inherent "multiple process = context switch" overhead making it actually slower.

As for the "why?". I'm not sure if there are huge benefits, mild benefits, benefits in only niche scenarios (like games, which is the kind I play with). But I'm just curious about the prospect.

An "iterative GC" would require... an entire new GC, whereas splitting off multiple instances of the existing D garbage collector into their own fields could maybe work. [And once again, without benchmarks, a single D GC may be more than fast enough for my needs.] But I'm curious and like to know my options, suggestions, and people's general thoughts on it.

Thanks, have a great day!

January 22, 2022
On 22/01/2022 2:56 AM, Chris Katko wrote:
> So a related question: Has anyone ever thought about "thread-local garbage collection" or some sort of "multiple pool [same process/thread] garbage collection"? The idea here is, each thread [or thread collection] would have its own garbage collector, and, be the only thread that pauses during a collection event.

Indeed we have thought about this.

What I want is a fiber aware GC and that implicitly means thread-local too.

But I'm not sure it would be any use with the existing GC to have the hooks, it would need to be properly designed to take advantage of it.

> Because while an "iterative GC" requires... an entire new GC.

The existing GC has absolutely horrible code.

I tried a while back to get it to support snapshotting (Windows-specific concurrency for GCs) and I couldn't find where it even did the scanning for pointers... yeah.

https://github.com/dlang/druntime/blob/master/src/core/internal/gc/impl/conservative/gc.d

After a quick look it does look like it has been improved somewhat with more comments since then.



What we need is a full reimplementation of the GC that is easy to dig into. After that, forking, precise, and generational collection should all be pretty straightforward to implement.
January 21, 2022

On Friday, 21 January 2022 at 13:56:09 UTC, Chris Katko wrote:

> So a related question: Has anyone ever thought about "thread-local garbage collection"

The big problem is that data isn't thread local (including TLS); nothing stops one thread from pointing into another thread. So any GC that depends on a barrier there is liable to look buggy.
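
A minimal sketch of that point, assuming nothing beyond druntime's `core.thread`: a second thread writes into the first thread's TLS variable through a leaked pointer, and nothing in the language objects. The names `tlsCounter`, `leaked`, and `demo` are invented for the example.

```d
import core.thread : Thread;

int tlsCounter;          // module-level variables are thread-local by default
__gshared int* leaked;   // one process-wide pointer, visible to every thread

int demo()
{
    tlsCounter = 1;
    leaked = &tlsCounter; // publish the address of this thread's TLS variable

    auto t = new Thread({
        *leaked = 42;     // writes into the *other* thread's TLS
        tlsCounter = 7;   // touches only the new thread's own copy
    });
    t.start();
    t.join();

    return tlsCounter;    // the calling thread's copy was changed from outside
}

void main()
{
    assert(demo() == 42);
}
```

A per-thread collector that scanned only its own thread's roots would have no way of knowing about `leaked`.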

So your point #3 is easier said than done in the general case. You can do it for special cases with fork like you said, or with unregistering threads like Guillaume does.

> I think the simple, easiest way to try this would be to just spawn multiple processes [each having their own D GC collector] and somehow share memory between them. But I have no idea if it's easy to work around the inherent "multiple process = context switch" overhead making it actually slower.

I actually do exactly this with my web server, but it is easy there since web requests are supposed to be independent anyway.

re context switches btw, processes and threads both have them. they aren't that different on the low level, it is just a matter of how much of the memory space is shared. default shared = thread, default unshared = process.

January 21, 2022

On Friday, 21 January 2022 at 14:37:04 UTC, Adam Ruppe wrote:

> On Friday, 21 January 2022 at 13:56:09 UTC, Chris Katko wrote:
> > So a related question: Has anyone ever thought about "thread-local garbage collection"
>
> The big problem is that data isn't thread local (including TLS); nothing stops one thread from pointing into another thread. So any GC that depends on a barrier there is liable to look buggy.

Is this intended behaviour? Or was it accidental? Is it at least guaranteed that only one thread can hold a mutable reference?

January 21, 2022
On 1/21/22 05:56, Chris Katko wrote:

> some sort of "multiple pool [same process/thread]
> garbage collection"?

I ended up with that design by accident: My D library had to spawn multiple D processes instead of multiple threads. This was a workaround to my inability to execute initialization functions of D shared libraries that were dynamically loaded by foreign runtimes (Python and C++ were in the complex picture).

In the end, I realized that my library was using multiple D runtimes on those multiple D processes. :)

I've never gotten to profiling whether my necessary inter-process communication was hurting performance. Even if it did, the now-unstopped worlds might be better in the end. I even thought about making a DConf presentation about the findings but it never happened. :)

Ali

January 21, 2022
On Fri, Jan 21, 2022 at 03:16:32PM +0000, Tejas via Digitalmars-d wrote:
> On Friday, 21 January 2022 at 14:37:04 UTC, Adam Ruppe wrote:
> > On Friday, 21 January 2022 at 13:56:09 UTC, Chris Katko wrote:
> > > So a related question: Has anyone ever thought about "thread-local garbage collection"
> > 
> > The big problem is that data isn't thread local (including TLS); nothing stops one thread from pointing into another thread. So any GC that depends on a barrier there is liable to look buggy.
> 
> 
> Is this intended behaviour? Or was it accidental? Is it at least guaranteed that only one thread can hold a mutable reference?

Immutable data is shared across threads by default.  So any immutable data would need to be collected by a global GC.  However, you don't always know if some data is going to end up being immutable when you allocate, e.g.:

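	int[] createSomeData() pure {
		return [ 1, 2, 3 ];	// mutable when we allocate
	}

	// Placeholder: stands in for any code that hands the data to another thread.
	void shareMyData(immutable(int)[] data) {}

	void main() {
		// Implicitly convert from mutable to immutable because
		// createSomeData() is pure and the data is unique.
		// Declared as immutable(int)[] rather than immutable int[]
		// so that the slice itself can still be reassigned below.
		immutable(int)[] constants = createSomeData();

		shareMyData(constants);

		// Now who should collect, per-thread GC or global GC?
		constants = null;
	}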

So assume we allocate the initial array on the per-thread heap. Then it gets implicitly converted to immutable because it was unique, and shared with another thread. Now who should collect, the allocating thread's GC or the GC of the thread the data was shared with?  Neither is correct: you need a global GC.  At the very least, you need to synchronize across individual threads' GCs, which at least partially negates the benefit of a per-thread GC.

This is just one example of several that show why it's a hard problem in the current language.


T

-- 
One Word to write them all, One Access to find them, One Excel to count them all, And thus to Windows bind them. -- Mike Champion
January 21, 2022
On Friday, 21 January 2022 at 15:16:32 UTC, Tejas wrote:
> Is this intended behaviour?

Yes.

> Is it at least guaranteed that only one thread can hold a mutable reference?

No.
January 21, 2022
On 1/21/2022 5:56 AM, Chris Katko wrote:
> So a related question: Has anyone ever thought about "thread-local garbage collection" or some sort of "multiple pool [same process/thread] garbage collection"? The idea here is, each thread [or thread collection] would have its own garbage collector, and, be the only thread that pauses during a collection event.

Yes. The trouble is what happens when a pointer in one pool is cast to a pointer in another pool.
January 21, 2022
On Fri, Jan 21, 2022 at 02:43:31PM -0800, Walter Bright via Digitalmars-d wrote:
> On 1/21/2022 5:56 AM, Chris Katko wrote:
> > So a related question: Has anyone ever thought about "thread-local garbage collection" or some sort of "multiple pool [same process/thread] garbage collection"? The idea here is, each thread [or thread collection] would have its own garbage collector, and, be the only thread that pauses during a collection event.
> 
> Yes. The trouble is what happens when a pointer in one pool is cast to a pointer in another pool.

It almost makes one want to tag pointer types at compile-time as thread-local or global. Casting from thread-local to global would emit a call to some druntime hook to note the transfer (which, presumably, should only occur rarely). Stuff with only global references will be collected by the global GC, which can be scheduled to run less frequently (or disabled if you never do such casts).

But yeah, this is a slippery slope on the slide down towards Rust... :-P It seems we just can't get any farther from where we are without starting to need managed pointer types.
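
A rough sketch of what such a hook could look like if written by hand today. Everything here is hypothetical: `noteTransfer`, `publish`, and `transferCount` are invented names, and as far as I know druntime has no hook of this kind.

```d
import core.atomic : atomicLoad, atomicOp;

// Count of pointers that have crossed from thread-local to global.
shared size_t transferCount;

// Hypothetical stand-in for the druntime hook. A real one would record
// the block so that a global collector takes over responsibility for it.
void noteTransfer(const(void)* p)
{
    atomicOp!"+="(transferCount, 1);
}

// The only sanctioned way to go from thread-local to shared in this sketch.
shared(T)* publish(T)(T* p)
{
    noteTransfer(p);
    return cast(shared(T)*) p;
}

size_t demo()
{
    auto local = new int;                  // starts out thread-local
    *local = 5;
    shared(int)* global = publish(local);  // transfer is noted here
    return atomicLoad(transferCount);
}

void main()
{
    assert(demo() == 1);
}
```

The weakness is that it is opt-in: a plain `cast(shared)`, a `__gshared` variable, or an implicit conversion to immutable would all bypass `publish`, which is why compiler-level tagging of pointer types comes up.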


T

-- 
Who told you to swim in Crocodile Lake without life insurance??
January 22, 2022

On Friday, 21 January 2022 at 22:56:46 UTC, H. S. Teoh wrote:

> On Fri, Jan 21, 2022 at 02:43:31PM -0800, Walter Bright via Digitalmars-d wrote:
> > [...]
>
> It almost makes one want to tag pointer types at compile-time as thread-local or global. Casting from thread-local to global would emit a call to some druntime hook to note the transfer (which, presumably, should only occur rarely). Stuff with only global references will be collected by the global GC, which can be scheduled to run less frequently (or disabled if you never do such casts).
>
> But yeah, this is a slippery slope on the slide down towards Rust... :-P It seems we just can't get any farther from where we are without starting to need managed pointer types.
>
> T

Isn't going global from thread local done via cast(shared) though? Is that not enough to notify the compiler?
