Thread overview
Concurrent No-Pause GC Option (idea)
jordan4ibanez (Aug 21, 2022)
frame (Aug 21, 2022)
Sergey (Aug 21, 2022)
frame (Aug 22, 2022)
Krzysztof Jajeśnica (Aug 25, 2022)
frame (Aug 26, 2022)
August 21, 2022

I had an idea on the GC for CPUs that have more than one core. So basically all of them.

So I learned on the Discord that the current GC runs in parallel if a multi-core setup is detected, but it still pauses everything to make sure nothing is changed (thanks to adr for the info).

Now I had the idea that perhaps this could be fixed. It is simple as an idea, but I am sure I am about to find out it is horrifically complex in implementation.

So the current assumption is that the memory addresses of objects do not get moved around in the heap, and the heap grows and shrinks as needed. This opens an opportunity to create a concurrent no-pause garbage collector. My literal idea was this:

So like, when an object in, let's say, the main thread has its destructor called, or the thread notices that it will never talk to this piece of data again, it basically tells the external GC thread "hey, this piece of data and all of its children are done" and just keeps on going with its execution routine.

So the main thread creates an object like so:

Object data = new Object();

This is great: that means data has its own little house on the heap, and it will never move from this spot until it "has to move out".

So we make it move out:

Object data = new Object();
data = new Object();

Well, there we go: data now points at a new object. But what happened to the old one? With the concurrent GC, the main thread has silently told the GC to collect the old object and everything it owns.

The same thing happens if it goes out of scope:

{
    Object data = new Object();
    // data destroy is sent to concurrent gc to free space
}

The same thing will happen if the piece of data is in an array, or if it's held in another object (in theory). But this was just an extremely simple example. Ideally, the pointer table in the thread will never be able to see the data location of the destroyed object, even if it still exists in memory.

The concurrent GC would simply always be listening, and the only micro-pauses I could see would come from having to add those memory addresses to the GC's stack. Perhaps that could be avoided by having a concurrent linked list somehow.

This is coming from someone new to D; I am still learning much about it and trying to help build the ecosystem and language as much as I can. I would love to know your thoughts.

August 21, 2022

On Sunday, 21 August 2022 at 00:11:55 UTC, jordan4ibanez wrote:

> This is coming from someone new to D; I am still learning much about it and trying to help build the ecosystem and language as much as I can. I would love to know your thoughts.

You are probably thinking of a reference-counted (RC) GC, where memory can be collected as soon as the last reference is gone.

The drawbacks of this approach are that you need more memory for each allocated variable, since you have to track the references, and that increasing or decreasing those reference counts has a runtime impact wherever a reference changes hands: e.g. a pointer to data is passed to another object while the original pointer goes out of scope, or a common slice ([]), as seen with strings, is handed from function to function, maybe even recursively.

The current GC doesn't do this, keeping that impact low. It only needs to stop the threads when it collects in a cycle, which doesn't always happen, and you can even help organize the GC's cleanup work by manually calling GC.collect() at points that suit your program.

If you are interested in this stuff, I suggest you dig through the General forum section, where the pros & cons of other GC implementations and their side effects are heavily discussed :D

August 21, 2022

On Sunday, 21 August 2022 at 00:11:55 UTC, jordan4ibanez wrote:

> I had an idea on the GC for CPUs that have more than one core. So basically all of them.
>
> So I learned on the Discord that the current GC runs in parallel if a multi-core setup is detected, but it still pauses everything to make sure nothing is changed, thanks to adr.

Isn't it the same as what was already proposed earlier? https://dlang.org/blog/2019/07/22/symmetry-autumn-of-code-experience-report-porting-a-fork-based-gc/

August 22, 2022

On Sunday, 21 August 2022 at 21:42:20 UTC, Sergey wrote:

> On Sunday, 21 August 2022 at 00:11:55 UTC, jordan4ibanez wrote:
>
>> I had an idea on the GC for CPUs that have more than one core. So basically all of them.
>>
>> So I learned on the Discord that the current GC runs in parallel if a multi-core setup is detected, but it still pauses everything to make sure nothing is changed, thanks to adr.
>
> Isn't it the same as what was already proposed earlier? https://dlang.org/blog/2019/07/22/symmetry-autumn-of-code-experience-report-porting-a-fork-based-gc/

The mentioned fork variant would be nearly the same as the current GC; it tracks nothing. It suggests scanning in the forked process and then telling the mutator threads in the parent which blocks are free again.

I don't know why a fork is necessary; couldn't that be done in a normal background thread as well? And it doesn't eliminate the problem that data can be used in another thread after the scanner has marked it as free.

Actually, there is already a fork-based implementation merged in the runtime as an alternative to parallel marking, but it still stops other threads, of course.

August 25, 2022

On Monday, 22 August 2022 at 18:00:42 UTC, frame wrote:

> I don't know why a fork is necessary. That can be done in a normal background thread also? And it doesn't eliminate the problem that data can be used in another thread after the scanner marked it as free.

AFAIK forking does indeed eliminate the problem of another thread using the data after scanner marks it as free.

A forked process does not share its address space with the parent process (it gets a copy of the parent's address space), so any memory accesses done in the application process after the fork() will not influence the scanning process. If the scanner process marks a block as free, that means it was definitely unreachable in the application process prior to the fork(), so the application process has no way to access the block (apart from cases where you hide the pointer from the GC on purpose, but that's a problem for almost all GC implementations, not just the fork() one).

August 26, 2022

On Thursday, 25 August 2022 at 19:06:15 UTC, Krzysztof Jajeśnica wrote:

> On Monday, 22 August 2022 at 18:00:42 UTC, frame wrote:
>
> AFAIK forking does indeed eliminate the problem of another thread using the data after the scanner marks it as free.
>
> A forked process does not share its address space with the parent process (it gets a copy of the parent's address space), so any memory accesses done in the application process after the fork() will not influence the scanning process. If the scanner process marks a block as free, that means it was definitely unreachable in the application process prior to the fork(), so the application process has no way to access the block (apart from cases where you hide the pointer from the GC on purpose, but that's a problem for almost all GC implementations, not just the fork() one)

I think it does share the pages until a write operation is performed on the parent or child side; then the writer gets a fresh private copy of the touched page (copy-on-write). So the child needs to commit IDs(?), indexes(?), whatever, back to the parent, and the parent can map them to its memory blocks [or something like that].

That is, of course, way better than a background thread, as it creates an isolated snapshot.

But consider: the parent calls GC.free() manually while the child is running; the child decides the block can be freed, but the parent has meanwhile continued and reused that block (I don't know if the GC actually does this, but I think so). The parent must ignore the supplied ID in this case, or it would collect memory that is in use again. The GC needs to work differently for this.