proposal: GC.*partial*collect(Duration maxPauseTime, Duration maxCollectionTime)
May 19, 2020
Hi,

I'm new to D (D2). I have spent the past week reading other people's articles on D, its pros & cons, etc. One of the things people talk about a lot is the slow GC, especially in multi-threaded environments, real-time systems, and interactive game development.

OK, the GC has its problems, but most programs cannot do without it (a @nogc Phobos is not there yet). Actually, the programmer knows best *when & where* in the program it is safe to call GC.collect() manually to free resources. But currently GC.collect() will do a full collection:

https://dlang.org/library/core/memory/gc.collect.html

How long that will take is unpredictable, which is not acceptable in real-time systems. So the programmer will still hesitate to call GC.collect().
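
A minimal sketch of the manual approach as it works today, assuming a recent druntime: the programmer picks a quiet point and times the call. The helper name `timedCollect` and the allocation loop are made up for illustration; only `GC.collect()` and `StopWatch` are existing library calls.

```d
import core.memory : GC;
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/// Hypothetical helper: runs one full collection and returns how long the call took.
Duration timedCollect()
{
    auto sw = StopWatch(AutoStart.yes);
    GC.collect();          // a full collection; there is no way to bound its duration
    sw.stop();
    return sw.peek();      // wall-clock time of the call
}

void main()
{
    // Allocate some short-lived garbage so the collector has work to do.
    foreach (i; 0 .. 100_000)
    {
        auto tmp = new ubyte[](64);
        tmp[0] = cast(ubyte) i;
    }

    // A quiet point chosen by the programmer, e.g. between two requests.
    immutable elapsed = timedCollect();
    writefln("GC.collect() took %s", elapsed);
}
```

The measured time depends on how much live memory has to be scanned, so the same call can be fast in one run and slow in another.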

So we want a predictable GC. I just had a simple idea to improve the usability of the current GC implementation (probably with minimal code change). My idea is to add parameters to GC.collect():

GC.*partial*collect(Duration maxPauseTime, Duration maxCollectionTime)

and ask it to do a partial collection. The parameters are defined here :-)

https://dlang.org/library/core/memory/gc.profile_stats.html

Name               Type      Description
maxCollectionTime  Duration  largest time spent doing one GC cycle
maxPauseTime       Duration  largest time threads were paused during one GC cycle
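
For reference, these two fields can already be read back after the fact through `GC.profileStats()`. The sketch below only reports them; it does not limit anything. The helper name `statsAfter` and the loop sizes are arbitrary choices for illustration. The timing fields may stay at zero unless GC profiling is switched on (for example with the `--DRT-gcopt=profile:1` runtime option), so no particular output is assumed.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Hypothetical helper: creates garbage, forces `rounds` collections, returns the stats.
GC.ProfileStats statsAfter(int rounds)
{
    foreach (round; 0 .. rounds)
    {
        foreach (i; 0 .. 50_000)
        {
            auto tmp = new int[](16);
            tmp[0] = i;
        }
        GC.collect();
    }
    return GC.profileStats();
}

void main()
{
    const stats = statsAfter(3);
    writefln("collections:       %s", stats.numCollections);
    writefln("maxPauseTime:      %s", stats.maxPauseTime);
    writefln("maxCollectionTime: %s", stats.maxCollectionTime);
}
```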


With this, the GC becomes much more controllable in real-time systems.

Thoughts?

May 19, 2020
On 19/05/2020 5:42 PM, mw wrote:
> Hi,
> 
> I'm new to D (D2). I have spent the past week reading other people's articles on D, its pros & cons, etc. One of the things people talk about a lot is the slow GC, especially in multi-threaded environments, real-time systems, and interactive game development.
> 
> OK, the GC has its problems, but most programs cannot do without it (a @nogc Phobos is not there yet). Actually, the programmer knows best *when & where* in the program it is safe to call GC.collect() manually to free resources. But currently GC.collect() will do a full collection:
> 
> https://dlang.org/library/core/memory/gc.collect.html
> 
> How long that will take is unpredictable, which is not acceptable in real-time systems. So the programmer will still hesitate to call GC.collect().

There are two types of real time systems.

Hard: you will not be using the GC; like Weka, you will roll your own libraries.
Soft: see dplug; you disable the GC and collect when appropriate, while minimizing how much memory is allocated via the GC.

Games fall into the soft territory, although depending on the game, buffers and custom allocators can offset this significantly, to the point that you can leave the GC enabled full time and not have to worry about it.
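
A rough sketch of the soft real-time pattern, not taken from dplug or any real game: the buffer is allocated once, the per-frame work is marked `@nogc`, automatic collections are suppressed, and a collection is requested only at a point the program picks. The function name `renderFrame`, the frame counts, and the "every 100 frames" rule are all made up for illustration. As far as I know an explicit `GC.collect()` still runs while the GC is disabled, but treat that as an assumption to verify.

```d
import core.memory : GC;
import std.stdio : writeln;

/// Fills a caller-owned buffer; @nogc makes the compiler reject GC allocations here.
void renderFrame(float[] buffer, int frame) @nogc nothrow
{
    foreach (i, ref sample; buffer)
        sample = cast(float)(frame + i) * 0.5f;
}

void main()
{
    auto buffer = new float[](1024);   // allocated once, reused every frame

    GC.disable();                      // suppress automatic collections
    scope (exit) GC.enable();

    foreach (frame; 0 .. 300)
    {
        renderFrame(buffer, frame);

        // Hypothetical quiet point, e.g. a level change: collect once per 100 frames.
        if (frame % 100 == 99)
            GC.collect();
    }
    writeln("last sample: ", buffer[$ - 1]);
}
```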

> So we want a predictable GC. I just had a simple idea to improve the usability of the current GC implementation (probably with minimal code change). My idea is to add parameters to GC.collect():

Partial collection can be implemented using a strategy such as tri-color marking, spending at most a fixed amount of time on each iteration of scanning.
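
To make the idea concrete, here is a toy tri-color marker with a time budget per step. It is a sketch over a made-up object graph, not druntime code: `Node`, `Marker`, `shade` and `step` are hypothetical names. A real incremental collector would also have to notice pointers the program changes between steps, which this toy ignores.

```d
import core.time : Duration, MonoTime, msecs;
import std.stdio : writeln;

enum Color { white, grey, black }   // unvisited, queued, fully scanned

struct Node
{
    size_t[] children;              // indices of the nodes this one references
    Color color = Color.white;
}

struct Marker
{
    Node[] heap;
    size_t[] greyStack;

    /// Turns a white node grey and queues it for scanning.
    void shade(size_t i)
    {
        if (heap[i].color == Color.white)
        {
            heap[i].color = Color.grey;
            greyStack ~= i;
        }
    }

    /// Starts a mark phase from the given root indices.
    void start(const size_t[] roots)
    {
        foreach (r; roots)
            shade(r);
    }

    /// Scans grey nodes until none are left or the budget is used up.
    /// Returns true once marking is complete.
    bool step(Duration budget)
    {
        immutable deadline = MonoTime.currTime + budget;
        while (greyStack.length)
        {
            immutable i = greyStack[$ - 1];
            greyStack = greyStack[0 .. $ - 1];
            greyStack.assumeSafeAppend();   // allow appends to reuse the storage
            foreach (c; heap[i].children)
                shade(c);
            heap[i].color = Color.black;
            if (MonoTime.currTime >= deadline)
                break;                      // early return: resume on the next call
        }
        return greyStack.length == 0;
    }
}

void main()
{
    // Tiny graph: 0 -> 1 -> 2, while node 3 points at 0 but is itself unreachable.
    auto m = Marker([Node([1]), Node([2]), Node([]), Node([0])]);
    m.start([0]);
    while (!m.step(1.msecs)) {}     // each call stops soon after its budget expires
    foreach (i, n; m.heap)
        writeln(i, ": ", n.color);
}
```

The budget is checked after each scanned node, so every call makes progress even with a tiny budget. Nodes still white when marking completes (node 3 here) are the ones a sweep phase could free.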

Alternatively you can use fork'ing and snapshots (Windows) which make it an entirely asynchronous process.

We know what can be done, just nobody wants to do the work.
Our GC implementation isn't all that friendly even though it now supports fork'ing and can be precise.

The GC performance isn't all that bad in D, as long as you minimize garbage, hence not much effort goes towards it. Most people won't even notice that it runs.
May 19, 2020
On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
> We know what can be done, just nobody wants to do the work.
> Our GC implementation isn't all that friendly even though it now supports fork'ing and can be precise.

That’s exactly why I’m proposing to just add a timeout to the current GC algorithm: it would periodically check the clock and return early without finishing a full collection. I hope the required code change will be minimal. Then users will have a much more predictable (stop-the-world) GC.






May 23, 2020
On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
> The GC performance isn't all that bad in D, as long as you minimize garbage, hence not much effort goes towards it. Most people won't even notice that it runs.


I just replied in the other thread about choosing a new language in a corporate environment:

https://forum.dlang.org/post/kjjredahuogmatldwolp@forum.dlang.org

It applies here too: companies always want predictability & a warranty.

Personally I feel inclined to trust your informal words that "The GC performance isn't all that bad in D", but for commercial usage they always want a (written) warranty: what's the worst-case scenario that can happen?

Even a laughable warranty is still a warranty, e.g. "the GC will stop your program for at most 1 second". OK, that's fine: we know where the limit is, so at least we can design the software to work around it, e.g. design some user interaction steps or show entertaining animations (>= 1 second) while the GC runs in the background.


GC.*partial*collect(Duration maxPauseTime, Duration maxCollectionTime)
                             ^^^^^^^^^^^^           ^^^^^^^^^^^^^^^^^
These two parameters, if added, are the *written* guarantee that a company would want.

May 24, 2020
On Tuesday, 19 May 2020 at 15:25:56 UTC, mw wrote:
> On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
>> We know what can be done, just nobody wants to do the work.
>> Our GC implementation isn't all that friendly even though it now supports fork'ing and can be precise.
>
> That’s exactly why I’m proposing to just add a timeout to the current GC algorithm: it would periodically check the clock and return early without finishing a full collection. I hope the required code change will be minimal. Then users will have a much more predictable (stop-the-world) GC.

I think that the problem isn't that your idea is bad or nobody likes it. It's a good idea, and would be a nice improvement. The real problem is that there is nobody that would implement it.
May 24, 2020
On Sunday, 24 May 2020 at 08:27:39 UTC, Luis wrote:
> On Tuesday, 19 May 2020 at 15:25:56 UTC, mw wrote:
>> On Tuesday, 19 May 2020 at 05:59:17 UTC, rikki cattermole wrote:
>>> We know what can be done, just nobody wants to do the work.
>>> Our GC implementation isn't all that friendly even though it now supports fork'ing and can be precise.
>>
>> That’s exactly why I’m proposing to just add a timeout to the current GC algorithm: it would periodically check the clock and return early without finishing a full collection. I hope the required code change will be minimal. Then users will have a much more predictable (stop-the-world) GC.
>
> I think that the problem isn't that your idea is bad or nobody likes it. It's a good idea, and would be a nice improvement. The real problem is that there is nobody that would implement it.


Maybe we can start by drafting a DIP? Does anyone who is familiar with the process want to help (especially if you also like the idea and want the improvement)? @Adam D. Ruppe :-)


Also, I noticed there isn't much documentation about the GC's internal design and implementation:

https://forum.dlang.org/thread/yampusjziyptnbndymik@forum.dlang.org

Can someone who knows the internals start documenting the GC on the D wiki, so that others can pick up from there?

May 25, 2020
On 25/05/2020 5:41 AM, mw wrote:
> Maybe we can start by drafting a DIP? Does anyone who is familiar with the process want to help (especially if you also like the idea and want the improvement)? @Adam D. Ruppe :-)

You don't need a DIP.

Write the code, show it's good, get it merged.

May 25, 2020
On Monday, 25 May 2020 at 04:48:44 UTC, rikki cattermole wrote:
> You don't need a DIP.
>
> Write the code, show it's good, get it merged.


Ha! I may give it a try although I'm not an expert on this.

Can you give some extra info? e.g. source file location? maybe in the following thread?

(Any doc apart from the code itself?)

"""
Also, I noticed there isn't much documentation about the GC's internal design and implementation:

https://forum.dlang.org/thread/yampusjziyptnbndymik@forum.dlang.org

Can someone who knows the internals start documenting the GC on the D wiki, so that others can pick up from there?
"""

May 25, 2020
On 25/05/2020 5:55 PM, mw wrote:
> On Monday, 25 May 2020 at 04:48:44 UTC, rikki cattermole wrote:
>> You don't need a DIP.
>>
>> Write the code, show it's good, get it merged.
> 
> 
> Ha! I may give it a try although I'm not an expert on this.
> 
> Can you give some extra info? e.g. source file location? maybe in the following thread?

https://github.com/dlang/druntime/blob/master/src/gc/impl/conservative/gc.d

I tried once to add snapshot support so that collection on Windows could be asynchronous. Let's just say I never found the place where it does the pointer checking, and I didn't get much in the way of help myself.

Not that I made much fuss about it. Squeaky wheel gets the oil and all that.