January 12, 2014

On 11.01.2014 22:20, Benjamin Thaut wrote:
> Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:
>> On 1/11/14 3:52 AM, Benjamin Thaut wrote:
>>> Am 10.01.2014 20:40, schrieb Andrei Alexandrescu:
>>>> On 1/10/14 11:34 AM, Benjamin Thaut wrote:
>>>>> Am 10.01.2014 09:03, schrieb Andrei Alexandrescu:
>>>>>>
>>>>>> Destroy.
>>>>>>
>>>>>> Andrei
>>>>>
>>>>> Just one question. Did you read "the garbage collection handbook"?
>>>>
>>>> Yah, the new edition. Unfortunately I didn't find a lot of new stuff or
>>>> things that would help in a practical implementation.
>>>>
>>>> Andrei
>>>
>>> Very good,
>>>
>>> In my opinion it should be required for everyone who wants to
>>> participate in D's GC to read that book. So everyone has at least a
>>> basic understanding of the problem at Hand.
>>>
>>> But if you read the book I don't understand, why you simply declare the
>>> hardest problem, percise pointer discovery, as done. To be able to
>>> actually implement and test a GC that is not a reimplementation of what
>>> we already have one needs percise pointer discovery of _all_ pointers,
>>> write barriers and GC halting points. So please enlighten me why you
>>> simply decalre this done?
>>
>> I'm not considering done as much as separable as a concern. All I'm
>> saying is I hope to be able to separate the low-level part of
>> discovering roots from the high-level part of marking used memory. BTW
>> the way I see this done is "mostly precise", i.e. there will be at least
>> for a while some words that will be handled conservatively (stack,
>> registers, certain union members).
>>
>> If there's anything in the GC book that suggest that would be
>> impossible, please do let me know!
>>
>>
>> Andrei
>>
>
> Well as far as my understanding of GCs goes, you have two options:
>
> 1) Impercise pointer discovery => limiting yourself to a mark & sweep
> 2) Percise pointer disccovery => option to use a semi space GC in
> combination with a mark & sweep and generations, which allows for
> superior handling of short lived allocations (which is the biggest
> problem of D's current GC).
>
> Also without percise pointer discovery you will never be able to move
> objects from one heap into another. This would be especially a problem
> for the immutable heap, because you might want to allocate all strings
> on the thread local heap until you discover that you actually need them
> to be shared between threads. You might also need to move objects into
> antoher heap whenever a casts happens or global variable is assigned.
>
> Kind Regards
> Benjamin Thaut

I think a moving collector is currently not feasible without restricting the language a lot, probably similar to safe D and more. I'm not sure we want that in general.

We should aim at something in between mark & sweep and compacting generational collection, e.g. a non-moving collector that keeps track of the youngest generation (I just made that up, not sure if it is realistic). Adding concurrency would also be nice...
January 12, 2014

On 11.01.2014 21:30, Jacob Carlborg wrote:
> On 2014-01-11 10:30, Rainer Schuetze wrote:
>
>> I was thinking about using the proposed module info extension to
>> generate data similar to RTInfo by interpreting global/tls data of a
>> module as a struct with type info. I don't know if this is feasable.
>
> You mean this: https://github.com/D-Programming-Language/dmd/pull/2271 ?
> I really hope this can be merged sometime.
>

Yes. I'm not sure if there is enough compiler support yet to figure out the global and TLS data layout of a module, especially for a static library where the module is split into a lot of object files.
January 12, 2014
Am 12.01.2014 11:27, schrieb Rainer Schuetze:
>
> I think a moving collector is currently not feasible without restricting
> the language a lot, probably similar to safe D and more. I'm not sure we
> want that in general.
>

Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions.

Also I don't think that we can create a GC which performs as good as the one of Java or C# if we are not willing to make the neccessary changes for a moving gc.


January 12, 2014
On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
> Also I don't think that we can create a GC which performs as good as the one of Java or C# if we are not willing to make the neccessary changes for a moving gc.

We don't necessarily need that, though. In D we have a plethora of non-GC options, so it might be a better idea to tailor our GC to typical D programs rather than trying to reproduce the Java and C# collectors. For example, from a generational GC perspective, I don't think the young generation is significant in idiomatic D code, unlike Java code which relies on it heavily.
January 12, 2014
Am 12.01.2014 11:54, schrieb Jakob Ovrum:
> On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
>> Also I don't think that we can create a GC which performs as good as
>> the one of Java or C# if we are not willing to make the neccessary
>> changes for a moving gc.
>
> We don't necessarily need that, though. In D we have a plethora of
> non-GC options, so it might be a better idea to tailor our GC to typical
> D programs rather than trying to reproduce the Java and C# collectors.
> For example, from a generational GC perspective, I don't think the young
> generation is significant in idiomatic D code, unlike Java code which
> relies on it heavily.

You should really try to write non-GC D code some time. You would be astonished how many hidden allocations there are.
January 12, 2014
12-Jan-2014 14:27, Rainer Schuetze пишет:
> On 11.01.2014 22:20, Benjamin Thaut wrote:
>> Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:
>>
>> Well as far as my understanding of GCs goes, you have two options:
>>
>> 1) Impercise pointer discovery => limiting yourself to a mark & sweep
>> 2) Percise pointer disccovery => option to use a semi space GC in
>> combination with a mark & sweep and generations, which allows for
>> superior handling of short lived allocations (which is the biggest
>> problem of D's current GC).
>>
>> Also without percise pointer discovery you will never be able to move
>> objects from one heap into another. This would be especially a problem
>> for the immutable heap, because you might want to allocate all strings
>> on the thread local heap until you discover that you actually need them
>> to be shared between threads. You might also need to move objects into
>> antoher heap whenever a casts happens or global variable is assigned.
>>
>> Kind Regards
>> Benjamin Thaut
>
> I think a moving collector is currently not feasible without restricting
> the language a lot, probably similar to safe D and more. I'm not sure we
> want that in general.

I might be ignorant but why can't we make "mostly-moving" collector?

For that we discern precise pointers (with typeinfo) and conservative (such as potential pointers in registers/stack). Then any block pointed to by a least one conservative pointer is pinned, others though can be moved freely since all of the pointers are known.


-- 
Dmitry Olshansky
January 12, 2014
Am 12.01.2014 12:03, schrieb Dmitry Olshansky:
> 12-Jan-2014 14:27, Rainer Schuetze пишет:
>> On 11.01.2014 22:20, Benjamin Thaut wrote:
>>> Am 11.01.2014 21:44, schrieb Andrei Alexandrescu:
>>>
>>> Well as far as my understanding of GCs goes, you have two options:
>>>
>>> 1) Impercise pointer discovery => limiting yourself to a mark & sweep
>>> 2) Percise pointer disccovery => option to use a semi space GC in
>>> combination with a mark & sweep and generations, which allows for
>>> superior handling of short lived allocations (which is the biggest
>>> problem of D's current GC).
>>>
>>> Also without percise pointer discovery you will never be able to move
>>> objects from one heap into another. This would be especially a problem
>>> for the immutable heap, because you might want to allocate all strings
>>> on the thread local heap until you discover that you actually need them
>>> to be shared between threads. You might also need to move objects into
>>> antoher heap whenever a casts happens or global variable is assigned.
>>>
>>> Kind Regards
>>> Benjamin Thaut
>>
>> I think a moving collector is currently not feasible without restricting
>> the language a lot, probably similar to safe D and more. I'm not sure we
>> want that in general.
>
> I might be ignorant but why can't we make "mostly-moving" collector?
>
> For that we discern precise pointers (with typeinfo) and conservative
> (such as potential pointers in registers/stack). Then any block pointed
> to by a least one conservative pointer is pinned, others though can be
> moved freely since all of the pointers are known.
>
>

A semi-space garbage collector is best fitted for small short lived allocations. Which are mostly those that are referenced by the stack because they happend as functions are called. And for a semi-space garbage collector there is not mostly moving, it has to copy _all_ objects from one heap onto another so that those left on the original heap are known to be all garbage, can be destroyed and then the heap can be reused for the next collection. Mostly-moving doens't work here. Either you know all pointers percisely, or you don't do it. What you mean is pinning, you can pin objects for which you know they might be referenced by a impercise pointer and thus prevent them from beeing moved around. But this type of moving only works with mark & sweep compacting collectors and thats not really better than what we already have.

If you are really interrested in the topic I higly recommend reading "The garbage collector handbook".

Kind Regards
Benjamin Thaut
January 12, 2014
On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:
> You should really try to write non-GC D code some time. You would be astonished how many hidden allocations there are.

Is that down to the language, or parts of druntime and phobos that assume GC when they shouldn't?
January 12, 2014
Am 12.01.2014 12:12, schrieb Joseph Rushton Wakeling:
> On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:
>> You should really try to write non-GC D code some time. You would be
>> astonished how many hidden allocations there are.
>
> Is that down to the language, or parts of druntime and phobos that
> assume GC when they shouldn't?

Both. Although it kept getting better lately.
January 12, 2014
On Sunday, 12 January 2014 at 11:02:28 UTC, Benjamin Thaut wrote:
> You should really try to write non-GC D code some time.

I do.

> You would be astonished how many hidden allocations there are.

I was, years ago.