January 12, 2014
On 1/12/14 2:40 AM, Benjamin Thaut wrote:
> On 12.01.2014 11:27, Rainer Schuetze wrote:
>>
>> I think a moving collector is currently not feasible without restricting
>> the language a lot, probably similar to safe D and more. I'm not sure we
>> want that in general.
>>
>
> Could you give an example which part of the language would not be doable
> with a moving collector? The only thing that comes to my mind is unions
> and that problem can be solved by allowing the user to specify manual
> scanning functions for structs or classes containing unions.
>
> Also I don't think that we can create a GC which performs as well as the
> one of Java or C# if we are not willing to make the necessary changes
> for a moving GC.

Yah, moving would be real nice. I hope to at least clarify the issues related to moving with my work.

Andrei

January 12, 2014
On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
> On 12.01.2014 11:27, Rainer Schuetze wrote:
>>
>> I think a moving collector is currently not feasible without restricting
>> the language a lot, probably similar to safe D and more. I'm not sure we
>> want that in general.

I'd rather have more restrictions and a working precise GC, and let those who wish to do without the
safety ask for it explicitly.

> Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions.

How would the moving GC deal with pointer arithmetic?

> Also I don't think that we can create a GC which performs as well as the one of Java or C# if we are not willing to make the necessary changes for a moving GC.

I agree. Even ignoring the comparison with Java/C#, I think if D is to be a 'fully garbage collected language' then it will have to support a state of the art GC.

January 12, 2014
On 12.01.2014 18:12, Brian Rogoff wrote:
>
> How would the moving GC deal with pointer arithmetic?

I don't see any problem with pointer arithmetic. Either the pointer is pointing to a GC-managed memory block and will be patched accordingly, or it is not pointing to a GC-managed memory block, and nothing will happen.

Obviously if you do unsafe things, you have to know what you are doing, and you might have to give additional information to the GC so it can properly support the unsafe stuff.
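
To make the "patched accordingly" case concrete, here is a minimal sketch of how a collector could update a pointer after moving blocks, including a pointer produced by pointer arithmetic that lands in the middle of a block. It is a toy model, not how druntime works: addresses are plain integers, and `Block`, `find_block`, `patch_pointer` and the `relocation` map are names invented for this illustration.

```python
from bisect import bisect_right

class Block:
    """A GC-managed allocation: start address and size in bytes (toy model)."""
    def __init__(self, start, size):
        self.start = start
        self.size = size

def find_block(blocks, addr):
    """Return the block containing addr, or None if addr is outside the GC heap.

    `blocks` must be sorted by start address and must not overlap.
    """
    starts = [b.start for b in blocks]
    i = bisect_right(starts, addr) - 1
    if i >= 0 and addr < blocks[i].start + blocks[i].size:
        return blocks[i]
    return None

def patch_pointer(blocks, relocation, ptr):
    """Return the value ptr should have after the collector moved blocks.

    `relocation` maps an old block start to its new start (hypothetical
    bookkeeping). Interior pointers keep their offset into the block;
    pointers outside the GC heap are returned unchanged.
    """
    block = find_block(blocks, ptr)
    if block is None:
        return ptr
    new_start = relocation.get(block.start, block.start)
    return new_start + (ptr - block.start)

blocks = [Block(0x1000, 64), Block(0x2000, 32)]
relocation = {0x1000: 0x8000}          # only the first block was moved
print(hex(patch_pointer(blocks, relocation, 0x1010)))  # 0x8010
print(hex(patch_pointer(blocks, relocation, 0x5000)))  # 0x5000, not GC memory
```

Note that a one-past-the-end pointer (here `0x1040`) is not inside the block in this model and would not be patched, which is one of the cases where the GC would need extra information.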

January 12, 2014
On Sunday, 12 January 2014 at 17:15:44 UTC, Benjamin Thaut wrote:
> On 12.01.2014 18:12, Brian Rogoff wrote:
>>
>> How would the moving GC deal with pointer arithmetic?
>
> I don't see any problem with pointer arithmetic. Either the pointer is pointing to a GC-managed memory block and will be patched accordingly, or it is not pointing to a GC-managed memory block, and nothing will happen.

Or you don't know if something is a pointer or not and so whatever it references is pinned.
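
A rough sketch of that pinning rule, assuming the collector scans some words (for example from the stack or a union) without type information. Any such word that happens to fall inside a GC block pins that block; everything else stays movable. The `blocks` layout and all function names are made up for the example.

```python
def containing_block(blocks, addr):
    """Return the start of the block containing addr, or None.

    `blocks` maps start address -> size. Linear search, for brevity only.
    """
    for start, size in blocks.items():
        if start <= addr < start + size:
            return start
    return None

def pinned_blocks(blocks, ambiguous_words):
    """Blocks that must not move because a word that *might* be a pointer hits them."""
    pinned = set()
    for word in ambiguous_words:
        start = containing_block(blocks, word)
        if start is not None:
            pinned.add(start)
    return pinned

def movable_blocks(blocks, ambiguous_words):
    """Blocks the collector is still free to relocate."""
    return set(blocks) - pinned_blocks(blocks, ambiguous_words)

blocks = {0x1000: 64, 0x2000: 32, 0x3000: 16}
words = [0x1010, 42, 0x3000]   # 42 is just an integer, the others look like pointers
print(sorted(hex(b) for b in pinned_blocks(blocks, words)))   # ['0x1000', '0x3000']
print(sorted(hex(b) for b in movable_blocks(blocks, words)))  # ['0x2000']
```

The cost is that an integer which merely looks like an address also pins a block, so the more imprecise the scanning, the less there is left to move.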
January 12, 2014
On Sunday, 12 January 2014 at 10:27:42 UTC, Rainer Schuetze wrote:
> [...] Adding concurrency would also be nice...

I just want to point out that from an outer perspective, this is a really really *really* big deal.

By far the most common arguments I've heard against D are:

1. library availability (derelict, deimos now)
2. community size and impetus (getting there!)
3. shared druntime/phobos so Hello World isn't 800kb (getting there?)
4. garbage collector that you can only opt out of by writing C

The very reliance on the garbage collector, regardless of how far apart the stop-the-world sweeps are, is a showstopper for many people. They hear GC and think Java pauses. Being able to honestly claim "well it runs concurrently in a separate thread and doesn't[*] incur any performance penalty" would be the single biggest leap to greater adoption D could take at this point. Maybe barring "it prints money", but only maybe.

It may be less of a *technical* problem than, say, this or that bug in the type system, or the identity crisis of shared. But fixing those would not make for a *twentieth* of the marketing that a concurrent GC would. Fixing those would make people stay -- introducing that GC would make people join.

Not saying that such bugs shouldn't get attention, I'm just saying that Bob would scroll past that link on /r/programming. In comparison, Lucarella's dconf slides were... *compelling*.



(P.S. many awkwardly long hugs to those culling allocations from druntime/phobos, you rock)
January 12, 2014
On Sunday, 12 January 2014 at 17:15:44 UTC, Benjamin Thaut wrote:
> On 12.01.2014 18:12, Brian Rogoff wrote:
>>
>> How would the moving GC deal with pointer arithmetic?
>
> I don't see any problem with pointer arithmetic. Either the pointer is pointing to a GC-managed memory block and will be patched accordingly, or it is not pointing to a GC-managed memory block, and nothing will happen.
>
> Obviously if you do unsafe things, you have to know what you are doing, and you might have to give additional information to the GC so it can properly support the unsafe stuff.

The garbage collection page of the D spec actually talks a lot
about what is safe and unsafe (even though D doesn't have a
moving collector).

http://dlang.org/garbage.html
January 13, 2014
On Sunday, 12 January 2014 at 10:40:50 UTC, Benjamin Thaut wrote:
> On 12.01.2014 11:27, Rainer Schuetze wrote:
>>
>> I think a moving collector is currently not feasible without restricting
>> the language a lot, probably similar to safe D and more. I'm not sure we
>> want that in general.
>>
>
> Could you give an example which part of the language would not be doable with a moving collector? The only thing that comes to my mind is unions and that problem can be solved by allowing the user to specify manual scanning functions for structs or classes containing unions.

Maybe I'm too pessimistic ;-) I guess moving in general could be ok. I was thinking about segregating heaps by type (shared/immutable/mutable), and moving data between those heaps adds restrictions. I'd like to be proven wrong.

Some thoughts regarding a moving collector:

- interfacing with C/C++ is problematic: even if a pointer is passed on the stack, it is not guaranteed that this stack entry is not modified by the called function. While this might cause problems when collecting memory if it was the last reference to the data, it is much more likely that there are still other references; updating pointers after a move, however, will definitely fail.
Having to explicitly pin every pointer passed to C functions would be very expensive.

- if we have many pinned objects, compaction loses some of its advantages (like fast allocation). If we keep allocating from free lists of equal sized bins, fragmentation and memory overhead are limited (though not small) without moving. Also, moving lots of data can be expensive.

- interior pointers do not allow "threading" for pointer updates, so it might get pretty expensive to update references to moved objects
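
For readers unfamiliar with the "free lists of equal sized bins" scheme from the second point, here is a toy version. Requests are rounded up to a size class, and freed slots are reused only for the same class, so nothing ever has to move. The size classes and the `BinAllocator` name are assumptions for illustration, not the actual druntime values.

```python
class BinAllocator:
    """Toy non-moving allocator with one free list per size class."""
    SIZE_CLASSES = (16, 32, 64, 128, 256)   # assumed classes, in bytes

    def __init__(self):
        self.free_lists = {size: [] for size in self.SIZE_CLASSES}
        self.next_addr = 0x1000   # bump pointer, used when a free list is empty

    def size_class(self, nbytes):
        """Smallest size class that fits nbytes."""
        for size in self.SIZE_CLASSES:
            if nbytes <= size:
                return size
        raise ValueError("large allocations are out of scope for this sketch")

    def allocate(self, nbytes):
        size = self.size_class(nbytes)
        free = self.free_lists[size]
        if free:
            return free.pop()          # reuse a freed slot of the same class
        addr = self.next_addr
        self.next_addr += size
        return addr

    def free(self, addr, nbytes):
        self.free_lists[self.size_class(nbytes)].append(addr)

alloc = BinAllocator()
p = alloc.allocate(20)    # lands in the 32-byte class, 12 bytes are wasted
alloc.free(p, 20)
q = alloc.allocate(30)    # same class, so the freed slot is reused
print(p == q)             # True
```

The wasted bytes inside each slot are the "limited (though not small)" overhead: it is bounded per allocation, but it never goes away the way it would after compaction.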


>
> Also I don't think that we can create a GC which performs as well as the one of Java or C# if we are not willing to make the necessary changes for a moving GC.

To compete with other GCs we'd probably need write barriers to keep track of changed references (for concurrent operation or generations). There just needs to be a way to avoid having to rescan the full heap every time; that does not scale.
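
As an illustration of what such a write barrier could record, here is a card-marking sketch: every store through the barrier marks the fixed-size "card" containing the written address as dirty, and the collector later rescans only the dirty cards. This is one common technique, not a proposal for D specifically; `CARD_SIZE`, `CardTable` and the dict-based heap are assumptions made for the example.

```python
CARD_SIZE = 512   # bytes covered by one card; an assumed value

class CardTable:
    """Tracks which regions of the heap were written since the last scan."""
    def __init__(self, heap_size):
        self.dirty = [False] * ((heap_size + CARD_SIZE - 1) // CARD_SIZE)

    def write_barrier(self, heap, addr, value):
        """Perform the store and mark the card containing addr as dirty."""
        heap[addr] = value
        self.dirty[addr // CARD_SIZE] = True

    def take_dirty_ranges(self):
        """Return the (start, end) ranges to rescan and reset all cards."""
        ranges = [(i * CARD_SIZE, (i + 1) * CARD_SIZE)
                  for i, is_dirty in enumerate(self.dirty) if is_dirty]
        self.dirty = [False] * len(self.dirty)
        return ranges

heap = {}                       # toy heap: address -> stored word
table = CardTable(heap_size=4096)
table.write_barrier(heap, 600, 0x1234)
table.write_barrier(heap, 700, 0x5678)
print(table.take_dirty_ranges())   # [(512, 1024)]
print(table.take_dirty_ranges())   # []
```

The price is a little extra work on every pointer store, which is exactly the kind of compiler support the current GC does without.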

PS: my SSD with all the D stuff just died yesterday (was it a failure of the disk's GC?). I'll probably need some time to recover from that...
January 13, 2014
On 2014-01-13 10:20, Rainer Schuetze wrote:

> Maybe I'm too pessimistic ;-) I guess moving in general could be ok, I
> was thinking about segregating heaps by type (shared/immutable/mutable)
> and moving data between them adds restrictions. I'd like to be proven
> wrong.
>
> Some thoughts regarding a moving collector:
>
> - interfacing with C/C++ is problematic: even if a pointer is passed on
> the stack, it is not guaranteed that this stack entry is not modified by
> the called function. While this might cause problems when collecting
> memory due to this being the last reference to the data, it is much more
> likely that there are still references, but moving pointers will
> definitely fail.
> Having to explicitly pin every pointer passed to C functions would be
> very expensive.

Could we have a segregated heap for C pointers? Would that help? Basically having a special function allocating everything that should interface with C.
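
Roughly what I have in mind, as a toy sketch: one region that the collector may compact and one region for C-facing data that it never touches. Everything here (`TwoRegionHeap`, the `for_c` flag, the base addresses) is hypothetical, and the sketch never frees C-facing blocks at all.

```python
class TwoRegionHeap:
    """Toy heap with a compactable region and a never-moving region."""
    MOVABLE_BASE = 0x10000
    PINNED_BASE = 0x80000

    def __init__(self):
        self.movable = []   # [start, size] pairs in address order
        self.pinned = []    # blocks handed to C, never relocated
        self._movable_top = self.MOVABLE_BASE
        self._pinned_top = self.PINNED_BASE

    def allocate(self, size, for_c=False):
        """Bump-allocate from the pinned region if for_c, else the movable one."""
        if for_c:
            addr = self._pinned_top
            self._pinned_top += size
            self.pinned.append([addr, size])
        else:
            addr = self._movable_top
            self._movable_top += size
            self.movable.append([addr, size])
        return addr

    def compact(self, live):
        """Slide live movable blocks together; return {old_start: new_start}."""
        relocation = {}
        survivors = []
        top = self.MOVABLE_BASE
        for start, size in self.movable:
            if start in live:
                relocation[start] = top
                survivors.append([top, size])
                top += size
        self.movable = survivors
        self._movable_top = top
        return relocation

heap = TwoRegionHeap()
a = heap.allocate(32)
b = heap.allocate(16)
c_buf = heap.allocate(64, for_c=True)   # safe to pass to C in this model
relocation = heap.compact(live={b})     # a is garbage, b slides down
print(hex(relocation[b]), hex(c_buf))   # 0x10000 0x80000
```

The open question is whether the programmer always knows at allocation time that the data will end up in C code.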

-- 
/Jacob Carlborg
January 13, 2014
On Sunday, 12 January 2014 at 19:29:08 UTC, sunspyre wrote:
> The very reliance on the garbage collector, regardless of how far apart the stop-the-world sweeps are, is a showstopper for many people. They hear GC and think Java pauses. Being able to honestly claim "well it runs concurrently in a separate thread and doesn't[*] incur any performance penalty" would be the single biggest leap to greater adoption D could take at this point.

That is not technically possible. A truly concurrent GC carries a heavy penalty. People are probably thinking of what Microsoft calls "concurrent GC" for one of the .NET GCs, which is only "mostly concurrent".

The first step is to get a precise GC. That should give a significant performance boost already. Everything else should probably build on this.

http://stackoverflow.com/questions/2583644/difference-between-background-and-concurrent-garbage-collection

January 13, 2014
On Monday, 13 January 2014 at 10:59:44 UTC, Jacob Carlborg wrote:
> Could we have a segregated heap for C pointers? Would that help? Basically having a special function allocating everything that should interface with C.

Is it possible to declare whether the C function retains a pointer to the memory area or not, and to what extent? In general you will have to assume that the C code retains pointers not only to the object itself, but also to the transitive closure of everything that can be reached from it. That is quite extensive…
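
To show how quickly that set grows, here is a small sketch that computes everything reachable from one object handed to C, i.e. the set a conservative runtime would have to pin. The object graph is a plain dict and the names are invented for the example.

```python
def reachable_from(graph, root):
    """Return the set of objects reachable from root (including root).

    `graph` maps an object id to the ids it references. An iterative
    depth-first traversal is used so cycles and deep chains are handled.
    """
    seen = set()
    stack = [root]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(graph.get(obj, ()))
    return seen

# Hypothetical object graph: "a" is the object passed to a C function.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],
    "d": ["a"],     # a cycle back to the root
    "e": ["a"],     # references a, but is not reachable from it
}
print(sorted(reachable_from(graph, "a")))   # ['a', 'b', 'c', 'd']
```

Passing a single node of a large linked structure would therefore pin the whole structure, unless the C function's behaviour can be declared somehow.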

I also hope that the GC isn't fully modularized, because compiler support for specific GC strategies is likely to give better performance. Especially with whole program analysis.

It would be very nice to localize GC to a few threads. This would be useful in games where you only want to GC the AI/game mechanics portion of the simulated world, but use less demanding memory management for graphics, physics etc., which keep running uninterrupted (one can interpolate/predict for a few frames, giving the GC some more room to complete).