April 08, 2012
On 4/8/2012 2:21 AM, Timon Gehr wrote:
> On 04/08/2012 10:45 AM, Timon Gehr wrote:
>> That actually sounds like a pretty awesome idea.
>
> I understand that the stack will still have to be scanned conservatively, but
> how does the scheme deal with closures?

For now, just treat them conservatively.
April 08, 2012
On 8 April 2012 11:56, Timon Gehr <timon.gehr@gmx.ch> wrote:

> On 04/08/2012 10:45 AM, Timon Gehr wrote:
>
>> That actually sounds like a pretty awesome idea.
>>
>
> Make sure that the compiler does not actually rely on the fact that the template generates a function. The design should include the possibility of just generating tables. It all should be completely transparent to the compiler, if that is possible.
>

This sounds important to me. If it is also possible to do the work with generated tables, and not calling thousands of indirect functions in someone's implementation, it would be nice to reserve that possibility. Indirect function calls in hot loops make me very nervous for non-x86 machines.


April 08, 2012

On 4/8/2012 11:21 AM, Timon Gehr wrote:
> On 04/08/2012 10:45 AM, Timon Gehr wrote:
>> That actually sounds like a pretty awesome idea.
>
> I understand that the stack will still have to be scanned
> conservatively, but how does the scheme deal with closures?

I guess the compiler should generate an (anonymous) struct type corresponding to the closure data layout. There probably has to be a template for compiler generated structs or classes anyway.

This new type could also be used as the type of the context pointer, so a debugger could display the closure variables.

April 08, 2012
On 08-04-2012 11:42, Manu wrote:
> On 8 April 2012 11:56, Timon Gehr <timon.gehr@gmx.ch
> <mailto:timon.gehr@gmx.ch>> wrote:
>
>     On 04/08/2012 10:45 AM, Timon Gehr wrote:
>
>         That actually sounds like a pretty awesome idea.
>
>
>     Make sure that the compiler does not actually rely on the fact that
>     the template generates a function. The design should include the
>     possibility of just generating tables. It all should be completely
>     transparent to the compiler, if that is possible.
>
>
> This sounds important to me. If it is also possible to do the work with
> generated tables, and not calling thousands of indirect functions in
> someone's implementation, it would be nice to reserve that possibility.
> Indirect function calls in hot loops make me very nervous for non-x86
> machines.

Yes, I agree here. The last thing we need is a huge amount of kinda-sorta-virtual function calls on ARM, MIPS, etc. It may work fine on x86, but anywhere else, it's really not what you want in a GC.

-- 
- Alex
April 08, 2012
On 08-04-2012 12:07, Rainer Schuetze wrote:
>
>
> On 4/8/2012 11:21 AM, Timon Gehr wrote:
>> On 04/08/2012 10:45 AM, Timon Gehr wrote:
>>> That actually sounds like a pretty awesome idea.
>>
>> I understand that the stack will still have to be scanned
>> conservatively, but how does the scheme deal with closures?
>
> I guess the compiler should generate an (anonymous) struct type
> corresponding to the closure data layout. There probably has to be a
> template for compiler generated structs or classes anyway.
>
> This new type could also be used as the type of the context pointer, so
> a debugger could display the closure variables.
>

This sounds sensible to me. No reason closure marking can't be precise if the compiler just emits the relevant type info (pretty much any other compiler with closures does this; see C#, F#, etc).

-- 
- Alex
April 09, 2012
On Sat, 07 Apr 2012 21:56:09 -0400, Walter Bright <newshound2@digitalmars.com> wrote:

> Of course, many of us have been thinking about this for a looong time, and what is the best way to go about it. The usual technique is for the compiler to emit some sort of table for each TypeInfo giving the layout of the object, i.e. where the pointers are.
>
> The general problem with these is the table is non-trivial, as it will require things like iterated data blocks, etc. It has to be compressed to save space, and the gc then has to execute a fair amount of code to decode it.
>
> It also requires some significant work on the compiler end, leading of course to complexity, rigidity, development bottlenecks, and the usual bugs.
>
> An alternative Andrei and I have been talking about is to put in the TypeInfo a pointer to a function. That function will contain customized code to mark the pointers in an instance of that type. That custom code will be generated by a template defined by the library. All the compiler has to do is stupidly instantiate the template for the type, and insert an address to the generated function.
>
> The compiler need know NOTHING about how the marking works.
>
> Even better, as ctRegex has demonstrated, the custom generated code can be very, very fast compared with a runtime table-driven approach. (The slow part will be calling the function indirectly.)
>
> And best of all, the design is pushed out of the compiler into the library, so various schemes can be tried out without needing compiler work.
>
> I think this is an exciting idea, it will enable us to get a precise gc by enabling people to work on it in parallel rather than serially waiting for me.

I think this is a really good idea.

I would like to go further and propose that there be an arbitrary way to add members to the TypeInfo types using templates.  Not sure how it would be implemented, but I don't see why this has to be specific to GCs.  Some way to signify "hey compiler, please initialize this member with template X given the type being compiled".

This could be a huge bridge between compile-time and runtime type information.

-Steve
April 09, 2012
Le 08/04/2012 14:02, Alex Rønne Petersen a écrit :
> On 08-04-2012 11:42, Manu wrote:
>> On 8 April 2012 11:56, Timon Gehr <timon.gehr@gmx.ch
>> <mailto:timon.gehr@gmx.ch>> wrote:
>>
>> On 04/08/2012 10:45 AM, Timon Gehr wrote:
>>
>> That actually sounds like a pretty awesome idea.
>>
>>
>> Make sure that the compiler does not actually rely on the fact that
>> the template generates a function. The design should include the
>> possibility of just generating tables. It all should be completely
>> transparent to the compiler, if that is possible.
>>
>>
>> This sounds important to me. If it is also possible to do the work with
>> generated tables, and not calling thousands of indirect functions in
>> someone's implementation, it would be nice to reserve that possibility.
>> Indirect function calls in hot loops make me very nervous for non-x86
>> machines.
>
> Yes, I agree here. The last thing we need is a huge amount of
> kinda-sorta-virtual function calls on ARM, MIPS, etc. It may work fine
> on x86, but anywhere else, it's really not what you want in a GC.
>

Nothing prevent the generated function to itself call other generated functions, when things are predictable. It avoid many indirect calls, and purely by lib, which is good (can be tuned for application/plateform).
April 09, 2012
Le 08/04/2012 03:56, Walter Bright a écrit :
> Of course, many of us have been thinking about this for a looong time,
> and what is the best way to go about it. The usual technique is for the
> compiler to emit some sort of table for each TypeInfo giving the layout
> of the object, i.e. where the pointers are.
>
> The general problem with these is the table is non-trivial, as it will
> require things like iterated data blocks, etc. It has to be compressed
> to save space, and the gc then has to execute a fair amount of code to
> decode it.
>
> It also requires some significant work on the compiler end, leading of
> course to complexity, rigidity, development bottlenecks, and the usual
> bugs.
>
> An alternative Andrei and I have been talking about is to put in the
> TypeInfo a pointer to a function. That function will contain customized
> code to mark the pointers in an instance of that type. That custom code
> will be generated by a template defined by the library. All the compiler
> has to do is stupidly instantiate the template for the type, and insert
> an address to the generated function.
>
> The compiler need know NOTHING about how the marking works.
>
> Even better, as ctRegex has demonstrated, the custom generated code can
> be very, very fast compared with a runtime table-driven approach. (The
> slow part will be calling the function indirectly.)
>
> And best of all, the design is pushed out of the compiler into the
> library, so various schemes can be tried out without needing compiler work.
>
> I think this is an exciting idea, it will enable us to get a precise gc
> by enabling people to work on it in parallel rather than serially
> waiting for me.

This id a good idea. However, this doesn't handle type qualifiers. And this is important !

D2 type system is made in such a way that most data are either thread local or immutable, and a small amount is shared. Both thread local storage and immutability are source of BIG improvement for the GC. Doing without it is a huge design error.

For instance, Ocaml's GC is known to be more performant than Java's. Because in Caml, most data are immutable, and the GC take advantage of this. Immutability means 100% concurrent garbage collection.

In the other hand, TLS can be collected independently and only influence the thread that own the data. Both are every powerfull improvement, and the design you propose « as this » cannot provide any mean to handle that. Which is a big missed opportunity, and will be hard to change in the future.
April 09, 2012
On 9 April 2012 21:20, deadalnix <deadalnix@gmail.com> wrote:

> Le 08/04/2012 14:02, Alex Rønne Petersen a écrit :
>
>  On 08-04-2012 11:42, Manu wrote:
>>
>>> On 8 April 2012 11:56, Timon Gehr <timon.gehr@gmx.ch <mailto:timon.gehr@gmx.ch>> wrote:
>>>
>>> On 04/08/2012 10:45 AM, Timon Gehr wrote:
>>>
>>> That actually sounds like a pretty awesome idea.
>>>
>>>
>>> Make sure that the compiler does not actually rely on the fact that the template generates a function. The design should include the possibility of just generating tables. It all should be completely transparent to the compiler, if that is possible.
>>>
>>>
>>> This sounds important to me. If it is also possible to do the work with generated tables, and not calling thousands of indirect functions in someone's implementation, it would be nice to reserve that possibility. Indirect function calls in hot loops make me very nervous for non-x86 machines.
>>>
>>
>> Yes, I agree here. The last thing we need is a huge amount of kinda-sorta-virtual function calls on ARM, MIPS, etc. It may work fine on x86, but anywhere else, it's really not what you want in a GC.
>>
>>
> Nothing prevent the generated function to itself call other generated functions, when things are predictable. It avoid many indirect calls, and purely by lib, which is good (can be tuned for application/plateform).
>

Eh?
Not sure what you mean. The idea is the template would produce a
struct/table of data instead of being a pointer to a function, this way the
GC could work without calling anything. If the GC was written to assume GC
info in a particular format/structure, it could be written without any
calls.
I'm just saying to leave that as a possibility, and not REQUIRE an indirect
function call for every single allocation in the system. Some GC might be
able to make better use of that sort of setup.


April 09, 2012
Le 09/04/2012 20:33, Manu a écrit :
> Eh?
> Not sure what you mean. The idea is the template would produce a
> struct/table of data instead of being a pointer to a function, this way
> the GC could work without calling anything. If the GC was written to
> assume GC info in a particular format/structure, it could be written
> without any calls.
> I'm just saying to leave that as a possibility, and not REQUIRE an
> indirect function call for every single allocation in the system. Some
> GC might be able to make better use of that sort of setup.

If you have reference to objects, you can't avoid a function call. If you have something you know at compile time, the generated function can directly call the other function that mark the pointed data (or even can do it itself, if you don't fear code bloat) without going back to the GC and its indirect call.

So it make no difference in the number of indirect calls you have, but the struct proposal is a stronger constraint on the GC that the function one.

BTW, starting you answer by « Not sure what you mean. » should have been a red flag.