-nogc

April 23, 2009

Posted by Andrei Alexandrescu

I've discussed something with Walter today and thought I'd share it here.

The possibility of using D without a garbage collector was always
looming and has been used to placate naysayers ("you can call malloc if
you want" etc.) but that opportunity has not been realized in a seamless
manner. As soon as you concatenate arrays, add to a hash, or create an
object, you will call into the GC.

So I'm thinking there should be a flag -nogc that enables a different
model of memory allocation. Here's the steps we need to take:

1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
superdan suggested that when he wasn't busy cursing :o).

2. Do the similar thing for associative arrays.

3. Have two object.d at hand: one is "normal" and uses garbage
collection, the other (call it object_nogc.d) has an entirely different
definition for arrays, hashes, and Object.

4. The definition of Object in object_nogc.d includes a reference count
member for intrusive refcounting.

5. Define a Ref!(T) struct in object_nogc.d that does intrusive
reference counting against T using ctors and dtor.

6. At this point we already have a usable, credible no-gc offering: just
use object_nogc.d instead of object.d and instead of "new Widget(args)"
use "Ref!(Widget)(args)".

7. Add a -nogc option to the compiler. In that mode, the compiler
replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
"Ref!(T)(args)" for all classes T except inside
object_nogc.d. The exception, as Walter pointed out, is to avoid
infinite regression (how do you implement Ref if the reference you hold
inside will also be wrapped in Ref???)

8. With all this, a very solid offering of D without garbage
collection would be available at a low cost!
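
Steps 4 to 6 can be sketched in ordinary D, without any compiler support. This is only a rough illustration under stated assumptions: `Counted` stands in for the Object of object_nogc.d, `make` stands in for the rewritten `new`, and both names (like `Widget` and `alive`) are made up for the example. The instance lives on the C heap, so it must not be the only holder of GC-managed data.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : free, malloc;

// Hypothetical stand-in for the Object of object_nogc.d (step 4):
// every instance carries its own reference count.
class Counted
{
    size_t refs;
}

// Hypothetical Ref!(T) (step 5): copying bumps the count, destruction drops it.
struct Ref(T) if (is(T : Counted))
{
    private T payload;

    // Plays the role of "new T(args)" (step 6); allocates on the C heap.
    static Ref make(Args...)(auto ref Args args)
    {
        enum size = __traits(classInstanceSize, T);
        void* mem = malloc(size);
        assert(mem !is null, "out of memory");
        Ref r;
        r.payload = emplace!T(mem[0 .. size], args);
        r.payload.refs = 1;
        return r;
    }

    this(this)
    {
        if (payload !is null) ++payload.refs;
    }

    ~this()
    {
        if (payload is null) return;
        if (--payload.refs == 0)
        {
            void* mem = cast(void*) payload;
            destroy(payload);   // runs the class destructor
            free(mem);
        }
        payload = null;
    }

    inout(T) get() inout { return payload; }
}

class Widget : Counted
{
    static int alive;   // live instance counter, for the demonstration only
    int size;
    this(int size) { this.size = size; ++alive; }
    ~this() { --alive; }
}

unittest
{
    {
        auto a = Ref!Widget.make(3);
        auto b = a;                  // copy: count goes from 1 to 2
        assert(a.get.refs == 2);
        assert(b.get.size == 3);
    }                                // last copy gone: destructor runs, memory is freed
    assert(Widget.alive == 0);
}
```

Compiling with `-unittest -main` runs the check. What the sketch cannot show is step 7, where the compiler would apply the rewrite automatically.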

One cool thing is that you can compile the same application with and
without GC and test the differences easily. That's bound to show a
number of interesting things!

A disadvantage is that -nogc must be global - you can't link a program
that's partially built with gc and partially without. This was a major
counter-argument to adding optional gc to C++.


Andrei

April 23, 2009

Re: -nogc

Posted by bearophile
in reply to Andrei Alexandrescu

Andrei Alexandrescu:
> The possibility of using D without a garbage collector was always looming and has been used to placate naysayers ("you can call malloc if you want" etc.) but that opportunity has not been realized in a seamless manner. As soon as you concatenate arrays, add to a hash, or create an object, you will call into the GC.

One simple possible solution: -nogc is to write C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller) and in such programs AAs and array concatenation and closures are forbidden (compilation error if you try to use them). "New" allocates using the C heap, and you have to use "delete" manually for each of them.
This is simple. While adding a second memory management system, ref-counted, looks like an increase of complexity for both the compiler and the programmers.
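
As a rough illustration of that C-like model, here is what allocating a class instance on the C heap and deleting it by hand can look like as library code. The helper names `cNew` and `cDelete` are invented for this sketch; the proposal itself would have `new` and `delete` do this directly.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : free, malloc;

// Hypothetical helper playing the role of "new" on the C heap.
T cNew(T, Args...)(auto ref Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* mem = malloc(size);
    assert(mem !is null, "out of memory");
    return emplace!T(mem[0 .. size], args);   // construct in place
}

// Hypothetical helper playing the role of a manual "delete".
void cDelete(T)(ref T obj) if (is(T == class))
{
    if (obj is null) return;
    void* mem = cast(void*) obj;
    destroy(obj);   // runs the destructor
    free(mem);
    obj = null;     // guard against accidental reuse
}

class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

unittest
{
    auto p = cNew!Point(1, 2);
    assert(p.x == 1 && p.y == 2);
    cDelete(p);     // every cNew needs exactly one matching cDelete
    assert(p is null);
}
```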


>1. Put array definitions in object.d. Have the compiler rewrite "T[]" -> ".Array!(T)"<

That has to be done with care and in a transparent way, not adding the Array name to the namespace, so you can still create an Array yourself, etc.

Bye,
bearophile

April 23, 2009

Re: -nogc

Posted by Christopher Wright
in reply to Andrei Alexandrescu

Andrei Alexandrescu wrote:
> I've discussed something with Walter today and thought I'd share it here.
> 
> The possibility of using D without a garbage collector was always
> looming and has been used to placate naysayers ("you can call malloc if
> you want" etc.) but that opportunity has not been realized in a seamless
> manner. As soon as you concatenate arrays, add to a hash, or create an
> object, you will call into the GC.
> 
> So I'm thinking there should be a flag -nogc that enables a different
> model of memory allocation. Here's the steps we need to take:

This means replacing a mark/sweep GC with a reference counting GC. I'd think that it would be better to have -nogc not use Ref(T) by default, and add another flag -refcount that implies -nogc and uses Ref(T) by default.

April 23, 2009

Re: -nogc

Posted by Andrei Alexandrescu
in reply to Christopher Wright

Christopher Wright wrote:
> Andrei Alexandrescu wrote:
>> I've discussed something with Walter today and thought I'd share it here.
>>
>> The possibility of using D without a garbage collector was always
>> looming and has been used to placate naysayers ("you can call malloc if
>> you want" etc.) but that opportunity has not been realized in a seamless
>> manner. As soon as you concatenate arrays, add to a hash, or create an
>> object, you will call into the GC.
>>
>> So I'm thinking there should be a flag -nogc that enables a different
>> model of memory allocation. Here's the steps we need to take:
> 
> This means replacing a mark/sweep GC with a reference counting GC.

It just means that certain types and constructs are rewritten. The exact strategy depends on how Ref, Array, and AssocArray are defined.

Probably a good approach is to simply rewrite anything anyway and have Ref vanish in gc mode by means of e.g. alias this.

So this means that we only need a flag -object=/path/to/object.d after all.
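
A minimal sketch of how Ref could "vanish" in gc mode through alias this. The `make` function and the `Widget` class are assumptions made for the example; the point is only that code written against `Ref!(Widget)` behaves as if it held a plain `Widget` reference.

```d
// Hypothetical gc-mode Ref: a transparent wrapper around a plain class reference.
struct Ref(T) if (is(T == class))
{
    T payload;
    alias payload this;   // member access and conversion to T forward to payload

    // Plays the role of "new T(args)"; here it really is just new.
    static Ref make(Args...)(auto ref Args args)
    {
        return Ref(new T(args));
    }
}

class Widget
{
    int size;
    this(int size) { this.size = size; }
    int doubled() const { return size * 2; }
}

// Code that knows nothing about Ref still accepts one.
int area(Widget w) { return w.size * w.size; }

unittest
{
    auto w = Ref!Widget.make(4);
    assert(w.size == 4);        // field access goes through alias this
    assert(w.doubled() == 8);   // so do method calls
    assert(area(w) == 16);      // implicit conversion to Widget
}
```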

Andrei

April 23, 2009

Re: -nogc

Posted by Daniel Keep
in reply to Christopher Wright

Christopher Wright wrote:
> Andrei Alexandrescu wrote:
>> I've discussed something with Walter today and thought I'd share it here.
>>
>> The possibility of using D without a garbage collector was always looming and has been used to placate naysayers ("you can call malloc if you want" etc.) but that opportunity has not been realized in a seamless manner. As soon as you concatenate arrays, add to a hash, or create an object, you will call into the GC.
>>
>> So I'm thinking there should be a flag -nogc that enables a different model of memory allocation. Here's the steps we need to take:
> 
> This means replacing a mark/sweep GC with a reference counting GC. I'd think that it would be better to have -nogc not use Ref(T) by default, and add another flag -refcount that implies -nogc and uses Ref(T) by default.

A thought occurs.  If you wanted to implement, say, a moving collector, it would be very useful to have some level of indirection instead of direct references (either using an ID for the actual address, or by using an opPostMove to notify the GC, if we ever get one.)

So perhaps you could make "-nogc" alias to "-nogc=object_nogc", thus
allowing people to replace Array(T), AA(K,V) and Ref(T) with their own.

  -- Daniel

April 23, 2009

Re: -nogc

Posted by Frank Benoit
in reply to Andrei Alexandrescu

I am using D for a real-time test system. There I have to make sure that
real-time code never performs direct or indirect allocations.
I can use the GC in the non-real-time thread and at start-up, and I can
preallocate as much as I want. It just is not allowed in IRQ handlers.

What I did was patch the GC to have a callback in the allocation function. My application can register for that callback and check the current thread. If it is a real-time thread, an error is generated.

The disadvantage is that it is a runtime check. The advantage is that I can mix code that can use the GC with code that can't.
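
The patch itself is not shown in the post, so the following is only a sketch of the idea with invented names: `hookedAlloc` stands in for the patched allocation function, `allocHook` for the registered callback, and `inRealTimeThread` for whatever the application uses to recognise its real-time threads.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical per-thread flag; module-level variables are thread-local in D.
bool inRealTimeThread;

// Hypothetical callback slot, shared by all threads and set once at start-up.
alias AllocHook = void function(size_t size);
__gshared AllocHook allocHook;

// Stand-in for the patched allocation function: it reports before allocating.
void* hookedAlloc(size_t size)
{
    if (allocHook !is null) allocHook(size);
    return malloc(size);
}

// The application's callback: complain when a real-time thread allocates.
// Throwing is used here to keep the sketch testable; a real system might log or abort.
void forbidInRealTime(size_t size)
{
    if (inRealTimeThread)
        throw new Exception("allocation attempted in a real-time thread");
}

unittest
{
    allocHook = &forbidInRealTime;

    void* p = hookedAlloc(16);      // ordinary thread: allowed
    assert(p !is null);
    free(p);

    inRealTimeThread = true;        // pretend this thread is real-time
    bool caught = false;
    try
    {
        void* q = hookedAlloc(16);
        free(q);
    }
    catch (Exception e)
    {
        caught = true;
    }
    assert(caught);
    inRealTimeThread = false;
}
```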

April 23, 2009

Re: -nogc

Posted by Andrei Alexandrescu
in reply to bearophile

bearophile wrote:
> Andrei Alexandrescu:
>> The possibility of using D without a garbage collector was always
>> looming and has been used to placate naysayers ("you can call malloc if
>> you want" etc.) but that opportunity has not been realized in a seamless
>> manner. As soon as you concatenate arrays, add to a hash, or create an
>> object, you will call into the GC.
> 
> One simple possible solution: -nogc is to write C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller) and in such programs AAs and array concatenation and closures are forbidden (compilation error if you try to use them). "New" allocates using the C heap, and you have to use "delete" manually for each of them.
> This is simple. While adding a second memory management system, ref-counted, looks like an increase of complexity for both the compiler and the programmers.

I was thinking of starting from the opposite end - add the required tools first so we gain experience with them, and then integrate with the compiler.

>> 1. Put array definitions in object.d. Have the compiler rewrite "T[]" -> ".Array!(T)"<
> 
> That has to be done with care and in a transparent way, not adding the Array name to the namespace, so you can still create an Array yourself, etc.

There shouldn't be any harm in using Array or AssocArray directly.


Andrei

April 23, 2009

Re: -nogc

Posted by Michel Fortin
in reply to Andrei Alexandrescu

On 2009-04-23 06:58:38 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

> I've discussed something with Walter today and thought I'd share it here.
> 
> The possibility of using D without a garbage collector was always
> looming and has been used to placate naysayers ("you can call malloc if
> you want" etc.) but that opportunity has not been realized in a seamless
> manner. As soon as you concatenate arrays, add to a hash, or create an
> object, you will call into the GC.

Very true. It's pretty easy to call the GC without noticing in D.


> So I'm thinking there should be a flag -nogc that enables a different
> model of memory allocation. Here's the steps we need to take:
> 
> 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
> ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
> superdan suggested that when he wasn't busy cursing :o).

That makes sense.


> 2. Do the similar thing for associative arrays.
> 
> 3. Have two object.d at hand: one is "normal" and uses garbage
> collection, the other (call it object_nogc.d) has an entirely different
> definition for arrays, hashes, and Object.

Couldn't that just be a version switch, such as `version (D_NO_GC)` and `version (D_GC)`? Then you could implement things differently in other modules too, depending on whether there is a GC.
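
For illustration, a module could switch its allocation strategy like this. `D_NO_GC` is the identifier suggested above, not something the compiler predefines, so it would have to be passed as `-version=D_NO_GC`; the function names are made up.

```d
version (D_NO_GC)
{
    import core.stdc.stdlib : free, malloc;

    // No collector: take the buffer from the C heap and hand it back manually.
    int[] makeBuffer(size_t n)
    {
        auto p = cast(int*) malloc(n * int.sizeof);
        assert(p !is null || n == 0, "out of memory");
        p[0 .. n] = 0;
        return p[0 .. n];
    }

    void releaseBuffer(int[] buf) { free(buf.ptr); }
}
else
{
    // With a collector: allocate normally and let the GC reclaim the memory.
    int[] makeBuffer(size_t n) { return new int[n]; }

    void releaseBuffer(int[] buf) { }
}

unittest
{
    auto buf = makeBuffer(4);
    assert(buf.length == 4);
    buf[2] = 7;
    assert(buf[2] == 7);
    releaseBuffer(buf);   // a no-op in the GC build
}
```

Callers use the same two functions in both builds, which is what makes comparing the two modes cheap.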


> 4. The definition of Object in object_nogc.d includes a reference count
> member for intrusive refcounting.
> 
> 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
> reference counting against T using ctors and dtor.
> 
> 6. At this point we already have a usable, credible no-gc offering: just
> use object_nogc.d instead of object.d and instead of "new Widget(args)"
> use "Ref!(Widget)(args)".

How's that going to work with scope classes?

	scope Widget w1 = new Widget;
	scope Widget w2 = Ref!(Widget)();


> 7. Add a -nogc option to the compiler. In that mode, the compiler
> replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
> "Ref!(T)(args)" for all classes T except inside
> object_nogc.d. The exception, as Walter pointed out, is to avoid
> infinite regression (how do you implement Ref if the reference you hold
> inside will also be wrapped in Ref???)

I'm just wondering, why wouldn't the compiler always use Ref!(T)? In the GC mode it'd simply resolve to a T, but if you wanted to experiment with another kind of GC -- say one which would require calling a notification function when writing a new value, such as the one in Objective-C 2.0 -- then you could.

Hum, also, how would it work for pointers to things in memory blocks that are normally managed by the GC? Would those increment the reference count for the memory block?


> 8. With all this, a very solid offering of D without garbage
> collection would be available at a low cost!
> 
> One cool thing is that you can compile the same application with and
> without GC and test the differences easily. That's bound to show a
> number of interesting things!

Indeed.


> A disadvantage is that -nogc must be global - you can't link a program
> that's partially built with gc and partially without. This was a major
> counter-argument to adding optional gc to C++.

Another disadvantage is that you change the reference semantics and capabilities. With a GC, you can create circular pointer references and it won't leak memory once you stop referencing them. Do that with reference counting and you'll have memory leaks all around.

So with no GC, you have to have weak references if you're going to build tree structures where branches know about their parents (which is a pretty common thing). I'd suggest that weak references be put in the language so the compiler can replace them with WeakRef!(T) in no-GC mode and do something else in GC mode. Being in the language would just be some syntactic sugar that would make them more bearable in normal code.
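
The cycle problem can be shown with a hand-rolled count. Everything here is an assumption made for the sketch: a bare `Node` struct with manual `retain`/`release`, a strong back-pointer that takes a count, and a weak one that does not. A real WeakRef!(T) would also need a way to detect that its target has died, which this sketch leaves out.

```d
import core.stdc.stdlib : calloc, free;

int liveNodes;   // thread-local counter, for the demonstration only

struct Node
{
    size_t refs;
    Node* child;          // strong: holds a count on the child
    Node* strongParent;   // strong back-pointer: creates a cycle
    Node* weakParent;     // weak back-pointer: holds no count
}

Node* makeNode()
{
    auto n = cast(Node*) calloc(1, Node.sizeof);
    assert(n !is null, "out of memory");
    n.refs = 1;
    ++liveNodes;
    return n;
}

Node* retain(Node* n)
{
    if (n !is null) ++n.refs;
    return n;
}

void release(Node* n)
{
    if (n is null || --n.refs > 0) return;
    release(n.child);
    release(n.strongParent);
    // weakParent is deliberately left alone: it never owned a count
    free(n);
    --liveNodes;
}

unittest
{
    // Strong back-pointer: parent and child keep each other alive.
    auto p = makeNode();
    auto c = makeNode();
    p.child = retain(c);
    c.strongParent = retain(p);
    release(c);
    release(p);
    assert(liveNodes == 2);          // both leaked

    // Breaking the cycle by hand frees them.
    auto back = c.strongParent;
    c.strongParent = null;
    release(back);
    assert(liveNodes == 0);

    // Weak back-pointer: dropping the parent frees the whole tree.
    auto p2 = makeNode();
    auto c2 = makeNode();
    p2.child = retain(c2);
    c2.weakParent = p2;
    release(c2);
    release(p2);
    assert(liveNodes == 0);
}
```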

As for compatibility, it may be worth looking at how it has been done in Objective-C 2.0. Objective-C has always used reference counting. Version 2.0 brought a GC. When you build a library, you have to specify whether the resulting binary expects a GC, reference counting, or can work in both modes. Something that works in both modes incurs a slight overhead, but sometimes the binary compatibility is just worth it.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

April 23, 2009

Re: -nogc

Posted by Andrei Alexandrescu
in reply to Frank Benoit

Frank Benoit wrote:
> I am using D for a real-time test system. There I have to make sure that
> real-time code never performs direct or indirect allocations.
> I can use the GC in the non-real-time thread and at start-up, and I can
> preallocate as much as I want. It just is not allowed in IRQ handlers.
> 
> What I did was patch the GC to have a callback in the allocation
> function. My application can register for that callback and check the
> current thread. If it is a real-time thread, an error is generated.
> 
> The disadvantage is that it is a runtime check. The advantage is that I can mix
> code that can use the GC with code that can't.

Great. We need to mind this scenario (the right place is I think Ref).

Andrei

April 23, 2009

Re: -nogc

Posted by Denis Koroskin
in reply to bearophile

On Thu, 23 Apr 2009 15:08:43 +0400, bearophile <bearophileHUGS@lycos.com> wrote:

> Andrei Alexandrescu:
>> The possibility of using D without a garbage collector was always looming and has been used to placate naysayers ("you can call malloc if you want" etc.) but that opportunity has not been realized in a seamless manner. As soon as you concatenate arrays, add to a hash, or create an object, you will call into the GC.
>
> One simple possible solution: -nogc is to write C-like programs, with no
> automatic reference counting. It doesn't include the GC in the final
> executable (making it much smaller) and in such programs AAs and array
> concatenation and closures are forbidden (compilation error if you try
> to use them). "New" allocates using the C heap, and you have to use
> "delete" manually for each of them.
> This is simple. While adding a second memory management system,
> ref-counted, looks like an increase of complexity for both the compiler
> and the programmers.
>
>

Same here. My version of -nogc would work as follows:

1) You mark a module as -nogc/realtime/whatever (similar to module(system) or module(safe)).

module(nogc) some.module.that.doesnt.use.gc;

2) Array concatenation is not allowed, array.length is read-only, you can't insert into an AA, and you can't create objects with new (only malloc? This needs additional thought)

3) ...

I believe this is an interesting direction to explore. When one writes real-time, kernel-level, or similar code that shouldn't leak, one should be very cautious, and the compiler should help by restricting access to methods that potentially or certainly leak. But how would one ensure that?

A simple idea would be to allow a module(nogc) to access only other modules that are marked as nogc, too.
This will effectively disallow most (or all, at least in the beginning) parts of Phobos (for a reason!).

But marking a whole module as nogc is not very good. For example, I'd like to be able to read from and write to an existing array, but I should be unable to resize it.

Thus, all the methods of Array except void Array.resize(size_t newSize) must be marked as safe/noleak/nogc/whatever (nogc is a good name here).

Similarly, safe modules should be able to access only other safe methods, and T* Array!(T).ptr() should not be among them.

Thoughts?
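
For comparison, later versions of D gained a per-function `@nogc` attribute that is close in spirit to the per-method marking described above, though it is not the module(nogc) syntax proposed here. A small sketch with made-up function names: reading and writing existing elements is accepted, while resizing is rejected at compile time.

```d
// Reads and writes existing elements only, so the @nogc check accepts it.
void scale(int[] data, int factor) @nogc nothrow
{
    foreach (ref x; data)
        x *= factor;

    // Either of the following would be a compile error in a @nogc function,
    // because both may allocate through the GC:
    // data ~= 1;
    // data.length = data.length + 1;
}

// A @nogc function may only call other @nogc functions.
int sumScaled(int[] data, int factor) @nogc nothrow
{
    scale(data, factor);
    int total = 0;
    foreach (x; data)
        total += x;
    return total;
}

unittest
{
    int[4] buf = [1, 2, 3, 4];       // stack storage, no GC involved
    assert(sumScaled(buf[], 2) == 20);
    assert(buf == [2, 4, 6, 8]);
}
```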
