Thread overview
A possible solution to the GC conundrum.
Jan 15, 2003
Andy Friesen
Feb 11, 2003
Walter
Feb 11, 2003
Mike Wynn
Feb 11, 2003
Ilya Minkov
Feb 11, 2003
Walter
Feb 11, 2003
Mike Wynn
Feb 13, 2003
Russ Lewis
Feb 13, 2003
Ilya Minkov
Feb 14, 2003
Ilya Minkov
Feb 14, 2003
Walter
January 15, 2003
Mayhaps pointers to auto classes should be allowed; they'd have to be new'd and deleted just like C++.  Such pointers could exist as object members as well.

This has the advantage of not interfering with the default object behaviour (which is going to be less of a headache in most cases), while still allowing things to be done "the hard way" wherever desired, on a per-class basis.

I'm not sure how auto classes holding references to normal objects would complicate the garbage collector, though.

 -- andy

February 11, 2003
See the new stuff in www.digitalmars.com/d/memory.html

-Walter


"Andy Friesen" <andy@ikagames.com> wrote in message news:b04o3c$ec0$1@digitaldaemon.com...
> Mayhaps pointers to auto classes should be allowed; they'd have to be new'd and deleted just like C++.  Such pointers could exist as object members as well.
>
> This has the advantage of not interfering with the default object behaviour (which is going to be less of a headache in most cases), while still allowing things to be done "the hard way" wherever desired, on a per-class basis.
>
> I'm not sure how auto classes holding references to normal objects would complicate the garbage collector, though.
>
>   -- andy
>


February 11, 2003
Shouldn't things such as stack-allocated objects be handled by the compiler,
rather than the programmer?
And free lists would (imho) be better as part of the inner workings of the gc
allocator (cache mem blocks of given sizes, or perhaps per-class mem cache
chains). Open up the way the GC works so we can write our own GCs, yes, but
this ... no.
The mark/release, for instance, can be done without new `new` syntax. O.k., you
can't say new Foo(), but you can say Foo.create();

Copy on write: why is this a manual op? It should be a built-in behaviour (erring on the side of caution if the compiler cannot fully detect that no write occurs). Leaving it manual forces programmers to write the same code again and again.

Again, I'm confused by the changes you've made to D. I quote from the D introduction:

"who is D for" item 5 ...

Programmers who enjoy the expressive power of C++ but are frustrated by the need to expend much effort explicitly managing memory and finding pointer bugs

and item 8

Programmers who think the language should provide enough features to obviate the continual necessity to manipulate pointers directly

you seem to have now added
Library writers who want to shoot the above programmers where it hurts, by
using a language with robust features, but turning them all off in their
libraries.

Mike.

"Walter" <walter@digitalmars.com> wrote in message news:b2aaet$gpe$1@digitaldaemon.com...
> See the new stuff in www.digitalmars.com/d/memory.html
>
> -Walter


February 11, 2003
Mike Wynn wrote:
> shouldn't things such as stack allocated obj's be handled by the
> compiler, rather than the programmer.

Code with garbage-collected objects first, and introduce stack objects as
an optimisation only once you're sure everything else is done. It's a
dangerous feature by itself, so no provision should be made to simplify
its usage. Handing out references to stack-allocated data is a common
source of bugs. And why? Because allocating data on the stack is made too
easy, so it's often done without further thought!

> and free lists would (imho) be better if part of the inner working of
> the gc allocator (cache mem blocks of given sizes, or perhaps per
> class mem cache chains). open the way the GC works so we can write
> our own gc's yes, but this ... no.

GCs work differently. I can see that the GC interface is being kept minimal;
maybe that's on purpose? Allowing further control of the GC means
constraining the type of GC. Besides, different types of GC may wish to
generate different in-line code, like pointer write wrappers, optimal
auto-scanners and such. Until there's a way to write code-generating
code, a plug-in GC doesn't make sense.

BTW, Free Lists make sense for C heaps only anyway, since D GC already
allocates memory in a similar performance-tuned manner.
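
For readers who haven't met the technique being argued about, here is a minimal sketch of a per-class free list. It is written in present-day D syntax, which differs from the 2003 dialect used elsewhere in this thread, and the names `Node`, `allocate` and `release` are hypothetical, chosen only for illustration. It is not taken from the memory.html page, and it says nothing about how the D GC itself is implemented.

```d
// Hypothetical node type, for illustration only.
class Node
{
    int value;
    Node next;                      // doubles as the free-list link while unused

    private static Node freeList;   // head of the list of recycled nodes

    // Hand out a recycled node if one is available, otherwise ask the GC.
    static Node allocate(int value)
    {
        Node n;
        if (freeList !is null)
        {
            n = freeList;
            freeList = n.next;
        }
        else
        {
            n = new Node;
        }
        n.value = value;
        n.next = null;
        return n;
    }

    // Put a node back on the free list. The caller must not use it afterwards.
    static void release(Node n)
    {
        n.next = freeList;
        freeList = n;
    }
}

void main()
{
    Node a = Node.allocate(1);
    Node.release(a);
    Node b = Node.allocate(2);      // served from the free list
    assert(b is a);                 // same object, recycled
    assert(b.value == 2);
}
```

The danger is visible in `main`: after `release`, `a` still refers to an object that has been handed out again as `b`, which is exactly the kind of pointer bug a GC is meant to rule out.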

> the mark/release for instance can be done without new `new` syntax
> o.k. you can't say new Foo(), but you can say Foo.create();
> 
> copy on write, why is this a manual op, it should be a built in
> behaviour, (err on the side of caution if the compiler can not fully
> detect no write occurs). this forces programmers to continually write
> the same code again and again.

Urrr... Wasn't it automated?

> again I'm confused by the changes you've made to D, I quote from the
> D introduction
> 
> "who is D for" item 5 ...
> 
> Programmers who enjoy the expressive power of C++ but are frustrated
> by the need to expend much effort explicitly managing memory and
> finding pointer bugs
> 
> and item 8
> 
> Programmers who think the language should provide enough features to
> obviate the continual necessity to manipulate pointers directly
> 
> you seem to have now added Library writers who want to shoot the
> above programmers where it hurts, by using a language with robust
> features, but turning them all off in their libraries.

Libraries *have* to use GC and exceptions and be safe where possible.
Besides, library writers have had all these possibilities before, because
D supports all C types and pointer handling. It just became easier for
end-users to devise their own efficient memory management. It's just that
this has to be avoided whenever possible, for all kinds of reasons.

> Mike.

February 11, 2003
"Mike Wynn" <mike.wynn@l8night.co.uk> wrote in message news:b2ahc1$kmt$1@digitaldaemon.com...
> shouldn't things such as stack allocated obj's be handled by the compiler,
> rather than the programmer.
> and free lists would (imho) be better if part of the inner working of the gc
> allocator (cache mem blocks of given sizes, or perhaps per class mem cache
> chains). open the way the GC works so we can write our own gc's yes, but
> this ... no.
> the mark/release for instance can be done without new `new` syntax o.k. you
> can't say new Foo(), but you can say Foo.create();

Stack obj's are automatically handled by the compiler for structs, static arrays, etc. Just not for class objects. What I provided in the new release is a mechanism for advanced programmers to explore some other ways of doing things. These techniques should not be commonly used; nearly all usage will work fine with the gc.
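
The distinction drawn above can be sketched as follows, in present-day D syntax rather than the 2003 dialect. `Point` and `Widget` are hypothetical types invented for this example; the sketch only shows the default behaviour (value types live where they are declared, class instances come from the GC heap) and none of the manual techniques under discussion.

```d
struct Point { int x, y; }   // value type: stored wherever it is declared
class Widget { int id; }     // reference type: instances are GC-allocated by default

void main()
{
    Point p = Point(1, 2);   // lives in main's stack frame, no programmer effort
    int[4] buf;              // static array, also in the stack frame
    Widget w = new Widget;   // lives on the GC heap; w is only a reference

    Point q = p;             // copies the struct
    q.x = 10;
    assert(p.x == 1);        // the original is unaffected

    Widget w2 = w;           // copies the reference, not the object
    w2.id = 5;
    assert(w.id == 5);       // both names refer to the same object

    assert(buf[0] == 0);     // static arrays are default-initialised
}
```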

> copy on write, why is this a manual op, it should be a built in behaviour, (err on the side of caution if the compiler can not fully detect no write occurs). this forces programmers to continually write the same code again and again.

It needs to be manual because languages that do it automatically suffer from truly terrible performance when doing things like uppercasing string contents one character at a time.
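
A rough sketch of the cost being described, in present-day D syntax. All three functions are hypothetical and written only for this illustration: `toUpperNaive` is an assumption about what a copy-before-every-write scheme amounts to, not a description of any particular language, and `upper` handles ASCII only.

```d
// Hypothetical helper: ASCII-only upper-casing of one character.
char upper(char c)
{
    return (c >= 'a' && c <= 'z') ? cast(char)(c - 'a' + 'A') : c;
}

// Manual copy-on-write: the input is returned untouched if nothing changes,
// and is duplicated at most once, on the first character that needs changing.
char[] toUpperCow(char[] s)
{
    bool copied = false;
    char[] r = s;
    foreach (i, c; s)
    {
        char u = upper(c);
        if (u != c)
        {
            if (!copied)
            {
                r = s.dup;      // the single copy
                copied = true;
            }
            r[i] = u;
        }
    }
    return r;
}

// Naive automatic scheme: a fresh copy before every single write.
// The counter exists only to make the cost visible.
char[] toUpperNaive(char[] s, ref size_t copies)
{
    char[] r = s;
    foreach (i; 0 .. s.length)
    {
        char u = upper(r[i]);
        if (u != r[i])
        {
            r = r.dup;
            ++copies;
            r[i] = u;
        }
    }
    return r;
}

void main()
{
    char[] original = "hello".dup;
    size_t copies = 0;

    char[] a = toUpperCow(original);
    char[] b = toUpperNaive(original, copies);

    assert(a == "HELLO" && b == "HELLO");
    assert(original == "hello");   // the caller's data is never modified
    assert(copies == 5);           // five modified characters, five copies
    assert(toUpperCow(a) is a);    // nothing to change, so no copy at all
}
```

For a string of n characters that all change, the naive version copies n times (quadratic work overall), while the manual version copies once.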


> again I'm confused by the changes you've made to D, I quote from the D
> introduction
> "who is D for" item 5 ...
> Programmers who enjoy the expressive power of C++ but are frustrated by the
> need to expend much effort explicitly managing memory and finding pointer
> bugs
> and item 8
> Programmers who think the language should provide enough features to obviate
> the continual necessity to manipulate pointers directly

None of these new techniques are necessary to write D programs with. They are there for some very specialized uses where taking control of allocation/deallocation can get some performance gains.



February 11, 2003
Have you seen http://www.cs.purdue.edu/s3/projects/bloat/ ?
Although it's Java, the optimisations still apply to non-stack-based CPUs.

> > copy on write, why is this a manual op, it should be a built in behaviour,
> > (err on the side of caution if the compiler can not fully detect no write
> > occurs). this forces programmers to continually write the same code again
> > and again.
>
> It needs to be manual because languages that do it automatically suffer from
> truly terrible performance when doing things like uppercasing string
> contents one character at a time.
>

having iterators that are designed for such ops would be good

/* the iterator func */
template iterators( T ) {
    // writes the upper-cased element to repl; returns true if it differs from orig
    bit toUpperIt( in T orig, out T repl ) { repl = toUpper( orig ); return repl != orig; }
    alias bit (*ifunc)( in T, out T );
    // returns ar itself if iter changes nothing, otherwise a modified copy
    T[] modifyContents( T[] ar, ifunc iter ) {
        for ( int i = 0; i < ar.length; i++ ) {
            T tmp;
            if ( iter( ar[i], tmp ) )
            {
                // first change found: copy the untouched prefix, then convert the rest
                T[] nar = new T[ar.length];
                if (i>0) {nar[0..i] = ar[0..i];}
                nar[i] = tmp;
                while( ++i < ar.length ) {
                    iter( ar[i], tmp ); nar[i] = tmp;
                }
                return nar;
            }
        }
        return ar;
    }
}

If built into the language, ops such as
char[] ups = foo.modifyContents( myStr, &toUpper );
could be optimized into a fast version of char[] or wchar[] toUpper,
maybe `char[] ups  = myStr.iter( &toUpper );`

As it is currently, D has 3 types of array; let's call them blocks, vectors and
slices.
Blocks are int[7] or similar, so any op such as ~= will cause a new array
to be created.
Vectors are int[], so sometimes ops such as ~= will NOT cause the array to
be realloced.
And slices are ar[n..m], which are like blocks in that they cannot be
extended, and are aliases to another array.

import c.stdio;

int[] func( int[] a, int[]b )
{
    if ( a[0] < 4 ) a ~= b[0];
    return a[1..3];
}

int main( char[][] args )
{
 int [] foo;
 int [1] b; b[0] = 9;
 for( int i = 0; i< 2; i++ ) { foo ~= i; }
 int[] c = func( foo, b );
 foo ~= 1;
 printf("c[0] : %d\n", c[0] );
 return 0;
}

What is printed out: 9 or 1? (No cheating by compiling it first.)
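
The aliasing that makes the puzzle above tricky can be shown in isolation. This is a small sketch in present-day D syntax; it does not settle what the 2003 compiler printed for the program above, since appending behaviour has changed since then. It checks only the parts that can be stated with confidence: a slice shares storage with the array it came from, and an append may or may not move the data.

```d
void main()
{
    int[7] block;                  // fixed-size array ("block")
    int[] vector = [0, 1, 2, 3];   // dynamic array ("vector")
    int[] slice = vector[1 .. 3];  // view onto vector's elements ("slice")

    // A slice aliases the array it was taken from.
    slice[0] = 42;
    assert(vector[1] == 42);

    // Slicing a fixed-size array gives a view onto its storage as well.
    int[] view = block[];
    view[0] = 7;
    assert(block[0] == 7);

    // Appending may or may not move the data. Comparing pointers shows
    // which case occurred instead of assuming one.
    auto before = vector.ptr;
    vector ~= 4;
    bool moved = vector.ptr !is before;

    // If the data did not move, the slice still points into vector;
    // if it did move, the slice keeps referring to the old storage.
    assert(moved || slice.ptr == vector.ptr + 1);
    assert(vector.length == 5 && slice.length == 2);
}
```

Whether `moved` is true is up to the runtime, which is the point: code that depends on either answer is fragile.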



February 13, 2003
Walter wrote:

> > copy on write, why is this a manual op, it should be a built in behaviour, (err on the side of caution if the compiler can not fully detect no write occurs). this forces programmers to continually write the same code again and again.
>
> It needs to be manual because languages that do it automatically suffer from truly terrible performance when doing things like uppercasing string contents one character at a time.

GC, in order to be fast and useful, will eventually need optimizing compilers that are GC aware.  In the COW example you gave (and I'm assuming for the moment that COW *was* the rule), the compiler would initially render the program as making the many copies you are talking about.  However, the optimizer would then realize that all the intermediate copies are going to be garbage immediately, and collapse it down into a single copy-modify.

IMHO, forcing this complexity on the programmer instead of the compiler is Bad. It's better to have the early compilers not very optimized than to add this heavy burden on all D programmers for all time.

OTOH, I recognize and agree with your point that it should be (relatively) easy to make a D compiler; otherwise *no* D compiler anywhere will be standards compliant.

For that reason, I argue (again) that somehow we need an optimization/translation library for D.  I don't yet know how this would work, but it would be some sort of open source, standard set of mappings that all compilers could integrate into their parser and their optimizers.  It would provide a consistent set of syntax sugar and optimizations for all compilers to leverage.  Compilers would differentiate themselves by what they added to that standard set.  I don't know how this would work, exactly, but I think that long term it's the only solution for a good marketplace of optimizing, standards-compliant compilers.

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]


February 13, 2003
This can be accomplished by keeping user count information during compilation. If the user count is unknown, it is assumed to be >1, else it's exactly 1. If it's exactly 1, data doesn't need to be copied.

This way, function input will be assumed to have >1 count and copied on the first write, then no more. A more advanced compiler could promote this information beyond function boundaries where statically possible.

BTW, is there any way to store this information in the array header to make runtime decisions? It's only 1 bit large...

There also has to be a way to simplify implementing a copy-on-write convention manually?

-i.



February 14, 2003
"Russ Lewis" <spamhole-2001-07-16@deming-os.org> wrote in message news:3E4BA572.242876D2@deming-os.org...
> GC, in order to be fast and useful, will eventually need optimizing compilers
> that are GC aware.  In the COW example you gave (and I'm assuming for the moment
> that COW *was* the rule), the compiler would initially render the program as
> making the many copies you are talking about.  However, the optimizer would then
> realize that all the intermediate copies are going to be garbage immediately,
> and collapse it down into a single copy-modify.
>
> IMHO, forcing this complexity on the programmer instead of the compiler is Bad.
> It's better to have the early compilers not very optimized than to add this
> heavy burden on all D programmers for all time.
>
> OTOH, I recognize and agree with your point that it should be (relatively) easy
> to make a D compiler; otherwise *no* D compiler anywhere will be standards
> compliant.

If this optimization were reasonably straightforward to do, I would fully agree with you that D should do COW automatically. But I don't see how to do it reliably, especially for non-trivial examples. I also don't know of any COW language compiler that is able to do such optimizations, and from that I infer it is a very hard problem.



> For that reason, I argue (again) that somehow we need an
> optimization/translation library for D.  I don't yet know how this would work,
> but it would be some sort of open source, standard set of mappings that all
> compilers could integrate into their parser and their optimizers.  It would
> provide a consistent set of syntax sugar and optimizations for all compilers to
> leverage.  Compilers would differentiate themselves by what they added to that
> standard set.  I don't know how this would work, exactly, but I think that long
> term it's the only solution for a good marketplace of optimizing,
> standards-compliant compilers.

At least the lexer/parser/semantic analysis code is open source, for just the reason you state - to make it easy for others to do compatible implementations of D.


February 14, 2003
(correcting myself)

Ilya Minkov wrote:
> This can be accomplished by keeping user count information during compilation. If the user count is unknown, it is assumed to be >1, else it's exactly 1. If it's exactly 1, data doesn't need to be copied.

Sorry, it's nonsense. It doesn't work in that manner. :)

> BTW, is there any way to store this information in the array header to make runtime decisions? It's only 1 bit large...

And thus slow ourselves down to interpreted languages. No, thanks. :)
It might be faster to copy than to make decisions all the time...

> There also has to be a way to simplify implementing a copy-on-write convention manually?

Still needs thinking about...
A possible solution would be to "always copy" in alpha code, then change it into copy-on-write someday...

> -i.
-i.