Garbage Collection is bad... other comments

January 12, 2003

Posted by rafael baptista

I just saw the D spec today... and it is quite wonderful. There are so many ways that C and C++ could be better and D addresses them. But I think D is a little too ambitious to its detriment. It has a good side: it fixes syntactic and semantic problems with C/C++. It eliminates all kinds of misfeatures and cruft in C++.

And D has a bad side: it attempts to implement all kinds of features that are best left to the programmer - not the language spec: built in string comparison, dynamic arrays, associative arrays, but the worst in my mind is the inclusion of garbage collection.

As is pointed out in the D overview, garbage collection makes D unusable for real time programming - this makes it unusable for kernel programming, device drivers, computer games, any real time graphics, industrial control, etc. It limits the language in a way that is really not necessary. Why not just add language features that make it easier to integrate GC without making it a part of the language?

For my part I wouldn't use GC on any project in any case. My experience has been that GC is inferior to manually managing memory in every case anyway. Why?

Garbage collection is slower: There is an argument in the Garbage collection section of the D documents that GC is somehow faster than manually managing memory. Essentially the argument boils down to that totally bungling manual memory management is slower than GC. Ok that is true. If I program with tons of redundant destructors, and have massive memory leaks, and spend lots of time manually allocating and deallocating small buffers, and constantly twiddling thousands of ref counts then managing memory manually is a disaster. Most programs manage memory properly and the overhead of memory management is not a problem.

The argument that GC does not suffer from memory leaks is also invalid. In a GC'd system, if I leave around references to memory that I never intend to use again, I am leaking memory just as surely as if I did not call delete on an allocated buffer in C++.

Finally there is the argument that manually managing memory is somehow difficult and error prone. In C++ managing memory is simple - you create a few templatized container classes at the start of a project. They will allocate and free *all* of the memory in your program. You verify that they work correctly and then write the rest of your program using these classes. If you are calling "new" all over your C++ code you are asking for trouble. The advantage is that you have full control over how memory is managed in your program. I can plan exactly how I will deal with the different types of memory and where my data structures are placed in memory. I can code explicitly to make sure that certain data structures will be in cache memory at the same time.
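One possible shape for such a container class, as a minimal C++ sketch rather than anything taken from the post: a fixed-capacity pool through which every object of one type is created and destroyed. The names Pool, create, destroy and Particle are hypothetical, and the sketch assumes constructors do not throw and that the caller destroys every object before the pool goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical fixed-capacity object pool: every T is created and destroyed
// through this one class, never through a bare `new` in application code.
// Assumptions: T's constructor does not throw, and the caller destroys all
// live objects before the pool itself is destroyed.
template <typename T, std::size_t N>
class Pool {
public:
    Pool() {
        // Thread every slot onto the free list; N marks "no free slot".
        for (std::size_t i = 0; i < N; ++i) next_[i] = i + 1;
    }
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    // Construct a T in a free slot; returns nullptr when the pool is full.
    template <typename... Args>
    T* create(Args&&... args) {
        if (head_ == N) return nullptr;
        std::size_t i = head_;
        head_ = next_[i];
        ++live_;
        return ::new (static_cast<void*>(storage_ + i * sizeof(T)))
            T(std::forward<Args>(args)...);
    }

    // Destroy an object previously returned by create() and recycle its slot.
    void destroy(T* p) {
        p->~T();
        std::size_t i = static_cast<std::size_t>(
            reinterpret_cast<unsigned char*>(p) - storage_) / sizeof(T);
        next_[i] = head_;
        head_ = i;
        --live_;
    }

    std::size_t live() const { return live_; }

private:
    alignas(T) unsigned char storage_[N * sizeof(T)];  // all slots, contiguous
    std::size_t next_[N];                               // free-list links
    std::size_t head_ = 0;                              // first free slot
    std::size_t live_ = 0;                              // objects currently alive
};

// Hypothetical payload type.
struct Particle {
    Particle(float x_, float y_, int ttl_) : x(x_), y(y_), ttl(ttl_) {}
    float x, y;
    int ttl;
};
```

Because all slots sit in one contiguous array, objects that are used together also stay close together in memory, which is the kind of placement control the paragraph above is arguing for.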

I had a similar reaction to having the compiler automatically build me hash tables, string classes and dynamic arrays. I normally code these things myself with particular attention to how they are going to be used. The unit test stuff is also redundant. I can choose to implement that stuff whether the language supports it or not. I normally make unit test programs that generate output that I can regress against earlier builds - but I wouldn't want to saddle arbitrary programs with having to run my often quite extensive and time consuming unit tests at startup. Imaginary numbers and bit fields should similarly be left up to the programmer.

I suppose I can just choose not to use these features if I don't want to - but it strikes me as unnecessary cruft - and I subscribe to the idea beautifully expressed in the D overview that removing cruft makes a language better.

The GC is worse though - because it's in there operating - arbitrarily stalling my program whether I use it or not.

I urge the implementor of D to think about it this way: you have lots of ideas. Many of them are really good - but it is likely that despite your best efforts not all of them are. It would make your project much more likely to succeed if you decoupled the ideas as much as possible such that the good ones can succeed without being weighed down by the bad. So wherever possible I think you should move functionality from the language spec to a standard library. This way the library can continue to improve, while the language proper could stabilize early.

This is probably old ground - I assume lots of people have probably already commented on the GC. I looked for a thread on GC and could not find one.

I for one would find D a lot more attractive if it did not have GC.

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Clay
in reply to rafael baptista

In article <avsfoq$6nc$1@digitaldaemon.com>, rafael baptista says...
>I for one would find D a lot more attractive if it did not have GC.

I disagree.  For business applications, it is nice not to have to spend time thinking about memory management, because it isn't worth the gains at runtime.  For simple scenarios, STL and GC are probably the same complexity for the programmer. When the variable goes out of scope within a method, it can be cleaned up.

But for more complicated code, it helps that we think less about memory management and more about the software.  For example, when an object is created in class A, given to class B and C, and either B or C will clean it up (it is not known ahead of time), a GC is much easier to write for.
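For comparison, here is how that A/B/C scenario tends to look without a tracing collector, as a rough C++ sketch. It uses std::shared_ptr, which is reference counting rather than GC and comes from a C++ standard newer than this thread. Report, A, B, C and publish are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical shared object; it bumps a counter when it is finally freed.
struct Report {
    explicit Report(int* counter) : destroyed(counter) {}
    ~Report() { ++*destroyed; }
    int* destroyed;
};

// Hypothetical consumers: each keeps the report for as long as it needs it.
struct B { std::shared_ptr<Report> report; };
struct C { std::shared_ptr<Report> report; };

// A creates the object and hands it to both consumers; A keeps no reference,
// so whichever of B or C lets go last ends up freeing the report.
struct A {
    static void publish(B& b, C& c, int* destroyed) {
        auto r = std::make_shared<Report>(destroyed);
        b.report = r;
        c.report = r;
    }
};
```

Neither B nor C has to know whether it is the last user, but every hand-off of the pointer now pays for a count update, which is part of what the rest of this thread argues about.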

It comes down to, "who is D for?". But, I wonder if it is possible to create a language that allows for both GC and STL memory management.

disclaimer: I am familiar with STL, but I have never used it.

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Evan McClanahan
in reply to Clay

Clay wrote:
> In article <avsfoq$6nc$1@digitaldaemon.com>, rafael baptista says...
> 
>>I for one would find D a lot more attractive if it did not have GC.
> 
> 
> I disagree.  For business applications, it is nice not to have to spend time
> thinking about memory management, because it isn't worth the gains at runtime.  For
> simple scenarios, STL and GC are probably the same complexity for the programmer.
> When the variable goes out of scope within a method, it can be cleaned up.
> 
> But for more complicated code, it helps that we think less about memory
> management and more about the software.  For example, when an object is created
> in class A, given to class B and C, and either B or C will clean it up (it is
> not known ahead of time), a GC is much easier to write for.
> 
> It comes down to, "who is D for?". But, I wonder if it is possible to create a
> language that allows for both GC and STL memory management.

I know Mr. Baptista by reputation, so I think that I can anticipate some of his concerns (if he's the same person I'm thinking of).  As he's a games programmer, he's very concerned with more or less realtime performance, and performance with small bits of memory to allocate. I've thought of many of the same issues, being in the same field, and I think that it's more a matter of problem domain than anything else.  I was trying to think of a way to do GC on a console, and I've basically come to the conclusion that it's more trouble than it would be worth.  While GC is great for people with less pressing time and space concerns, like yourself, seemingly, it lacks a lot of fine grained control, and regaining that control would almost require writing a special case GC for every game, which would certainly take as much time as all of the memory management and debugging on a project.  I think that his argument is similar to something that I've been saying, which is that memory management should be more cleanly modularized, so that one can pick and choose among strategies, without much language level interference.

The unittest comments are less well founded, since unittests go away in release builds.  I imagine that it's easy enough to do testing with the systems that are in there without many problems.  Kinda a non-issue for me.

I agree that some of the builtins should be moved into the library, but I don't really have the time right now to go very deeply into it.

Evan

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by rafael b aptista
in reply to Evan McClanahan

In article <avu432$224e$1@digitaldaemon.com>, Evan McClanahan says...
>I know Mr. Baptista by reputation, so I think that I can anticipate some of his concerns (if he's the same person I'm thinking of).  As he's a games programmer, he's very concerned with more or less realtime performance, and performance with small bits of memory to allocate.

Yeah, that describes me pretty well. Wow. I had no idea I had a reputation! Evan, you and I are in perfect agreement on GC, I think - it may be good for some other projects but not for what we do.


>The unittest comments are less well founded, since unittests go away in release builds.

Yeah, I don't agree with you about that. I think unit tests are a good idea. Some people would implement them as a series of quick sanity checks on their libraries that would run at load time in debug builds. A proper unit test in my mind is a console app that does an exhaustive check on all of your code and tests all of your assumptions and prints out a long log of all the things it tried. Every time you make changes to a lib, you run the unit test and then diff the log against a previous run of the log and make sure that the results are the same - or that they only differ in expected ways. You then check into source code control the updated diff log. I have known companies that have implemented source code control in such a way that you cannot check your files back in until the system has regressed all the unit tests.

Running unit tests at startup time has many disadvantages:

1. Even in a debug build you have limits on how long you are willing to wait through a unit test. With an offline unit test you can make it do everything you want.

2. Unit tests are better as text oriented apps, and most modern applications are graphical. You would have to make the unit test spew to a log, and then go check the log. Similarly it is a pain to try to run a GUI app in a script. So if you want to make scripts that regress your libraries you have to make dummy console apps that only run the unit test in a text mode.

3. Unit tests running at application startup time are not running when they should run. You want a unit test to run every time you change the code that it tests - not necessarily every time you start up your main application.

Making a unit test function for a class be a class member function has another disadvantage: You are running the unit test after the class constructor, and the unit test is oriented toward testing only that one instance. To unit test a class I prefer to make a non-member function that exercises the whole API in one function - including all the constructors.
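A minimal C++ sketch of that style of test, under my own assumptions: Stack is a hypothetical class under test, and testStack is a non-member function that exercises both constructors and every method and returns a deterministic log. A console driver would print the log so it can be saved and diffed against the previous run.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical class under test: a tiny bounded stack of ints.
class Stack {
public:
    Stack() = default;
    explicit Stack(int first) { push(first); }
    void push(int v) { if (size_ < kMax) data_[size_++] = v; }  // ignores overflow
    int pop() { return size_ > 0 ? data_[--size_] : -1; }       // -1 when empty
    int size() const { return size_; }
private:
    static constexpr int kMax = 8;
    int data_[kMax] = {};
    int size_ = 0;
};

// Non-member test: covers every constructor and method and returns a
// deterministic log that can be written to a file and diffed between runs.
std::string testStack() {
    std::ostringstream log;
    Stack empty;
    log << "default ctor size=" << empty.size() << '\n';
    log << "pop on empty=" << empty.pop() << '\n';
    Stack one(7);
    log << "value ctor size=" << one.size() << '\n';
    one.push(9);
    log << "after push size=" << one.size() << '\n';
    log << "pop=" << one.pop() << '\n';
    log << "pop=" << one.pop() << '\n';
    log << "final size=" << one.size() << '\n';
    return log.str();
}
```

Because the function is not a member, it decides which constructors to call instead of running against an instance that already exists.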

Experienced programmers can disagree about the best way to do unit tests - and that is exactly why unit tests should not be built in to the language.

The way that D supports unit tests is not the way most programmers who implement unit tests choose to implement them. So you are codifying into the language something which is not established as the best practice anyway.

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Evan McClanahan
in reply to rafael b aptista

rafael b aptista wrote:
> In article <avu432$224e$1@digitaldaemon.com>, Evan McClanahan says...
> 
>>I know Mr. Baptista by reputation, so I think that I can anticipate some of his concerns (if he's the same person I'm thinking of).  As he's a games programmer, he's very concerned with more or less realtime performance, and performance with small bits of memory to allocate. 
> 
> 
> Yeah, that describes me pretty well. Wow. I had no idea I had a reputation!
> Evan, you and I are in perfect agreement on GC, I think - it may be good for
> some other projects but not for what we do.

I applied to your company a long time ago (assuming that you're who I think that you are and not someone with the same name), which is how I know who you are (or might be). Game programming is a smallish world anyway. Some interesting stuff on your site.  I assumed that your perspective would be coming from GBA development, and went from there.  A GC would be disastrous in that environment, and I think that it would be bad for a lot of system level stuff as well.  But then, I think that my perfect language would be almost nothing in the system core, a robust syntax extender, and a biggish library of standard extensions, so that might explain the slant in my perspective.

>>The unittest comments are less well founded, since unittests go away in release builds. 
> 
> Yeah, I don't agree with you about that. I think unit tests are a good idea.
> Some people would implement them as a series of quick sanity checks on their
> libraries that would run at load time in debug builds. A proper unit test in my
> mind is a console app that does an exhaustive check on all of your code and
> tests all of your assumptions and prints out a long log of all the things it
> tried. Every time you make changes to a lib, you run the unit test and then diff
> the log against a previous run of the log and make sure that the results are the
> same - or that they only differ in expected ways. You then check into source
> code control the updated diff log. I have known companies that have implemented
> source code control in such a way that you cannot check your files back in until
> the system has regressed all the unit tests.
> 
> Running unit tests at startup time has many disadvantages:
> 
> 1. Even in a debug build you have limits on how long you are willing to wait
> through a unit test. With an offline unit test you can make it do everything you
> want.
> 
> 2. Unit tests are better as text oriented apps, and most modern applications are
> graphical. You would have to make the unit test spew to a log, and then go check
> the log. Similarly it is a pain to try to run a gui app in a script. So if you
> want to make scripts that regress your libraries you have to make dummy console
> apps that only run the unit test in a text mode.
> 
> 3. Unit tests running at application startup time are not running when they
> should run. You want a unit test to run every time you change the code that it
> tests - not necessarily every time you start up your main application. 
> 
> Making a unit test function for a class be a class member function has another
> disadvantage: You are running the unit test after the class constructor, and the
> unit test is oriented toward testing only that one instance. To unit test a
> class I prefer to make a non-member function that exercises the whole API in one
> function - including all the constructors.
> 
> Experienced programmers can disagree about the best way to do unit tests - and
> that is exactly why unit tests should not be built in to the language. 
> 
> The way that D supports unit tests is not the way most programmers who implement
> unit tests choose to implement them. So you are codifying into the language
> something which is not established as the best practice anyway.

Ok, I can see your point.  I don't think, though, that they're insanely harmful, as you can still write traditional unittests of the kind that you describe, and if people decide that the unittests don't work as a normal feature, then they'll go away eventually.  I've used the other DBC features more, so you're likely right.  Without a proper computer or internet connection at home, I haven't had a chance to program in D as much as I would like to, and the programs that I have had a chance to write were untested and undocumented hacks, so what do I know about testing? :)

Evan

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Sean L. Palmer
in reply to rafael b aptista

"rafael b aptista" <rafael_member@pathlink.com> wrote in message news:avukdf$2f0g$1@digitaldaemon.com...
> In article <avu432$224e$1@digitaldaemon.com>, Evan McClanahan says...
> >I know Mr. Baptista by reputation, so I think that I can anticipate some of his concerns (if he's the same person I'm thinking of).  As he's a games programmer, he's very concerned with more or less realtime performance, and performance with small bits of memory to allocate.
>
> Yeah, that describes me pretty well. Wow. I had no idea I had a reputation!
> Evan, you and I are in perfect agreement on GC, I think - it may be good for
> some other projects but not for what we do.

On a console game, it's always possible to just allocate the entire memory from the system and manage it yourself.  That seems to be what everybody does in C++.  In D, you can disable GC, and even if you don't, it doesn't run unless you run out of memory.

The problem is that D doesn't have good RAII semantics, so implementing your own memory management is a painful process resembling C++ explicit new and delete.  That method is bug ridden.

In fact I am not sure D allows you to run ctors on arbitrary memory, so it wouldn't be possible to write your own 'new' function template.
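For reference, this is roughly what "running ctors on arbitrary memory" looks like on the C++ side of the comparison. It is a sketch under stated assumptions and says nothing about what D permits. Arena, construct and Enemy are hypothetical names, and the block handed to the arena is assumed to be suitably aligned already.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical bump allocator over one block reserved up front.
// Assumption: `base` is aligned at least as strictly as any type stored here.
class Arena {
public:
    Arena(unsigned char* base, std::size_t size) : base_(base), size_(size) {}

    // Returns `bytes` of storage aligned to `align` (a power of two),
    // or nullptr when the arena is exhausted.
    void* allocate(std::size_t bytes, std::size_t align) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > size_ || bytes > size_ - start) return nullptr;
        used_ = start + bytes;
        return base_ + start;
    }

    void reset() { used_ = 0; }  // releases everything at once, runs no destructors
    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t size_;
    std::size_t used_ = 0;
};

// A user-written 'new' function template: take raw memory from the arena,
// then run T's constructor on it with placement new.
template <typename T, typename... Args>
T* construct(Arena& arena, Args&&... args) {
    void* mem = arena.allocate(sizeof(T), alignof(T));
    return mem ? ::new (mem) T(std::forward<Args>(args)...) : nullptr;
}

// Hypothetical game object.
struct Enemy {
    explicit Enemy(int hp_) : hp(hp_) {}
    int hp;
};
```

The caller is responsible for invoking the destructor explicitly (e->~Enemy()) before resetting the arena, which is the bookkeeping the post calls painful.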

> >The unittest comments are less well founded, since unittests go away in release builds.
>
> Yeah, I don't agree with you about that. I think unit tests are a good idea. Some people would implement them as a series of quick sanity checks on their libraries that would run at load time in debug builds. A proper unit test in my mind is a console app that does an exhaustive check on all of your code and tests all of your assumptions and prints out a long log of all the things it tried. Every time you make changes to a lib, you run the unit test and then diff the log against a previous run of the log and make sure that the results are the same - or that they only differ in expected ways. You then check into source code control the updated diff log. I have known companies that have implemented source code control in such a way that you cannot check your files back in until the system has regressed all the unit tests.
>
> Running unit tests at startup time has many disadvantages:
>
> 1. Even in a debug build you have limits on how long you are willing to wait through a unit test. With an offline unit test you can make it do everything you want.
>
> 2. Unit tests are better as text oriented apps, and most modern applications are graphical. You would have to make the unit test spew to a log, and then go check the log. Similarly it is a pain to try to run a GUI app in a script. So if you want to make scripts that regress your libraries you have to make dummy console apps that only run the unit test in a text mode.
>
> 3. Unit tests running at application startup time are not running when they should run. You want a unit test to run every time you change the code that it tests - not necessarily every time you start up your main application.
>
> Making a unit test function for a class be a class member function has another disadvantage: You are running the unit test after the class constructor, and the unit test is oriented toward testing only that one instance. To unit test a class I prefer to make a non-member function that exercises the whole API in one function - including all the constructors.

Unit tests can be global module-scope functions too.

> Experienced programmers can disagree about the best way to do unit tests - and that is exactly why unit tests should not be built in to the language.

Seems like it's more of a

#ifdef UNITTESTS
RunUnitTest();
#endif // UNITTESTS

> The way that D supports unit tests is not the way most programmers who implement unit tests choose to implement them. So you are codifying into the language something which is not established as the best practice anyway.

I kinda like integrating the code for the unit tests into the module it tests.  But I would like more control over when they are run.

Sean

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Russell Lewis
in reply to rafael baptista

rafael baptista wrote:
> I just saw the D spec today... and it is quite wonderful. There are so many ways
> that C and C++ could be better and D addresses them. But I think D is a little
> too ambitious to its detriment. It has a good side: it fixes syntactic and
> semantic problems with C/C++. It eliminates all kinds of misfeatures and cruft
> in C++.
> 
> And D has a bad side: it attempts to implement all kinds of features that are
> best left to the programmer - not the language spec: built in string comparison,
> dynamic arrays, associative arrays, but the worst in my mind is the inclusion of
> garbage collection.

gc.disable();

In Walter's compiler (although other compilers will differ in the future, I suppose) the GC only runs when the memory allocation routines have run out of buffers that they've gotten from the OS.  That is, the underlying routines for 'new' grab somewhat large blocks of virtual memory from the OS, and piece them up into individual allocations.  When the allocator has used all of its memory, it runs the garbage collector BEFORE it asks the OS for more memory.  It only talks to the OS if it cannot find any place in its current space to make the allocation.

If you disable the garbage collector, then it just doesn't do GC when the memory allocator runs out of memory.  Just use 'new' and 'delete' as you would have in C++.

You can also just disable the gc for a small time:
	gc.disable();
	DoTimeCriticalThing();
	gc.enable();
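The policy described above can be modelled in a few lines of C++. This is a toy that only tracks byte counts; it is not DMD's allocator, and ToyHeap, abandon and the counters are names I made up for the illustration. The point is the order of operations: collect first (if enabled), and only then ask for more memory.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of "collect before growing". It tracks byte counts only and
// is NOT how DMD's runtime is implemented.
class ToyHeap {
public:
    // Starts with one chunk already reserved.
    explicit ToyHeap(std::size_t chunk) : chunk_(chunk), reserved_(chunk) {}

    void enable() { enabled_ = true; }
    void disable() { enabled_ = false; }

    // Marks `bytes` of live memory as unreachable (stand-in for dropped
    // references). The caller must not abandon more than is currently live.
    void abandon(std::size_t bytes) { garbage_ += bytes; }

    void allocate(std::size_t bytes) {
        // Out of space: try a collection first, but only if it is enabled.
        if (used_ + bytes > reserved_ && enabled_) collect();
        // Still out of space: fall back to requesting more chunks.
        while (used_ + bytes > reserved_) {
            reserved_ += chunk_;  // stand-in for asking the OS for memory
            ++osRequests_;
        }
        used_ += bytes;
    }

    std::size_t used() const { return used_; }
    std::size_t collections() const { return collections_; }
    std::size_t osRequests() const { return osRequests_; }

private:
    void collect() {
        used_ -= garbage_;  // reclaim everything that was abandoned
        garbage_ = 0;
        ++collections_;
    }

    std::size_t chunk_;
    std::size_t reserved_;
    std::size_t used_ = 0;
    std::size_t garbage_ = 0;
    std::size_t collections_ = 0;
    std::size_t osRequests_ = 0;
    bool enabled_ = true;
};
```

In this model, disabling collection never makes an allocation fail; it just turns every out-of-space event into a request for more memory, which matches the description of using 'new' and 'delete' as you would in C++.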

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Mike Wynn
in reply to Sean L. Palmer

> On a console game, it's always possible to just allocate the entire memory from the system and manage it yourself.  That seems to be what everybody does in C++.  In D, you can disable GC, and even if you don't, it doesn't run unless you run out of memory.

And you can therefore create all your required objects up front, and manage
them manually
(as Java games / embedded programmers do).

> The problem is that D doesn't have good RAII semantics, so implementing your
> own memory management is a painful process resembling C++ explicit new and
> delete.  That method is bug ridden.
> In fact I am not sure D allows you to run ctors on arbitrary memory, so it
> wouldn't be possible to write your own 'new' function template.

Agreed, and D with placement constructors would be good. The effect on the GC
(if not disabled) might be undesirable, though,
and it would allow some nasty 'unions' to be created (constructing two objects in
the same space).
Not only would you need to inform the GC that a memory range was under your
control, but any object in there would still need to be walked, as it could
contain refs to GC-managed objects (one of the stumbling blocks with
realtime Java).

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Ilya Minkov
in reply to rafael baptista

Please believe me, garbage collection (GC) is the future of game design. "Realtime programming" doesn't really include games and other things called "realtime", since they have to cope with disc access and such, the graphics card, and other devices, which are NOT REALTIME! You send some command and wait for it to be completed, and this usually lasts much, much, much, much longer than a GC cycle.

You don't even have to believe me. Read the articles by TAD in HUGI #18, I guess (www.hugi.de). He proposes highly efficient structures for games, which are highly complex meshes of (concave) objects. As I read this through I was fascinated, but as soon as I thought about the implementation, I got sudden strong headaches. Later, as I was learning about Hans Boehm's GC for C, and then about functional programming in OCaml (a GC-ed language), it became evident to me that structures such as those proposed by TAD can be implemented very easily and efficiently with GC, and only this way. Ref-counting ultimately fails in many cases.
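The post does not say which cases it has in mind; the textbook one is a cycle of owning references, which densely linked meshes produce easily. A small C++ sketch of that assumption, using std::shared_ptr as the reference counter (Node, liveNodes and both functions are hypothetical names):

```cpp
#include <cassert>
#include <memory>

static int liveNodes = 0;  // Node objects constructed but not yet destroyed

struct Node {
    Node() { ++liveNodes; }
    ~Node() { --liveNodes; }
    std::shared_ptr<Node> next;  // owning link
    std::weak_ptr<Node> back;    // non-owning link
};

// Two nodes that own each other: after the last outside reference is gone,
// the counts never reach zero, so pure ref-counting leaves both alive.
int leakedAfterStrongCycle() {
    int before = liveNodes;
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;  // cycle of owning references
        watch = a;
    }
    int leaked = liveNodes - before;
    // Break the cycle by hand so that the demo itself does not leak.
    if (auto a = watch.lock()) a->next.reset();
    return leaked;
}

// Same shape, but the back edge is weak, so both nodes are freed normally.
int leakedAfterWeakBackEdge() {
    int before = liveNodes;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->back = a;
    }
    return liveNodes - before;
}
```

The weak-pointer fix only works when the programmer can decide up front which edges own and which do not. A tracing collector needs no such decision, which fits the argument being made here.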

To the question of efficiency. Though the current implementation of the D GC is said to be very simple-minded and thus not very efficient, there exists a very efficient implementation for C which is fairly popular. I've read of experiences of creating a simple IDE (Wedit) for LCC-Win32, a very simple C compiler. The author claims that plugging the Hans Boehm GC into the project reduced the time required to load a project to 1/8 of the original, because GC-malloc is simply better optimized than the normal one.

The manual says that Boehm GC proves to be significantly faster for almost any application managing many tiny objects, but tends to be slower for applications working with large objects only. The reason for the first claim is that it performs a number of optimisations (heap compaction), helping keep objects which are used together near each other, which drastically reduces cache misses. As you might know, the cache is about as fast as the CPU, while RAM speed has been improving much more slowly than CPU/cache speed; the impact is already huge, and is likely to grow in the future. I guess the D GC doesn't do such things yet, but they are to come. These optimisations can only be efficiently implemented in conjunction with a GC, that is, if you do them, you've got a GC basically for free.

The second claim is based upon the fact that Boehm GC tends to spend too much time figuring out what is a pointer and what is not by content. You can limit such pointless search for large pointerless (pun intended) objects, but it doesn't solve the issue in general. The D GC uses type information and *never* searches large masses without use, so that you can assume that it should be faster than Boehm GC for large objects. That is, the D GC should also work cleaner.

And if you're still not convinced, go on and turn it off. You're just not doing yourself a favor. The D GC does the same amount of work that ref-counting does, but only once in a while, while ref-counting has to do that at almost every operation.

You can also cause GC cycles by timer once every now and then, keeping memory usage constantly low and cycle times short, at the cost of making them longer in general. The way I'm currently aware of is to disable and then enable the GC; I'm not sure it works.

regards,
-i./midiclub

January 13, 2003

Re: Garbage Collection is bad... other comments

Posted by Burton Radons
in reply to Ilya Minkov

Ilya Minkov wrote:
> The second claim is based upon the fact, that Boehm GC tends to spend too much time figuring out what is a pointer and what not by content. You can limit such pointless search for large pointerless (pun intended) objects, but it doesn't solve the issue in general. D GC uses type information and *never* searches large masses without use, so that you can assume that it should be faster than Boehm GC for large objects. That is, D gc should also work cleaner.

Type-based searching isn't implemented yet.  DMD should produce a super-optimised scanning function for each allocated and public type in this module.

D's GC could also do block allocation for small objects or those which have proper characteristics in previous runs (good but fairly constant use, with high turnover of allocation and freeing).  That makes allocation and freeing take just a few cycles and reduces overhead.

The effect of this is incalculable at this time as it gets into cache issues, but I've seen the content-to-pointers ratio go into the hundreds when dealing with mesh vertices (textures were uploaded and then freed, otherwise it would be one pointer for 700 KB of data).  I'm very optimistic that I'll be able to continually run the GC on any game.  But I'm even more sure that it's worth working for.

If the GC delay is still too great, then all games have sequence points.  When the player changes a level or a discrete segment, or is dead, or is clearly AFK, or if the player is swapping out, loading, saving, switching from the menu, or is not being attacked, that is your time.
