[Issue 2858] New: D specs allow GC implementations that don't call finalizers
April 19, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858

           Summary: D specs allow GC implementations that don't call
                    finalizers
           Product: D
           Version: 1.043
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Keywords: spec
          Severity: major
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: llucax@gmail.com


The D spec says (http://www.digitalmars.com/d/1.0/class.html#destructors):

    The garbage collector is not guaranteed to run the destructor for all
    unreferenced objects.

This means a conforming D implementation can have a GC implementation that never calls finalizers at all when collecting (delete should still call the finalizer according to the specs).

I think the current situation is the worst it can be. It makes the language really weak. The current specs make any current D program using non-deterministic finalizers broken.

To fix this, several paths can be taken:
1) Guarantee finalization, at least at program end
2) Remove finalizers completely from the collection (leaving them for use only
with deterministic destruction, scope, delete, etc.)

For 2) This can be written in the specs instead of the current paragraph:

    The garbage collector doesn't run the destructor when objects are
    collected.

But most D programs relying on destructors being called during collection will break (well, they are broken now, but if this is implemented that brokenness will be exposed).

But 1) would be much better, and it is easy to implement too. The call to
gc_term() should be moved outside the try/catch block in the D main() function,
and gc_term() should call finalizers for *all* the live objects. There is no need
to run a collection (as it currently does); we don't need to recover free
memory, just call the finalizers. So this fix will probably be even more
efficient than the current approach.

The specs should be changed to say something like:

    The garbage collector is guaranteed to run the destructor for all
    unreferenced objects, at least at program exit. At program exit, all
    destructors are called, for referenced and unreferenced objects.
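
As a rough illustration of option 1), here is a toy model in Python (not D runtime code). ToyHeap, terminate() and run_main() are hypothetical names invented for this sketch; terminate() stands in for the proposed gc_term() behaviour, which finalizes every live object without running a collection, and it is called outside the try block so it runs even if the user's main() throws.

```python
class ToyHeap:
    """Toy stand-in for the GC heap: remembers every object it allocated."""

    def __init__(self):
        self.live = []  # (name, finalizer) pairs not yet finalized

    def allocate(self, name, finalizer):
        self.live.append((name, finalizer))

    def terminate(self):
        """Hypothetical gc_term(): finalize every live object, referenced
        or not. There is no mark phase and no memory is reclaimed."""
        while self.live:
            name, finalizer = self.live.pop()
            finalizer(name)


def run_main(heap, user_main):
    """Hypothetical runtime entry point wrapping the user's main()."""
    try:
        user_main(heap)
    except Exception as exc:
        print("uncaught exception:", exc)
    # Outside the try block, so it runs whether or not user_main() threw.
    heap.terminate()
```

In this model, objects allocated by a main() that ends with an uncaught exception still get their finalizers called, which is the guarantee the proposed wording asks for.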


More discussion on the topic:
http://proj.llucax.com.ar/blog/dgc/blog/post/-43101db1
http://www.digitalmars.com/d/archives/digitalmars/D/GC_object_finalization_not_guaranteed_88298.html


-- 

April 19, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858


smjg@iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg@iname.com
         OS/Version|Linux                       |All




------- Comment #1 from smjg@iname.com  2009-04-19 16:14 -------
> I think the current situation is the worst it can be. It makes the language really weak. The current specs make any current D program using non-deterministic finalizers broken.

Indeed.  What's the point of finalizers if you can't use them?

> To fix this, several paths can be taken:
> 1) Guarantee finalization, at least at program end
> 2) Remove finalizers completely from the collection (leaving them for use only
> with deterministic destruction, scope, delete, etc.)

This would destroy a significant portion of GC's usefulness and break various GUI libraries.

> The specs should be changed to say something like:
> 
>     The garbage collector is guaranteed to run the destructor for all
>     unreferenced objects, at least at program exit. At program exit, all
>     destructors are called, for referenced and unreferenced objects.

So it could wait until program exit before running _any_ destructors?  Where would it keep the objects it collects in the meantime in order that it can run the destructors on exit?

Perhaps better:

    The garbage collector runs the destructor for all unreferenced objects
    before freeing their memory.

followed by either the second sentence of your proposed rewrite or this:

    However, destructors are not guaranteed to be run on program exit, but
    the programmer can force them to be run by calling gc_term() immediately
    before termination of the program.

I presume Runtime.terminate() in D2 would do the same in any case.


-- 

April 19, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858


fawzi@gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fawzi@gmx.ch




------- Comment #2 from fawzi@gmx.ch  2009-04-19 17:36 -------
I agree that making the finalizers more deterministic is a good idea. As discussed in the NG I see two problems:

1) the use of finalizers for any non-memory-related resource is dangerous and should be avoided if possible. This is because GC collections happen when memory is constrained, not when the resource is constrained, and so one might unwittingly exhaust it. Thus making people use finalizers more and rely on the GC is not necessarily a good idea.

2) There is a problem with daemon threads, threads that offer a service or listen for events. It is not always good/easy to stop them, and you risk strange errors. And no, not using the GC in them is not a good option.

Fawzi


-- 

April 19, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858





------- Comment #3 from smjg@iname.com  2009-04-19 18:31 -------
(In reply to comment #2)
> I agree that making the finalizers more deterministic is a good idea. As discussed in the NG I see two problems:
> 
> 1) the use of finalizers for any non-memory-related resource is dangerous and should be avoided if possible. This is because GC collections happen when memory is constrained, not when the resource is constrained, and so one might unwittingly exhaust it. Thus making people use finalizers more and rely on the GC is not necessarily a good idea.

SDWF tries to get around this by running a collection if creation of a GDI object fails, in case it frees some system resources in order to try again. But maybe what we really need is some kind of monitoring system.  How does DMD's GC decide when to run, anyway?  Maybe something similar could be implemented for Windows system resources.

Besides, it doesn't make people rely on the GC.  Users of the library are free to use scope objects or manually delete them if they want.
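
The retry-after-collection workaround described above can be sketched as follows. This is a Python analogy, not SDWF's code: HandlePool, Brush and create_brush() are hypothetical names, the pool of two handles stands in for a scarce system resource, and each Brush references itself so that only the cycle collector can reclaim it. Automatic collection is disabled in demo() to mimic a program that is under no memory pressure, so the leaked handles are only released when the failed creation forces a collection.

```python
import gc


class HandleLimitError(Exception):
    pass


class HandlePool:
    """Hypothetical scarce resource, e.g. a fixed number of system handles."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.limit:
            raise HandleLimitError("no handles left")
        self.in_use += 1

    def release(self):
        self.in_use -= 1


class Brush:
    def __init__(self, pool):
        self.pool = pool
        self.held = False
        pool.acquire()      # raises HandleLimitError when the pool is exhausted
        self.held = True
        self.cycle = self   # self-reference: only the cycle collector frees it

    def __del__(self):
        # Finalizer: give the handle back if this object actually holds one.
        if self.held:
            self.held = False
            self.pool.release()


def create_brush(pool):
    """Try once; on failure force a collection, then try again."""
    try:
        return Brush(pool)
    except HandleLimitError:
        gc.collect()        # runs finalizers of unreachable objects
        return Brush(pool)


def demo():
    was_enabled = gc.isenabled()
    gc.disable()            # mimic "no memory pressure, so no collection runs"
    try:
        pool = HandlePool(limit=2)
        Brush(pool)         # dropped immediately, but its handle stays taken
        Brush(pool)
        before = pool.in_use
        brush = create_brush(pool)
        return before, pool.in_use
    finally:
        if was_enabled:
            gc.enable()
```

Here both handles are held by garbage until the third creation fails and triggers the collection. The sketch also shows the limit of the approach: the retry only helps if the handles are held by objects that are already unreachable.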


-- 

April 19, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858





------- Comment #4 from llucax@gmail.com  2009-04-19 18:41 -------
(In reply to comment #1)
> > To fix this, several paths can be taken:
> > 1) Guarantee finalization, at least at program end
> > 2) Remove finalizers completely from the collection (leaving them for use only
> > with deterministic destruction, scope, delete, etc.)
> 
> This would destroy a significant portion of GC's usefulness and break various GUI libraries.

I guess you're talking about the 2nd option.

> > The specs should be changed to say something like:
> > 
> >     The garbage collector is guaranteed to run the destructor for all
> >     unreferenced objects, at least at program exit. At program exit, all
> >     destructors are called, for referenced and unreferenced objects.
> 
> So it could wait until program exit before running _any_ destructors?  Where would it keep the objects it collects in the meantime in order that it can run the destructors on exit?

You are right, that leaves the GC implementor the option to defer finalization for all objects until the program exits.

> Perhaps better:
> 
>     The garbage collector runs the destructor for all unreferenced objects
>     before freeing their memory.

At program exit the memory usually doesn't need to be freed, so this suggests that memory should always be freed, which I don't think is a good idea. It also implies that destructors are only called for unreferenced objects, and I think that at program exit, the destructor for all live objects should be called (referenced or not).

That's why I think something like this could be more accurate:

    The GC runs the destructor of an object as soon as it detects the object
    is not used any more. This usually happens when: the delete operator is
    used, when collecting unreferenced memory or at program exit. At program
    exit, all destructors are called, for referenced and unreferenced
    objects. All objects are guaranteed to get their destructor called.

> followed by either the second sentence of your proposed rewrite or this:
> 
>     However, destructors are not guaranteed to be run on program exit, but
>     the programmer can force them to be run by calling gc_term() immediately
>     before termination of the program.

Why do you think it is a good idea not to guarantee that destructors are called at program exit?


-- 

April 20, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858





------- Comment #5 from smjg@iname.com  2009-04-19 19:10 -------
(In reply to comment #4)
>> Perhaps better:
>> 
>> The garbage collector runs the destructor for all unreferenced objects before freeing their memory.
> 
> At program exit the memory usually doesn't need to be freed, so this suggests that memory should always be freed, which I don't think is a good idea.

Not quite, because some objects may never become unreferenced.  Moreover, I realise now there are two possible interpretations of what I said here:

(a) The GC is guaranteed to destruct and free every object that ever becomes
unreferenced in the program's lifetime
(b) The GC, when it runs, destructs and frees every unreferenced object

The underlying difference is whether the GC runs on program exit or not.

>> followed by either the second sentence of your proposed rewrite or this:
>> 
>> However, destructors are not guaranteed to be run on program exit, but the programmer can force them to be run by calling gc_term() immediately before termination of the program.
> 
> Why do you think it is a good idea not to guarantee that destructors are called at program exit?

It depends on your point of view and whether you have any destructors that do something the OS doesn't do when a program exits anyway.  Moreover, I think this is what Walter intended once upon a time, though that isn't really an argument either way.


-- 

April 20, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858





------- Comment #6 from llucax@gmail.com  2009-04-19 20:02 -------
(In reply to comment #2)
> I agree that making the finalizers more deterministic is a good idea. As discussed in the NG I see two problems:
> 
> 1) the use of finalizers for any non-memory-related resource is dangerous and should be avoided if possible. This is because GC collections happen when memory is constrained, not when the resource is constrained, and so one might unwittingly exhaust it. Thus making people use finalizers more and rely on the GC is not necessarily a good idea.

Agreed, but finalizers exist, so I think they should have the best support possible.

> 2) There is a problem with daemon threads, threads that offer a service or listen for events. It is not always good/easy to stop them, and you risk strange errors. And no, not using the GC in them is not a good option.

The problem is that the whole runtime is shut down. Daemon threads cannot allocate new memory, for example, because the GC is terminated. So providing any kind of guarantee for those threads seems to be almost impossible. I see those threads as very similar to calling C functions: you are going outside D's support, and thus you should take extra care and avoid using D "services" in them. As I said in the NG discussion, for those rare cases where you have daemon threads that are careful enough not to allocate new memory once the GC is terminated but still use GC memory, a function can be provided to tell the GC not to call finalizers for still-referenced objects. But I don't think this should be the default.


-- 

April 20, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858





------- Comment #7 from llucax@gmail.com  2009-04-19 20:07 -------
(In reply to comment #3)
> SDWF tries to get around this by running a collection if creation of a GDI object fails, in case it frees some system resources in order to try again. But maybe what we really need is some kind of monitoring system.  How does DMD's GC decide when to run, anyway?

Collection is triggered by gc_malloc() when no free space can be found.


-- 

April 20, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858





------- Comment #8 from smjg@iname.com  2009-04-20 04:28 -------
(In reply to comment #7)
> (In reply to comment #3)
> > SDWF tries to get around this by running a collection if creation of a GDI object fails, in case it frees some system resources in order to try again. But maybe what we really need is some kind of monitoring system.  How does DMD's GC decide when to run, anyway?
> 
> Collection is triggered by gc_malloc() when no free space can be found.

So it waits until the system runs out of memory before trying to free some? This way, D programs are almost bound to run down system memory, denying it to other programs, sooner or later.  If you're running several programs that rely on this gc_malloc implementation or similar, at a given time one of them is likely to be at or near its peak in memory consumption.  Generally speaking, one program's memory demand cannot trigger another program to collect its garbage.


-- 

April 20, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2858





------- Comment #9 from llucax@gmail.com  2009-04-20 07:25 -------
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #3)
> > > SDWF tries to get around this by running a collection if creation of a GDI object fails, in case it frees some system resources in order to try again. But maybe what we really need is some kind of monitoring system.  How does DMD's GC decide when to run, anyway?
> > 
> > Collection is triggered by gc_malloc() when no free space can be found.
> 
> So it waits until the system runs out of memory before trying to free some? This way, D programs are almost bound to run down system memory, denying it to other programs, sooner or later.  If you're running several programs that rely on this gc_malloc implementation or similar, at a given time one of them is likely to be at or near its peak in memory consumption.  Generally speaking, one program's memory demand cannot trigger another program to collect its garbage.

It doesn't. gc_malloc() uses an internal pool of memory. When no free memory is found in *that* pool, a collection is triggered; if that collection can't free any memory, the GC then asks the OS for more. So nobody can trigger a collection but the program itself.
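
The allocation path described above (pool first, then a collection, then the OS) can be modeled roughly like this. It is a toy in Python, not the actual gc_malloc() code: ToyGC and its fields are hypothetical, every object occupies one fixed-size cell, and reachability is reduced to an explicit set of rooted names.

```python
class ToyGC:
    """Toy model: fixed-size cells, reachability given by an explicit root set."""

    def __init__(self, pool_cells, grow_cells):
        self.capacity = pool_cells    # cells already obtained from the OS
        self.grow_cells = grow_cells  # cells requested per OS call
        self.cells = []               # names of objects occupying cells
        self.roots = set()            # names still referenced by the program
        self.collections = 0
        self.os_requests = 0

    def collect(self):
        """Free every cell whose object is no longer rooted."""
        self.collections += 1
        self.cells = [name for name in self.cells if name in self.roots]

    def malloc(self, name):
        if len(self.cells) >= self.capacity:        # no free cell in the pool
            self.collect()                          # first try to reclaim garbage
            if len(self.cells) >= self.capacity:    # the collection freed nothing
                self.capacity += self.grow_cells    # only now ask the OS
                self.os_requests += 1
        self.cells.append(name)
        self.roots.add(name)

    def drop(self, name):
        """The program drops its last reference to the object."""
        self.roots.discard(name)
```

In this model a collection is triggered by the program's own pool filling up, and the OS is asked for more memory only after a collection fails to free a cell. Another process's memory demand plays no part.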

Anyway, this is a little off-topic. This bug report is not about that =)


--