Thread overview
[D-runtime] Mixing GC and non-GC in D. (AKA, "don't touch GC-references from DTOR, preferably don't use DTOR at all")
Dec 12, 2010
Ulrik Mikaelsson
Dec 14, 2010
Vladimir Panteleev
Jan 02, 2011
Ulrik Mikaelsson
December 12, 2010
Hi,

DISCLAIMER: I'm developing for D1/Tango. It is possible these issues are already resolved for druntime/D2. If so, I've failed to find any information about it; please do tell.

Recently, I've been trying to optimize my application by moving some resource allocation (file descriptors, for one) from GC to reference-counted allocation. I've hit some problems.

Problem
=======

Basically, the core of all my problems is something expressed in
http://bartoszmilewski.wordpress.com/2009/08/19/the-anatomy-of-reference-counting/
as "An object's destructor must not access any garbage-collected
objects embedded in it.".

This becomes a real problem for various allocation schemes, be it hierarchical allocation, reference counting, or a bunch of other custom resource schemes. _It renders the D destructor mostly useless for anything but mere C-binding cleanup._

Consequence
===========
For the reference-counted example, the only working solution is to
have the counted object malloced instead of GC-allocated. One could
argue that "correct" programs with reference counting should do the
memory management completely explicitly anyway, and yes, that's largely
true. The struct destructor of D2 makes the C++ "smart pointer" construct
possible, making refcount use mostly natural and automatic anyway.
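
A minimal sketch of the malloc-backed, struct-destructor approach described above, written for D2. The names `Payload` and `Counted` are hypothetical (they are not druntime or Tango types), and the `fd` field is only a stand-in for a real resource:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical payload: lives in malloc'd memory, outside the GC heap.
struct Payload
{
    int fd;        // stand-in for a file descriptor
    size_t refs;   // reference count
}

// Hypothetical smart-pointer-style handle.
struct Counted
{
    private Payload* p;

    this(int fd)
    {
        p = cast(Payload*) malloc(Payload.sizeof);
        assert(p !is null, "malloc failed");
        p.fd = fd;
        p.refs = 1;
    }

    // Postblit: runs on every copy, so each copy owns one reference.
    this(this)
    {
        if (p !is null) ++p.refs;
    }

    // Struct destructor: runs deterministically when the handle goes out of scope.
    ~this()
    {
        if (p is null) return;
        if (--p.refs == 0)
        {
            // A real handle would release the resource here, e.g. close the fd.
            free(p);
        }
        p = null;
    }

    size_t count() const { return p is null ? 0 : p.refs; }
}

void main()
{
    auto a = Counted(3);
    assert(a.count == 1);
    {
        auto b = a;            // postblit bumps the count
        assert(a.count == 2);
    }                          // b's destructor drops it again
    assert(a.count == 1);
}
```

Note that `Payload` holds no GC references at all, which is exactly the restriction discussed next.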

However, it also means that the refcounted object itself can never use GC-allocated structures, such as almost ANYTHING from the stdlib! In effect, as soon as you leave the GC behind, you leave over half of all the useful things in D behind.

This is a real bummer. What first attracted me to D, and what I believe is still one of its key strengths, is the possibility of hybrid GC/other memory schemes. It allows the developer to write something quick-and-dirty, and then improve it in the places where it's actually needed, such as for open files, GUI context handles, or other expensive/limited resources.

As another indication that this really is a problem: in Tango, this has led to the introduction of an additional destructor-type method, "dispose", which AFAICT does what the destructor should have done, but is only invoked for deterministic destruction by "delete" or scope guards. IMO, that can only lead to a world of pain and misunderstandings, having two different "destructors" run depending on WHY the object was destroyed.

Proposed Solution
=================
Back to the core problem: "An object's destructor must not access any
garbage-collected objects embedded in it.".

As far as I can tell (but I'm no GC expert), this is a direct effect
of the current implementation of the GC, more specifically the loop
starting at http://www.dsource.org/projects/druntime/browser/trunk/src/gc/gcx.d#L2492.
In this loop, all non-marked objects get their finalizers run, and
immediately afterwards they get freed. If I understand the code right,
this freeing is what's actually causing the problems: if the order in
the loop doesn't match the order of references in the freed objects
(which it can't, especially for circular references), the loop might
free a memory area before calling the next finalizer, which then
attempts to use the just-freed reference.

Wouldn't it instead be possible to split this loop into two separate loops: the first calling all finalizers while all objects are still intact, and the second actually freeing the objects? AFAICT, this would resolve the entire problem, allowing for much more mixed use of allocation strategies as well as making the destructor much more useful.
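
To make the difference concrete, here is a toy simulation of the two schemes in D2. It is only a sketch of the idea and does not reflect the real data structures in gcx.d: `Dead`, `finalize`, `sweepOnePass`, `sweepTwoPass` and `makeCycle` are all hypothetical names, and "freeing" is modelled by setting a flag rather than releasing memory.

```d
// Toy model of an unreachable object; NOT druntime's real data structures.
struct Dead
{
    bool freed;          // true once its memory has been "released"
    Dead* other;         // reference to another unreachable object
    bool sawFreedPeer;   // set if the finalizer touched a freed object
}

// Hypothetical finalizer: it looks at the object it references.
void finalize(Dead* d)
{
    if (d.other !is null && d.other.freed)
        d.sawFreedPeer = true;
}

// Scheme as described above: finalize and free in the same loop.
void sweepOnePass(Dead*[] garbage)
{
    foreach (d; garbage)
    {
        finalize(d);
        d.freed = true;
    }
}

// Proposed scheme: run every finalizer first, then free everything.
void sweepTwoPass(Dead*[] garbage)
{
    foreach (d; garbage) finalize(d);
    foreach (d; garbage) d.freed = true;
}

// Two unreachable objects referencing each other.
Dead*[] makeCycle()
{
    auto a = new Dead;
    auto b = new Dead;
    a.other = b;
    b.other = a;
    return [a, b];
}

void main()
{
    auto g1 = makeCycle();
    sweepOnePass(g1);
    // The second object is finalized after the first was freed.
    assert(!g1[0].sawFreedPeer && g1[1].sawFreedPeer);

    auto g2 = makeCycle();
    sweepTwoPass(g2);
    // With two passes, no finalizer ever observes a freed peer.
    assert(!g2[0].sawFreedPeer && !g2[1].sawFreedPeer);
}
```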

Ideas, opinions? Perhaps this has been discussed before?

Regards
/ Ulrik
December 15, 2010
On Sun, 12 Dec 2010 20:49:22 +0200, Ulrik Mikaelsson <ulrik.mikaelsson at gmail.com> wrote:

> Ideas, opinions? Perhaps this has been discussed before?

I see no obvious flaws for this proposal (except, obviously, relying on accessing destroyed objects). Perhaps you should repost this in the digitalmars.D newsgroup for higher visibility?

Either way, here's an alternate suggestion: add the objects in unmanaged memory to the list of GC roots. If you use a bulk allocator, you can make the allocator add the whole pool.
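
A rough sketch of what that could look like with D2's `core.memory`, assuming `GC.addRange` is the appropriate call for registering a malloc'd block (the D1/Tango equivalent may differ). `Node`, `createNode` and `destroyNode` are hypothetical names used only for illustration:

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical payload kept outside the GC heap.
struct Node
{
    char[] buffer;   // GC-allocated data referenced from malloc'd memory
}

Node* createNode(size_t n)
{
    auto node = cast(Node*) malloc(Node.sizeof);
    assert(node !is null, "malloc failed");
    *node = Node.init;
    // Ask the GC to scan this block, so buffer stays alive while node does.
    GC.addRange(node, Node.sizeof);
    node.buffer = new char[n];
    return node;
}

void destroyNode(Node* node)
{
    // Unregister before freeing; buffer becomes collectable afterwards.
    GC.removeRange(node);
    free(node);
}

void main()
{
    auto node = createNode(16);
    GC.collect();            // buffer is still reachable through the range
    node.buffer[] = 'x';
    assert(node.buffer.length == 16 && node.buffer[15] == 'x');
    destroyNode(node);
}
```

With the block registered, the code that tears down `node` can still use `buffer` right up to the `removeRange` call, because the GC treats the malloc'd block as a source of live references.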

-- 
Best regards,
  Vladimir                            mailto:vladimir at thecybershadow.net

January 02, 2011
> Either way, here's an alternate suggestion: add the objects in unmanaged memory to the list of GC roots. If you use a bulk allocator, you can make the allocator add the whole pool.
FYI: I don't have access to a build environment for D2/druntime, but I've created an implementation for Tango at http://www.dsource.org/projects/tango/ticket/2024. It might be of interest to port it to druntime, with the mentioned D2 improvements to SmartRef.