rt_attachDisposeEvent, monitors, delegate storage, garbage collection, and lifetime (or: you can flee in horror now)

Hi,

So, I had an interesting debugging session today.

I've been trying to implement a Weak(T) type which stores a reference to an object that is invisible to the GC, so that the object remains collectable. This is simply the plain old weak reference: http://en.wikipedia.org/wiki/Weak_reference

Now, the code:

import core.atomic, core.memory;

alias void delegate(Object) FinalizeCallback;

extern (C) void rt_attachDisposeEvent(Object h, FinalizeCallback e);
extern (C) void rt_detachDisposeEvent(Object h, FinalizeCallback e);

final class Weak(T : Object)
{
    // Note: This class uses a clever trick which works fine for a conservative
    // GC that was never intended to do compaction/copying in the first place.
    // However, if compaction is ever added to D's GC, this class will break
    // horribly. If D ever gets such a GC, we should push strongly for built-in
    // weak references.

    private size_t _object;
    private size_t _ptr;
    private hash_t _hash;

    invariant()
    {
        assert(_ptr);
    }

    this(T object)
    in
    {
        assert(object);
    }
    body
    {
        auto ptr = cast(size_t)cast(void*)object;

        // We use atomics because not all architectures may guarantee atomic
        // store and load of these values.
        atomicStore(*cast(shared)&_object, ptr);

        // Only assigned once, so no atomics.
        _ptr = ptr;
        _hash = typeid(T).getHash(&object);

        FinalizeCallback dg;

        dg = (Object o)
        {
            // This assignment is important. If we don't null _object when it
            // is collected, the check in the object property could return
            // false positives where the GC has reused the memory for a new
            // object.
            atomicStore(*cast(shared)&_object, cast(size_t)0);
        };

        // This call does more than it may seem at first. Since the second
        // parameter is a delegate, that means it has a context. In this
        // particular case, the this reference becomes the context. Now, since
        // the delegate is attached to the underlying object we're referring
        // to, that means that as long as that object is alive, so are we. In
        // other words, we will always outlive it. Note that this invariant
        // doesn't actually hold during runtime shutdown (see the note in the
        // delegate above).
        rt_attachDisposeEvent(object, dg);

        // This ensures that the GC does not see the reference to the object
        // that we have embedded inside the this reference.
        GC.setAttr(cast(void*)this, GC.BlkAttr.NO_SCAN);
    }

    @property T object()
    {
        auto obj = cast(T)cast(void*)atomicLoad(*cast(shared)&_object);

        // We've moved it into the GC-scanned stack space, so it's now safe to
        // ask the GC whether the object is still alive. Note that even if the
        // cast and assignment of the obj local doesn't put the object on the
        // stack, this call will. So, either way, this is safe.
        if (GC.addrOf(cast(void*)obj))
            return obj;

        return null;
    }

    // ... opEquals, toHash, etc ...
}

Now, there are a lot of subtleties about this code, but finalization is what's most important here. See the comment above the rt_attachDisposeEvent call. At first glance, the logic seems sound: since the dispose delegate's context is the 'this' object, surely 'this' is reachable through the delegate and is therefore GC-tracked! Indeed, the first part is true, but the second is not. The reason is quite simple: the delegate registered with rt_attachDisposeEvent is stored in memory allocated through good old libc, which the GC does not scan. This means that the context of the delegate is effectively unreachable to the GC, thus *rendering the Weak(T) object unreachable* unless something else holds a reference to it!

It turns out that, in druntime, monitors are allocated from the native heap, and not through the GC. This makes sense, for the most part, because they aren't going to have anything in them that needs GC tracking. Or so the assumption might have been originally. This no longer holds true, now that we can store full-blown delegates into the monitor's devt array.
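To make the storage problem concrete, here is a minimal sketch. It is not druntime's monitor code: allocNativeSlot and Holder are hypothetical names, and the single malloc'd slot only stands in for the devt array. It shows a delegate whose context lives on the GC heap while the slot holding the delegate does not, and it uses GC.addRange as one way to make such a block scanned (whether that is appropriate inside druntime itself is an open question).

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

alias void delegate(Object) FinalizeCallback;

// Hypothetical stand-in for the monitor's callback storage: a single
// delegate slot taken from the C heap, which the GC does not scan.
FinalizeCallback* allocNativeSlot()
{
    auto slot = cast(FinalizeCallback*)malloc(FinalizeCallback.sizeof);
    assert(slot !is null, "malloc failed");
    *slot = null;
    return slot;
}

// Hypothetical class whose instance becomes the delegate's context,
// in the same way the Weak(T) instance does in the code above.
final class Holder
{
    int disposed;
    void onDispose(Object o) { ++disposed; }
}

void main()
{
    auto slot = allocNativeSlot();
    scope (exit) free(slot);

    auto holder = new Holder;
    *slot = &holder.onDispose;

    // The delegate's context is a GC-allocated object ...
    assert(GC.addrOf((*slot).ptr) !is null);

    // ... but the slot that stores the delegate is outside the GC heap,
    // so this copy of the context pointer does not keep holder alive.
    assert(GC.addrOf(slot) is null);

    // Registering the block as a range makes the GC scan it, so the
    // delegate stored there now counts as a reference to holder.
    GC.addRange(slot, FinalizeCallback.sizeof);
    scope (exit) GC.removeRange(slot);
}
```

In main the local holder variable also keeps the object alive, so the sketch only demonstrates where the pointers live, not an actual premature collection.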

Now you might ask: Why don't I just add a finalizer to Weak(T) which unregisters the finalization callback for the hidden object, such that this lifetime issue cannot happen? Why, the reason is simple! It could potentially cause a deadlock (but in practice, causes a segmentation fault due to some druntime voodoo).

TL;DR: Delegates registered with rt_attachDisposeEvent *somehow* need to be GC-tracked.

Anyone have any input on how this might be done?

Regards,
Alex
_______________________________________________
D-runtime mailing list
D-runtime@puremagic.com
http://lists.puremagic.com/mailman/listinfo/d-runtime
