Thread overview
A thread-safe weak reference implementation
Jul 09, 2013
Robert
Jul 10, 2013
David
Jul 10, 2013
Kagamin
Jul 10, 2013
Andrej Mitrovic
March 14, 2012
Hi,

This is something I hacked together today. It seems to work exactly as intended, and has no threading issues.

import core.atomic,
       core.memory;

private alias void delegate(Object) DEvent;
private extern (C) void rt_attachDisposeEvent(Object h, DEvent e);
private extern (C) void rt_detachDisposeEvent(Object h, DEvent e);

final class Weak(T : Object)
{
    // Note: This class uses a clever trick which works fine for
    // a conservative GC that was never intended to do
    // compaction/copying in the first place. However, if compaction is
    // ever added to D's GC, this class will break horribly. If D ever
    // gets such a GC, we should push strongly for built-in weak
    // references.

    private size_t _object;
    private size_t _ptr;
    private hash_t _hash;

    this(T object)
    in
    {
        assert(object);
    }
    body
    {
        auto ptr = cast(size_t)cast(void*)object;

        // We use atomics because not all architectures may guarantee
        // atomic store and load of these values.
        atomicStore(*cast(shared)&_object, ptr);

        // Only assigned once, so no atomics.
        _ptr = ptr;
        _hash = typeid(T).getHash(&object);

        rt_attachDisposeEvent(object, &unhook);
        GC.setAttr(cast(void*)this, GC.BlkAttr.NO_SCAN);
    }

    @property T object()
    {
        auto obj = cast(T)cast(void*)atomicLoad(*cast(shared)&_object);

        // We've moved obj into the GC-scanned stack space, so it's now
        // safe to ask the GC whether the object is still alive. Note
        // that even if the cast and assignment of the obj local
        // doesn't put the object on the stack, this call will. So,
        // either way, this is safe.
        if (GC.addrOf(cast(void*)obj))
            return obj;

        return null;
    }

    private void unhook(Object object)
    {
        rt_detachDisposeEvent(object, &unhook);

        // This assignment is important. If we don't null _object when
        // it is collected, the check in object could return false
        // positives where the GC has reused the memory for a new
        // object.
        atomicStore(*cast(shared)&_object, cast(size_t)0);
    }

    override equals_t opEquals(Object o)
    {
        if (this is o)
            return true;

        if (auto weak = cast(Weak!T)o)
            return _ptr == weak._ptr;

        return false;
    }

    override int opCmp(Object o)
    {
        if (auto weak = cast(Weak!T)o)
            return _ptr > weak._ptr;

        return 1;
    }

    override hash_t toHash()
    {
        auto obj = object;

        return obj ? typeid(T).getHash(&obj) : _hash;
    }

    override string toString()
    {
        auto obj = object;

        return obj ? obj.toString() : toString();
    }
}

Things worth noting:

* I've chosen to make it a class for simplicity. I don't believe that the effort to make a struct correctly is worth it, when using weak references already implies GC anyway.
* The implementation relies on atomicLoad/atomicStore to use actual lock-prefixed instructions on x86 (and the equivalent on other architectures) rather than e.g. a spinlock.
* It obviously doesn't work for a compacting GC. Ideally, a compacting GC will have full-blown support for weak (and maybe soft) references so we don't have to do hacks like this one. But, until then, this is IMHO a necessary addition to the standard library.
* The class does not support custom overloads of opEquals() or opCmp() on the stored object; it always uses referential equality. It also stores its hash at construction time. This is necessary because a Weak(T) cannot sensibly be used as e.g. an AA key otherwise, because invariants cannot be maintained when you need to call methods on the contained object (which may be collected).
* Strictly speaking, the atomics aren't necessary on x86 since everything is size_t, and loads/stores of these are guaranteed to be atomic. However, I opted to be on the safe side, and also use atomics to stay compatible with !x86 architectures.
* I know, I haven't annotated it with any type or method qualifiers at all. I'm more focused on the implementation right now.
* Hacks, hacks, hacks. I know it isn't pretty, but there's no good way to do it in a thread-safe fashion with a conservative GC other than this.

Comments, suggestions, reviews are all *very* welcome.

-- 
- Alex
July 09, 2013
>         if (GC.addrOf(cast(void*)obj))
>             return obj;


Smart :-) You are waiting for the collection to complete, if we are one of the threads started before destructor calls happen. Brilliant.

Is it ok if I shamelessly copy parts of your implementation for the new std.signals? I will of course mention your name.



Best regards,
Robert
July 10, 2013
What race condition do you have in the constructor?

Declare _object field shared and cast it less.
July 10, 2013
Am 09.07.2013 23:35, schrieb Robert:
> 
>>         if (GC.addrOf(cast(void*)obj))
>>             return obj;
> 
> 
> Smart :-) You are waiting for the collection to complete, if we are one of the threads started before destructor calls happen. Brilliant.
> 
> Is it ok if I shamelessly copy parts of your implementation for the new std.signals? I will of course mention your name.

My first thought when I saw this title on the NG, this could work for std.signals! Thanks for reading it ;)

July 10, 2013
On 3/14/12, Alex Rønne Petersen <xtzgzorex@gmail.com> wrote:
>          auto ptr = cast(size_t)cast(void*)object;

I think this should be:

auto ptr = cast(size_t)*cast(void**)&object;

The object might have an opCast method, and this is one way to avoid calling it by accident.