April 20, 2009
== Quote from Jason House (jason.james.house@gmail.com)'s article
> Jason House Wrote:
> >
> > If I understand you correctly, a weak reference library needs to query the GC to see if an object is
marked for collection. How do I do this?
> Here's the implementation I'm currently thinking of:
> weak ref construction:
> Copy pointer into struct without casting
> Call rt_attach_dispose_event
> weak ref dereference:
> copy value into temp var
> ** query gc to see if garbage **
> If not garbage, Recopy value and return
> If garbage set value = null
> dispose of weakly referenced obj:
> set value = null
> The emphasized line is impossible according to the publicized gc interface. Given the delayed
finalization, I don't see any way around it.

You're guaranteed that the GC will notify you when it finalizes the referenced object, so you shouldn't have to query the GC at all.  Just copy the reference into a GC-scannable location and then if you haven't been notified by the GC that the object was finalized the reference is good.
April 20, 2009
Sean Kelly Wrote:

> == Quote from Jason House (jason.james.house@gmail.com)'s article
> > Jason House Wrote:
> > >
> > > If I understand you correctly, a weak reference library needs to query the GC to see if an object is
> marked for collection. How do I do this?
> > Here's the implementation I'm currently thinking of:
> > weak ref construction:
> > Copy pointer into struct without casting
> > Call rt_attach_dispose_event
> > weak ref dereference:
> > copy value into temp var
> > ** query gc to see if garbage **
> > If not garbage, Recopy value and return
> > If garbage set value = null
> > dispose of weakly referenced obj:
> > set value = null
> > The emphasized line is impossible according to the publicized gc interface. Given the delayed
> finalization, I don't see any way around it.
> 
> You're guaranteed that the GC will notify you when it finalizes the referenced object, so you shouldn't have to query the GC at all.  Just copy the reference into a GC-scannable location and then if you haven't been notified by the GC that the object was finalized the reference is good.

Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the following steps:
1. Pause all threads
2. Scan objects to clear marks
3. Scan objects to add marks
4. Resume all threads
5. Scan objects, if no mark then...
    5a. Call object finalizer
    5b. Free memory for later use

Assuming that is correct, any deref of a weak reference is unsafe between steps 4 and 5a. Even though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen.

I hope I got something wrong and weak refs are easier to implement than I'm currently thinking.
April 20, 2009
== Quote from Jason House (jason.james.house@gmail.com)'s article
> Sean Kelly Wrote:
> >
> > You're guaranteed that the GC will notify you when it finalizes the referenced object, so you shouldn't have to query the GC at all.  Just copy the reference into a GC-scannable location and then if you haven't been notified by the GC that the object was finalized the reference is good.
> Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the
following steps:
> 1. Pause all threads
> 2. Scan objects to clear marks
> 3. Scan objects to add marks
> 4. Resume all threads
> 5. Scan objects, if no mark then...
>     5a. Call object finalizer
>     5b. Free memory for later use
> Assuming that is correct, any deref of a weak reference is unsafe between steps 4 and 5a. Even
though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen.
> I hope I got something wrong and weak refs are easier to implement than I'm currently thinking.

Oh I see what you're saying.  So this could happen:

4. The GC resume all threads
4a. Another thread derefs the WeakRef
4b. This thread checks to see if the GC has notified that the
       object is being collected.  No notification, so it uses the object.
5. The GC scans its memory and finalizes the object
6. The other thread tries to use the object and explodes

You're right, it sounds like you'd need to add a step between 4a
and 4b where the WeakRef notifies the GC that the object it's using
shouldn't be collected (if a collection is in progress).  In fact, all you
really need to do is acquire the GCs mutex for a moment.  Once
you've acquired it then you know that any collection in progress
must be complete (assuming of course that the GC uses a mutex
to prevent corruption).

The horrible thing about all this is it makes using a multithreaded
WeakRef horribly slow.  Even if the GC facility were provided for
what you want to do you're talking about acquiring a mutex in
the GC for every deref operation.  What a mess.
April 20, 2009
On 2009-04-20 22:19:42 +0200, Jason House <jason.james.house@gmail.com> said:

> Sean Kelly Wrote:
> 
>> == Quote from Jason House (jason.james.house@gmail.com)'s article
>>> Jason House Wrote:
>>>> 
>>>> If I understand you correctly, a weak reference library needs to query the GC to see if an object is
>> marked for collection. How do I do this?
>>> Here's the implementation I'm currently thinking of:
>>> weak ref construction:
>>> Copy pointer into struct without casting
>>> Call rt_attach_dispose_event
>>> weak ref dereference:
>>> copy value into temp var
>>> ** query gc to see if garbage **
>>> If not garbage, Recopy value and return
>>> If garbage set value = null
>>> dispose of weakly referenced obj:
>>> set value = null
>>> The emphasized line is impossible according to the publicized gc interface. Given the delayed
>> finalization, I don't see any way around it.
>> 
>> You're guaranteed that the GC will notify you when it finalizes
>> the referenced object, so you shouldn't have to query the GC
>> at all.  Just copy the reference into a GC-scannable location
>> and then if you haven't been notified by the GC that the object
>> was finalized the reference is good.
> 
> Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the following steps:
> 1. Pause all threads
> 2. Scan objects to clear marks
> 3. Scan objects to add marks
> 4. Resume all threads
> 5. Scan objects, if no mark then...
>     5a. Call object finalizer
>     5b. Free memory for later use
> 
> Assuming that is correct, any deref of a weak reference is unsafe between steps 4 and 5a. Even though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen.
> 
> I hope I got something wrong and weak refs are easier to implement than I'm currently thinking.

I was thinking to add weak refs to tango core, and I also saw this race condition.
The only clean way to fix it is to notify while threads are stopped, all the rest is really a cludge.

Fawzi

April 20, 2009
On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean@invisibleduck.org> wrote:

> The horrible thing about all this is it makes using a multithreaded
> WeakRef horribly slow.  Even if the GC facility were provided for
> what you want to do you're talking about acquiring a mutex in
> the GC for every deref operation.  What a mess.

Then take a hint from D2 and make them thread-local by default ;)  But provide a SharedWeakRef for the cases where you need it.
April 20, 2009
== Quote from Jarrett Billingsley (jarrett.billingsley@gmail.com)'s article
> On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean@invisibleduck.org> wrote:
> > The horrible thing about all this is it makes using a multithreaded
> > WeakRef horribly slow.  Even if the GC facility were provided for
> > what you want to do you're talking about acquiring a mutex in
> > the GC for every deref operation.  What a mess.
> Then take a hint from D2 and make them thread-local by default ;)  But provide a SharedWeakRef for the cases where you need it.

Right.  These problems all go away if the object is not shared.
April 20, 2009
Sean Kelly Wrote:

> == Quote from Jason House (jason.james.house@gmail.com)'s article
> > Sean Kelly Wrote:
> > >
> > > You're guaranteed that the GC will notify you when it finalizes the referenced object, so you shouldn't have to query the GC at all.  Just copy the reference into a GC-scannable location and then if you haven't been notified by the GC that the object was finalized the reference is good.
> > Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the
> following steps:
> > 1. Pause all threads
> > 2. Scan objects to clear marks
> > 3. Scan objects to add marks
> > 4. Resume all threads
> > 5. Scan objects, if no mark then...
> >     5a. Call object finalizer
> >     5b. Free memory for later use
> > Assuming that is correct, any deref of a weak reference is unsafe between steps 4 and 5a. Even
> though the object may be valid in that window, there's no guarantee that the reference won't be used after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference library should allow that to happen.
> > I hope I got something wrong and weak refs are easier to implement than I'm currently thinking.
> 
> Oh I see what you're saying.  So this could happen:
> 
> 4. The GC resume all threads
> 4a. Another thread derefs the WeakRef
> 4b. This thread checks to see if the GC has notified that the
>        object is being collected.  No notification, so it uses the object.
> 5. The GC scans its memory and finalizes the object
> 6. The other thread tries to use the object and explodes
> 
> You're right, it sounds like you'd need to add a step between 4a
> and 4b where the WeakRef notifies the GC that the object it's using
> shouldn't be collected (if a collection is in progress).  In fact, all you
> really need to do is acquire the GCs mutex for a moment.  Once
> you've acquired it then you know that any collection in progress
> must be complete (assuming of course that the GC uses a mutex
> to prevent corruption).
> 
> The horrible thing about all this is it makes using a multithreaded
> WeakRef horribly slow.  Even if the GC facility were provided for
> what you want to do you're talking about acquiring a mutex in
> the GC for every deref operation.  What a mess.

I was hoping for a lock-free query that would return garbage (without crashing) in the rare case that a race occurred. If I get garbage (and says the object is marked) then the finalizer has run when I recheck the weak ref.  Do all druntime queries acquire the GC lock? I'm hoping to do about 100_000+ of these operations per second per core (and do other work too).

As long as memory blocks are not released to the OS while all threads are unpaused, lock-free operations should be possible.

As a side note, does druntime use multiple threads for the garbage collector? Scalability with Tango and a memory intensive application was horrible last time I tried renting an 8 core system. It may have been my fault, but the garbage collector is a convenient scape goat.
April 20, 2009
Jarrett Billingsley Wrote:

> On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean@invisibleduck.org> wrote:
> 
> > The horrible thing about all this is it makes using a multithreaded
> > WeakRef horribly slow.  Even if the GC facility were provided for
> > what you want to do you're talking about acquiring a mutex in
> > the GC for every deref operation.  What a mess.
> 
> Then take a hint from D2 and make them thread-local by default ;)  But provide a SharedWeakRef for the cases where you need it.

I'm pretty sure the GC can bypass thread locality for mark and sweep, so thread local won't help.  Anything that stops the sweep from occurring in parallel with a thread using the data bypasses this issue.
April 20, 2009
Jason House, el 20 de abril a las 17:45 me escribiste:
> As a side note, does druntime use multiple threads for the garbage collector?

No, but I'm planning to add concurrency support to the GC. D1 and Posix OSs, though. Porting to D2 should be fairly easy because I'll use Tango's GC as base, which is almost the same as Druntime AFAIK.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
careful to all animals (never washing spiders down the plughole),
keep in contact with old friends (enjoy a drink now and then),
will frequently check credit at (moral) bank (hole in the wall),
April 20, 2009
== Quote from Jason House (jason.james.house@gmail.com)'s article
> Sean Kelly Wrote:
> >
> > The horrible thing about all this is it makes using a multithreaded
> > WeakRef horribly slow.  Even if the GC facility were provided for
> > what you want to do you're talking about acquiring a mutex in
> > the GC for every deref operation.  What a mess.
> I was hoping for a lock-free query that would return garbage (without crashing) in the rare case that
a race occurred. If I get garbage (and says the object is marked) then the finalizer has run when I recheck the weak ref.  Do all druntime queries acquire the GC lock? I'm hoping to do about 100_000+ of these operations per second per core (and do other work too).

They all acquire the lock.  And even with per-thread GCs, any operation against the shared GC will acquire its lock as well. I'm not terribly inclined to try and make a lock-free GC to avoid this for even simple queries.

> As long as memory blocks are not released to the OS while all threads are unpaused, lock-free
operations should be possible.

Blocks may be released to the OS at this time, but currently they will only be for pools that are completely empty.  In practice this generally only happens for a pool allocated specifically for a single big allocation.  As for lock-free GC stuff in general, there's too much state information involved to make me terribly inclined to give it a shot without some serious motivation.

> As a side note, does druntime use multiple threads for the garbage collector? Scalability with Tango
and a memory intensive application was horrible last time I tried renting an 8 core system. It may have been my fault, but the garbage collector is a convenient scape goat.

Currently, no.  The Druntime GC is nearly identical to the Tango GC but for a few tweaks here and there.  I'm eventually going to make a GC with per-thread heaps as an alternative, but I haven't found the time yet.

The current GC is a convenient scapegoat, but it's also a valid one if the threads in this app churn through memory continuously. If nothing else the threads are almost definitely convoying on the GC mutex, not to mention the cost of "stop the world" collections. You might want to look into judicious use of GC.reserve() to help with this some.  In apps that grow memory rather than just churn through it, you can see a pretty solid benefit from GC.reserve().