View mode: basic / threaded / horizontal-split · Log in · Help
April 20, 2009
Re: D2 weak references
== Quote from Jason House (jason.james.house@gmail.com)'s article
> Jarrett Billingsley Wrote:
> > On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean@invisibleduck.org> wrote:
> >
> > > The horrible thing about all this is it makes using a multithreaded
> > > WeakRef horribly slow.  Even if the GC facility were provided for
> > > what you want to do you're talking about acquiring a mutex in
> > > the GC for every deref operation.  What a mess.
> >
> > Then take a hint from D2 and make them thread-local by default ;)  But
> > provide a SharedWeakRef for the cases where you need it.
> I'm pretty sure the GC can bypass thread locality for mark and sweep, so thread local won't help.
Anything that stops the sweep from occurring in parallel with a thread using the data bypasses this
issue.

Yeah.  The unfortunate tradeoff is that finalizers are run while all these
app threads are suspended, and if these finalizers enter a synchronized
block they're likely to deadlock.  This happened enough with the Phobos
GC that I was motivated to make the change in the first place.  It's
unfortunate, but at least WeakRef is the only victim of the current
approach that I'm aware of.
April 20, 2009
Re: D2 weak references
On 2009-04-20 18:31:28 -0400, Sean Kelly <sean@invisibleduck.org> said:

> Yeah.  The unfortunate tradeoff is that finalizers are run while all these
> app threads are suspended, and if these finalizers enter a synchronized
> block they're likely to deadlock.  This happened enough with the Phobos
> GC that I was motivated to make the change in the first place.  It's
> unfortunate, but at least WeakRef is the only victim of the current
> approach that I'm aware of.

Then can't you notify the weak refs, restart the world, and then 
finalize the objects for real?


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
April 21, 2009
Re: D2 weak references
Sean Kelly wrote:
> == Quote from Jason House (jason.james.house@gmail.com)'s article
>> Sean Kelly Wrote:
>>> You're guaranteed that the GC will notify you when it finalizes
>>> the referenced object, so you shouldn't have to query the GC
>>> at all.  Just copy the reference into a GC-scannable location
>>> and then if you haven't been notified by the GC that the object
>>> was finalized the reference is good.
>> Maybe I misunderstood, but I interpreted earlier discussion in this thread to mean the gc did the
> following steps:
>> 1. Pause all threads
>> 2. Scan objects to clear marks
>> 3. Scan objects to add marks
>> 4. Resume all threads
>> 5. Scan objects, if no mark then...
>>     5a. Call object finalizer
>>     5b. Free memory for later use
>> Assuming that is correct, any deref of a weak reference is unsafe between steps 4 and 5a. Even
> though the object may be valid in that window, there's no guarantee that the reference won't be used
> after step 5b... This would lead to random behavior and seg faults. No self-respecting weak reference
> library should allow that to happen.
>> I hope I got something wrong and weak refs are easier to implement than I'm currently thinking.
> 
> Oh I see what you're saying.  So this could happen:
> 
> 4. The GC resume all threads
> 4a. Another thread derefs the WeakRef
> 4b. This thread checks to see if the GC has notified that the
>        object is being collected.  No notification, so it uses the object.
> 5. The GC scans its memory and finalizes the object
> 6. The other thread tries to use the object and explodes
> 
> You're right, it sounds like you'd need to add a step between 4a
> and 4b where the WeakRef notifies the GC that the object it's using
> shouldn't be collected (if a collection is in progress).  In fact, all you
> really need to do is acquire the GCs mutex for a moment.  Once
> you've acquired it then you know that any collection in progress
> must be complete (assuming of course that the GC uses a mutex
> to prevent corruption).
> 
> The horrible thing about all this is it makes using a multithreaded
> WeakRef horribly slow.  Even if the GC facility were provided for
> what you want to do you're talking about acquiring a mutex in
> the GC for every deref operation.  What a mess.

What if you added this:

3.5. Set GC.collectionInProgress to true.
4. The GC resumes all threads.
4a. Another thread derefs the WeakRef
4b. Before the WeakRef checks to see if the reference has been cleared,
it spinlocks on GC.collectionInProgress.
5. The GC scans its memory and finalises the object.
5a. The GC clears GC.collectionInProgress.
6. The other thread unblocks, sees the WeakRef has been cleared and
fails gracefully.

 -- Daniel
April 21, 2009
Re: D2 weak references
Michel Fortin, el 20 de abril a las 18:53 me escribiste:
> On 2009-04-20 18:31:28 -0400, Sean Kelly <sean@invisibleduck.org> said:
> 
> >Yeah.  The unfortunate tradeoff is that finalizers are run while all these
> >app threads are suspended, and if these finalizers enter a synchronized
> >block they're likely to deadlock.  This happened enough with the Phobos
> >GC that I was motivated to make the change in the first place.  It's
> >unfortunate, but at least WeakRef is the only victim of the current
> >approach that I'm aware of.
> 
> Then can't you notify the weak refs, restart the world, and then
> finalize the objects for real?

I've not followed this thread very closely but I think that would imply
that the sweep phase had to run with all threads suspended, which is a bad
idea unless you *really* enjoy pauses =)

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Karma police
arrest this man,
he talks in maths,
he buzzes like a fridge,
he's like a detuned radio.
April 22, 2009
Re: D2 weak references
Sean Kelly wrote:

> == Quote from Jason House (jason.james.house@gmail.com)'s article
>> Jarrett Billingsley Wrote:
>> > On Mon, Apr 20, 2009 at 5:03 PM, Sean Kelly <sean@invisibleduck.org>
>> > wrote:
>> >
>> > > The horrible thing about all this is it makes using a multithreaded
>> > > WeakRef horribly slow. �Even if the GC facility were provided for
>> > > what you want to do you're talking about acquiring a mutex in
>> > > the GC for every deref operation. �What a mess.
>> >
>> > Then take a hint from D2 and make them thread-local by default ;)  But
>> > provide a SharedWeakRef for the cases where you need it.
>> I'm pretty sure the GC can bypass thread locality for mark and sweep, so
>> thread local won't help.
> Anything that stops the sweep from occurring in parallel with a thread
> using the data bypasses this issue.
> 
> Yeah.  The unfortunate tradeoff is that finalizers are run while all these
> app threads are suspended, and if these finalizers enter a synchronized
> block they're likely to deadlock.  This happened enough with the Phobos
> GC that I was motivated to make the change in the first place.  It's
> unfortunate, but at least WeakRef is the only victim of the current
> approach that I'm aware of.

Is it possible for druntime to expose an option as to whether or not the 
sweep phase / finalizers should run while all threads are suspended?

It'd really suck to implement some kind of mark and sweep garbage collection 
just to compensate for how the standard mark and sweep garbage collector is 
implemented.

Another alternative I came up with is notification when the GC clears or sets 
a mark on an object.  That would allow the weak ref to know that its 
information is suspect.  That approach would only required that the mark 
phase is done with all threads stopped.

If you like either of these ideas, I could attempt a patch to add the 
functionality.
April 22, 2009
Re: D2 weak references
> Is it possible for druntime to expose an option as to whether or not the 
> sweep phase / finalizers should run while all threads are suspended?

But then you can only do boring stuff in the finalizer. Calling any C or 
library function must be avoided, because it can lead to deadlocks. At 
the same time, your code is called at arbitrary times, and anything you 
do will cause race conditions by default. It's like a UNIX signal 
handler: programming in it sucks so much.

Maybe the finalizer register function could take two callbacks: one 
called before threads are resumed, and one after threads are resumed. 
You could use the first callback to zero out the weak reference, while 
the second takes care of advanced functionality, like notifying someone 
that the reference has been collected.
April 22, 2009
Re: D2 weak references
grauzone Wrote:

> > Is it possible for druntime to expose an option as to whether or not the 
> > sweep phase / finalizers should run while all threads are suspended?
> 
> But then you can only do boring stuff in the finalizer. Calling any C or 
> library function must be avoided, because it can lead to deadlocks. At 
> the same time, your code is called at arbitrary times, and anything you 
> do will cause race conditions by default. It's like a UNIX signal 
> handler: programming in it sucks so much.

I've never had this kind of issue with finalizers, so I have limited appreciation for the argument. In the application I'm working on, I have no need for any functionality in this area except weak reference support.

> Maybe the finalizer register function could take two callbacks: one 
> called before threads are resumed, and one after threads are resumed. 
> You could use the first callback to zero out the weak reference, while 
> the second takes care of advanced functionality, like notifying someone 
> that the reference has been collected.

Wouldn't that implicitly require two sweep phases? In my last post, I suggested the possibiluty of a callback in the mark phase. At the moment, I'm thinking one callback at the start of collection and another callback when marking a weakly referenced object would do the trick. Any weakly referenced object not marked after the last collection is invalid
April 22, 2009
Re: D2 weak references
Jason House, el 21 de abril a las 21:11 me escribiste:
> Another alternative I came up with is notification when the GC clears or sets 
> a mark on an object.  That would allow the weak ref to know that its 
> information is suspect.  That approach would only required that the mark 
> phase is done with all threads stopped.

I think instead of a notification mechanism access to the mark bit can be
provided.

The mark phase is ran with all thread paused, so you don't need any
notifications in that phase because nobody can ask the weak ref object for
the underlaying pointer then.

When threads are resumed, the mark phase is complete, so the invariant of
all live objects having the mark bit set and all the garbage having the
bit unset should hold. If the GC guarantee that the mark bit is set
*always* when a object is live (the mark bit clearing should be done with
all threads stopped and that should be it), except only when the mark
phase is running, then, a weak ref implementation can look at the mark bit
to safely see if the reference is garbage or not (even before the sweep
phase find it and call the finalizer).

I think this approach is both simple and efficient, it doesn't have races
at all (unless I'm missing something, of course). The only problem is it's
too tied to the current GC implementation. But I think any weak ref
implementation will, so it looks like weak ref really belong to the GC.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
no longer afraid of the dark or midday shadows
nothing so ridiculously teenage and desperate,
nothing so childish - at a better pace,
slower and more calculated,
no chance of escape,
April 22, 2009
Re: D2 weak references
Leandro Lucarella Wrote:

> Jason House, el 21 de abril a las 21:11 me escribiste:
> > Another alternative I came up with is notification when the GC clears or sets 
> > a mark on an object.  That would allow the weak ref to know that its 
> > information is suspect.  That approach would only required that the mark 
> > phase is done with all threads stopped.
> 
> I think instead of a notification mechanism access to the mark bit can be
> provided.
> 
> The mark phase is ran with all thread paused, so you don't need any
> notifications in that phase because nobody can ask the weak ref object for
> the underlaying pointer then.
> 
> When threads are resumed, the mark phase is complete, so the invariant of
> all live objects having the mark bit set and all the garbage having the
> bit unset should hold. If the GC guarantee that the mark bit is set
> *always* when a object is live (the mark bit clearing should be done with
> all threads stopped and that should be it), except only when the mark
> phase is running, then, a weak ref implementation can look at the mark bit
> to safely see if the reference is garbage or not (even before the sweep
> phase find it and call the finalizer).
> 
> I think this approach is both simple and efficient, it doesn't have races
> at all (unless I'm missing something, of course). The only problem is it's
> too tied to the current GC implementation. But I think any weak ref
> implementation will, so it looks like weak ref really belong to the GC.

Unfortunately, that is not efficient. The state used by the GC is in flux during the sweep phase, so there is no easy lockless way to access that data. Having to acquire a lock on every weakref dereference absolutely kills performance.

I'm definitely open to hearing more ideas on how to efficiently implement.
April 22, 2009
Re: D2 weak references
Jason House, el 22 de abril a las 11:32 me escribiste:
> Leandro Lucarella Wrote:
> 
> > Jason House, el 21 de abril a las 21:11 me escribiste:
> > > Another alternative I came up with is notification when the GC clears or sets 
> > > a mark on an object.  That would allow the weak ref to know that its 
> > > information is suspect.  That approach would only required that the mark 
> > > phase is done with all threads stopped.
> > 
> > I think instead of a notification mechanism access to the mark bit can be
> > provided.
> > 
> > The mark phase is ran with all thread paused, so you don't need any
> > notifications in that phase because nobody can ask the weak ref object for
> > the underlaying pointer then.
> > 
> > When threads are resumed, the mark phase is complete, so the invariant of
> > all live objects having the mark bit set and all the garbage having the
> > bit unset should hold. If the GC guarantee that the mark bit is set
> > *always* when a object is live (the mark bit clearing should be done with
> > all threads stopped and that should be it), except only when the mark
> > phase is running, then, a weak ref implementation can look at the mark bit
> > to safely see if the reference is garbage or not (even before the sweep
> > phase find it and call the finalizer).
> > 
> > I think this approach is both simple and efficient, it doesn't have races
> > at all (unless I'm missing something, of course). The only problem is it's
> > too tied to the current GC implementation. But I think any weak ref
> > implementation will, so it looks like weak ref really belong to the GC.
> 
> Unfortunately, that is not efficient. The state used by the GC is in
> flux during the sweep phase, so there is no easy lockless way to access
> that data. Having to acquire a lock on every weakref dereference
> absolutely kills performance.

The sweep phase don't touch mark bits. I can't possible understand why
you should synchronize access to a (temporarily) read-only variable.

Mark bits are only written when:
1) Clearing all the mark bits before marking (this is done with all
  threads running now, and this is what it should be changed to be done
  when all threads are suspended)
2) Marking (threads are suspended in the current implementation)

> I'm definitely open to hearing more ideas on how to efficiently
> implement.

The only performance penalty of my proposal would be incrementing the time
the world is stooped a bit (adding the marks cleaning phase there).
I think is a fair price to pay (being that mark bits are implemented using
bitmaps, it should be pretty fast resetting all bits to 0) for an easy and
safe weak references implementations.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
ELLA FUE INFIEL, PERO EX POLOLO PAGÓ
	-- TV CHILE
1 2 3 4 5
Top | Discussion index | About this forum | D home