September 13, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
I've had a look at the core dumps from the intermittently failing std.parallelism test.
The issue is that daemon threads are still running while the GC is unmapping memory.
Usually this goes unnoticed because the parallelism threads wait on a work queue condition.
Sometimes a daemon thread wakes up from its GC suspend handler after memory was already
freed. This issue is already mentioned in a comment at gc_term.

    Thread obj = Thread.getThis();

    ...
    suspend
    ...

    if( obj && !obj.m_lock ) // <- segfault

I think we should bluntly kill daemon threads after thread_joinAll.

martin
|
September 13, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | Thanks for looking into this. I had been ignoring this because I thought it was related to issue 6014 (http://d.puremagic.com/issues/show_bug.cgi?id=6014).
I'm a little confused about what the root cause is. How can memory that the daemon thread still has access to be getting freed? Is this a bug in std.parallelism or in druntime?
|
September 14, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | It's an issue with the runtime shutdown. It ultimately unmaps all memory. Something simpler like this will always segfault for me.

    import std.parallelism, std.stdio;

    void printChar(dchar c) {
        write(c);
    }

    void main() {
        // Queue one task per character on the global pool, then return from main right away.
        foreach(c; "hello world\n")
            taskPool.put(task!printChar(c));
    }

I can't think of any benefit in letting daemon threads continue up to program termination.

On Tue, 13 Sep 2011 21:37:26 +0200, David Simcha <dsimcha at gmail.com> wrote:
> Thanks for looking into this. I had been ignoring this because I thought it
> was related to 6014 (http://d.puremagic.com/issues/show_bug.cgi?id=6014).

Which is another bug/oversight in the runtime shutdown. I've stumbled over this when sketching out the allocators. Will clarify this with a reduced test case.
|
September 13, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to Martin Nowak | On Sep 13, 2011, at 11:59 AM, Martin Nowak wrote:
> I think we should bluntly kill daemon threads after thread_joinAll.
One issue with this is that if these threads held any locks accessed during later parts of the cleanup (the GC lock, module-level locks accessed in static dtors, etc), the app could hang instead of shutting down cleanly. If the threads are forcibly terminated then it would really have to be after these cleanup steps occurred, but by then Bad Things could already be happening because module dtors have been run and the GC is terminated.
Since daemon threads are an explicit choice made by the user, I hope that they have also considered how to notify them to terminate cleanly (as one does in C/C++ where daemon threads are the default). The best thing is really to build this into the relevant module dtor. There's also Runtime.isHalting if someone can present a case to un-deprecate it.
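A minimal sketch of building the shutdown notification into a module destructor, as suggested above. All names here (stopRequested, worker, workerLoop, startWorker, stopWorker) are hypothetical, and a plain atomic flag with a short sleep stands in for whatever work queue a real daemon thread would wait on. It also assumes module destructors run before gc_term during runtime shutdown:

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;
import core.time : msecs;

// Hypothetical names; not part of druntime or Phobos.
shared bool stopRequested;   // set to ask the worker to leave its loop
__gshared Thread worker;     // the daemon thread, or null when none is running

void workerLoop()
{
    // Check the flag between units of work so a stop request is noticed promptly.
    while (!atomicLoad(stopRequested))
    {
        // ... one unit of background work would go here ...
        Thread.sleep(5.msecs);
    }
}

void startWorker()
{
    atomicStore(stopRequested, false);
    worker = new Thread(&workerLoop);
    worker.isDaemon = true;  // thread_joinAll will not wait for this thread
    worker.start();
}

void stopWorker()
{
    if (worker is null)
        return;              // already stopped, or never started
    atomicStore(stopRequested, true);
    worker.join();           // block until the loop has actually exited
    worker = null;
}

// Assumption: this runs during runtime shutdown before the GC is terminated,
// so the worker is gone by the time memory is unmapped.
shared static ~this()
{
    stopWorker();
}
```

Because stopWorker is a no-op when no worker is running, calling it explicitly before exit and again from the module destructor is harmless.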
|
September 14, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | On 9/14/2011 1:04 AM, Sean Kelly wrote:
> Since daemon threads are an explicit choice made by the user, I hope that they have also considered how to notify them to terminate cleanly (as one does in C/C++ where daemon threads are the default). The best thing is really to build this into the relevant module dtor. There's also Runtime.isHalting if someone can present a case to un-deprecate it.
Is the finalizer guaranteed to be called on all GC-allocated class instances before program termination, or is a regular collection cycle just run?
|
September 14, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | A regular collection is run.
|
September 14, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | OK, in that case, do you have any suggestions for how to terminate daemon threads cleanly? I had no idea this was an issue. The only thing I can think of is to keep a shared registry of all TaskPool objects and send them stop signals on module destruction. However, having to keep such a registry would be kind of annoying. One thing that I definitely don't want is to punt the problem to the user of std.parallelism, because it's the kind of low-level thing that the module is supposed to abstract away.
|
September 14, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | How about this: Could we send all daemon threads hardware exceptions after joinAll()? In the vast majority of cases, locks will be scoped, either through synchronized blocks or simple scope(exit) mutex.unlock() type statements. If they're not then they should be. (If any still aren't in std.parallelism then I'll fix this. I originally made a few non-scoped around code that couldn't throw, but this was silly and I think I changed all of them.) This way daemon threads terminate immediately, locks get released if the code's well-written, and if you **really** need to do some cleanup, you can catch the exception.
|
September 14, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | On Sep 14, 2011, at 7:37 AM, David Simcha wrote:
> Ok, in that case, do you have any suggestions for how to terminate daemon threads cleanly? I had no idea this was an issue. The only thing I can think of is to keep a shared registry of all TaskPool objects and send them stop signals on module destruction. However, having to keep such a registry would be kind of annoying. One thing that I definitely don't want is to punt the problem to the user of std.parallelism, because it's the kind of low-level thing that the module is supposed to abstract away.
First, I should mention that the thread ownership rules implemented in std.concurrency will pass an OwnerTerminated message to spawned threads when module dtors are run. So if you're using message passing the daemon threads should exit cleanly with an OwnerTerminated exception provided they call receive in a timely fashion.
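As a rough illustration of the message-passing case, a spawned thread that loops on receive needs little more than a handler for the exception. In this sketch, doubler and doubleViaWorker are hypothetical names: the worker answers each int with twice its value and returns once std.concurrency reports that its owner has terminated.

```d
import std.concurrency : OwnerTerminated, ownerTid, receive, receiveOnly, send, spawn;

// Hypothetical worker: doubles every int it receives and sends the result
// back to the thread that spawned it.
void doubler()
{
    try
    {
        for (;;)
        {
            // Blocks until a message arrives. With no handler registered for
            // OwnerTerminated, receive throws it once the owner has exited.
            receive((int n) { ownerTid.send(n * 2); });
        }
    }
    catch (OwnerTerminated)
    {
        // The owner is gone: leave the loop and let the thread end normally.
    }
}

// Hypothetical usage: spawn the worker, send one job, wait for the answer.
int doubleViaWorker(int n)
{
    auto tid = spawn(&doubler);
    tid.send(n);
    return receiveOnly!int();
}
```

The worker never checks a shutdown flag itself; the only requirement, as noted above, is that it gets back to receive in a timely fashion.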
Otherwise, cleanup is just like when using threads in C/C++, and there are a variety of approaches. The general idea though is to either send a message to each thread or set a global flag and then block while waiting for the daemon threads to terminate. I'll usually have a timeout on this, so if a daemon thread doesn't terminate in a timely manner I'll just let the app exit, thereby forcibly terminating the thread (as you're seeing now). The timeout is just a failsafe so if a thread is hung for some reason the app doesn't wait indefinitely for it to terminate.
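The timeout failsafe described above could look roughly like this. Thread.join in druntime takes no timeout argument, so this sketch polls Thread.isRunning against a deadline instead; joinWithTimeout is a hypothetical helper, and the polling interval is an arbitrary choice.

```d
import core.thread : Thread;
import core.time : Duration, MonoTime, msecs;

// Hypothetical helper: waits up to `timeout` for `t` to finish.
// Returns true if the thread ended in time, false if the caller gave up.
bool joinWithTimeout(Thread t, Duration timeout)
{
    const deadline = MonoTime.currTime + timeout;
    while (t.isRunning)
    {
        if (MonoTime.currTime >= deadline)
            return false;      // thread looks hung; let the app exit anyway
        Thread.sleep(1.msecs); // arbitrary polling interval
    }
    t.join();                  // thread has ended, so this returns promptly
    return true;
}
```

A shutdown routine would first send the stop message or set the global flag, then call this helper for each daemon thread and simply carry on if it returns false.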
|
September 14, 2011 [phobos] parallelism segfaults | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | On Sep 14, 2011, at 10:03 AM, David Simcha wrote:
> How about this: Could we send all daemon threads hardware exceptions after joinAll()? In the vast majority of cases, locks will be scoped, either through synchronized blocks or simple scope(exit) mutex.unlock() type statements. If they're not then they should be. (If any still aren't in std.parallelism then I'll fix this. I originally made a few non-scoped around code that couldn't throw, but this was silly and I think I changed all of them.) This way daemon threads terminate immediately, locks get released if the code's well-written, and if you **really** need to do some cleanup, you can catch the exception.
How would we do this? Signals don't cause an exception to be thrown (because it's technically illegal to throw from a signal handler). Is there some other way we could send a hardware exception to a thread that would cause it to terminate cleanly?
|
Copyright © 1999-2021 by the D Language Foundation