June 13, 2013: GC dead-locking?
Here is an excerpt from a stack trace I got while profiling with OProfile:

#0 sem_wait () from /lib64/libpthread.so.0
#1 thread_suspendAll () at core/thread.d:2471
#2 gc.gcx.Gcx.fullcollect() (this=...) at gc/gcx.d:2427
#3 gc.gcx.Gcx.bigAlloc() (this=..., size=16401, poolPtr=0x7fc3d4bfe3c8, alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
#4 gc.gcx.GC.mallocNoSync (alloc_size=0x7fc3d4bfe418, bits=10, size=16401, this=...) gc/gcx.d:503
#5 gc.gcx.GC.malloc() (this=..., size=16401, bits=10, alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
#6 gc.gc.gc_qalloc (ba=10, sz=<optimized out>) gc/gc.d:203
#7 gc_qalloc (sz=<optimized out>, ba=10) gc/gc.d:198
#8 _d_newarrayT (ti=..., length=4096) rt/lifetime.d:807
#9 sequencer.algorithm.gzip.HuffmanTree.__T6__ctorTG32hZ.__ctor() (this=..., bitLengths=...) sequencer/algorithm/gzip.d:444

Two more threads are alive, but waiting on a condition variable (i.e. in pthread_cond_wait(), though in my own code, not in druntime code). Is there some obvious way I could have dead-locked the GC? Or is there a bug?

This was compiled with GDC using DMD FE 2.062.

--
Marco
June 13, 2013: Re: GC dead-locking?
Posted in reply to Marco Leise

One more note: I get this consistently during profiling, but not without it.

I don't rule out kernel involvement either, since OProfile is a kernel-based profiler and there could be a quirk in its interaction with semaphores.

--
Marco
June 17, 2013: Re: GC dead-locking?
Posted in reply to Marco Leise

On Jun 13, 2013, at 2:22 AM, Marco Leise <Marco.Leise@gmx.de> wrote:
> Here is an excerpt from a stack trace I got while profiling with OProfile:
>
> #0 sem_wait () from /lib64/libpthread.so.0
> #1 thread_suspendAll () at core/thread.d:2471
> #2 gc.gcx.Gcx.fullcollect() (this=...) at gc/gcx.d:2427
> #3 gc.gcx.Gcx.bigAlloc() (this=..., size=16401, poolPtr=0x7fc3d4bfe3c8, alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
> #4 gc.gcx.GC.mallocNoSync (alloc_size=0x7fc3d4bfe418, bits=10, size=16401, this=...) gc/gcx.d:503
> #5 gc.gcx.GC.malloc() (this=..., size=16401, bits=10, alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
> #6 gc.gc.gc_qalloc (ba=10, sz=<optimized out>) gc/gc.d:203
> #7 gc_qalloc (sz=<optimized out>, ba=10) gc/gc.d:198
> #8 _d_newarrayT (ti=..., length=4096) rt/lifetime.d:807
> #9 sequencer.algorithm.gzip.HuffmanTree.__T6__ctorTG32hZ.__ctor() (this=..., bitLengths=...) sequencer/algorithm/gzip.d:444
>
> Two more threads are alive, but waiting on a condition variable (i.e. in pthread_cond_wait(), though in my own code, not in druntime code). Is there some obvious way I could have dead-locked the GC? Or is there a bug?
I assume you're running on Linux, which uses signals (SIGUSR1, specifically) to suspend threads for a collection. So I imagine what's happening is that your thread is trying to suspend all the other threads so it can collect, and those threads are ignoring the signal for some reason. I would expect pthread_cond_wait to be interrupted if a signal arrives though. Have you overridden the signal handler for SIGUSR1?
June 18, 2013: Re: GC dead-locking?
Posted in reply to Sean Kelly

On Mon, 17 Jun 2013 10:46:19 -0700, Sean Kelly <sean@invisibleduck.org> wrote:

> On Jun 13, 2013, at 2:22 AM, Marco Leise <Marco.Leise@gmx.de> wrote:
>
> > Here is an excerpt from a stack trace I got while profiling with OProfile:
> >
> > #0 sem_wait () from /lib64/libpthread.so.0
> > #1 thread_suspendAll () at core/thread.d:2471
> > #2 gc.gcx.Gcx.fullcollect() (this=...) at gc/gcx.d:2427
> > #3 gc.gcx.Gcx.bigAlloc() (this=..., size=16401, poolPtr=0x7fc3d4bfe3c8, alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
> > #4 gc.gcx.GC.mallocNoSync (alloc_size=0x7fc3d4bfe418, bits=10, size=16401, this=...) gc/gcx.d:503
> > #5 gc.gcx.GC.malloc() (this=..., size=16401, bits=10, alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
> > #6 gc.gc.gc_qalloc (ba=10, sz=<optimized out>) gc/gc.d:203
> > #7 gc_qalloc (sz=<optimized out>, ba=10) gc/gc.d:198
> > #8 _d_newarrayT (ti=..., length=4096) rt/lifetime.d:807
> > #9 sequencer.algorithm.gzip.HuffmanTree.__T6__ctorTG32hZ.__ctor() (this=..., bitLengths=...) sequencer/algorithm/gzip.d:444
> >
> > Two more threads are alive, but waiting on a condition variable (i.e. in pthread_cond_wait(), though in my own code, not in druntime code). Is there some obvious way I could have dead-locked the GC? Or is there a bug?
>
> I assume you're running on Linux, which uses signals (SIGUSR1, specifically) to suspend threads for a collection. So I imagine what's happening is that your thread is trying to suspend all the other threads so it can collect, and those threads are ignoring the signal for some reason. I would expect pthread_cond_wait to be interrupted if a signal arrives though. Have you overridden the signal handler for SIGUSR1?

No, I have not overridden the signal handler. I'm aware that signals make pthread_cond_wait() return early, and I put my waits in a while loop as one would expect; that is all.

--
Marco
June 19, 2013: Re: GC dead-locking?
Posted in reply to Marco Leise

On Jun 18, 2013, at 7:01 AM, Marco Leise <Marco.Leise@gmx.de> wrote:
> Am Mon, 17 Jun 2013 10:46:19 -0700
> schrieb Sean Kelly <sean@invisibleduck.org>:
>
>> On Jun 13, 2013, at 2:22 AM, Marco Leise <Marco.Leise@gmx.de> wrote:
>>
>>> Here is an excerpt from a stack trace I got while profiling with OProfile:
>>>
>>> #0 sem_wait () from /lib64/libpthread.so.0
>>> #1 thread_suspendAll () at core/thread.d:2471
>>> #2 gc.gcx.Gcx.fullcollect() (this=...) at gc/gcx.d:2427
>>> #3 gc.gcx.Gcx.bigAlloc() (this=..., size=16401, poolPtr=0x7fc3d4bfe3c8, alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
>>> #4 gc.gcx.GC.mallocNoSync (alloc_size=0x7fc3d4bfe418, bits=10, size=16401, this=...) gc/gcx.d:503
>>> #5 gc.gcx.GC.malloc() (this=..., size=16401, bits=10, alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
>>> #6 gc.gc.gc_qalloc (ba=10, sz=<optimized out>) gc/gc.d:203
>>> #7 gc_qalloc (sz=<optimized out>, ba=10) gc/gc.d:198
>>> #8 _d_newarrayT (ti=..., length=4096) rt/lifetime.d:807
>>> #9 sequencer.algorithm.gzip.HuffmanTree.__T6__ctorTG32hZ.__ctor() (this=..., bitLengths=...) sequencer/algorithm/gzip.d:444
>>>
>>> Two more threads are alive, but waiting on a condition variable (i.e. in pthread_cond_wait(), though in my own code, not in druntime code). Is there some obvious way I could have dead-locked the GC? Or is there a bug?
>>
>> I assume you're running on Linux, which uses signals (SIGUSR1, specifically) to suspend threads for a collection. So I imagine what's happening is that your thread is trying to suspend all the other threads so it can collect, and those threads are ignoring the signal for some reason. I would expect pthread_cond_wait to be interrupted if a signal arrives though. Have you overridden the signal handler for SIGUSR1?
>
> No, I have not overridden the signal handler. I'm aware of the fact that signals make pthread_cond_wait() return early and put them in a while loop as one would expect, that is all.
Hrm... Can you trap this in a debugger and post the stack traces of all threads? That stack above is a thread waiting for others to say they're suspended so it can collect.
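[Editor's note: the all-thread backtrace Sean asks for can be captured by attaching GDB to the hung process. A typical session looks like this; the commands shown are standard GDB commands, the process id is a placeholder.]

```
(gdb) attach <pid>           # attach to the hung process by its pid
(gdb) thread apply all bt    # print a backtrace for every thread
(gdb) detach                 # let the process continue afterwards
```

This shows where each of the other threads is parked, not just the one stuck in thread_suspendAll().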
July 01, 2013: Re: GC dead-locking?
Posted in reply to Sean Kelly

On Tue, 18 Jun 2013 19:12:06 -0700, Sean Kelly <sean@invisibleduck.org> wrote:

> On Jun 18, 2013, at 7:01 AM, Marco Leise <Marco.Leise@gmx.de> wrote:
>
> > On Mon, 17 Jun 2013 10:46:19 -0700, Sean Kelly <sean@invisibleduck.org> wrote:
> >
> >> On Jun 13, 2013, at 2:22 AM, Marco Leise <Marco.Leise@gmx.de> wrote:
> >>
> >>> Here is an excerpt from a stack trace I got while profiling with OProfile:
> >>>
> >>> #0 sem_wait () from /lib64/libpthread.so.0
> >>> #1 thread_suspendAll () at core/thread.d:2471
> >>> #2 gc.gcx.Gcx.fullcollect() (this=...) at gc/gcx.d:2427
> >>> #3 gc.gcx.Gcx.bigAlloc() (this=..., size=16401, poolPtr=0x7fc3d4bfe3c8, alloc_size=0x7fc3d4bfe418) at gc/gcx.d:2099
> >>> #4 gc.gcx.GC.mallocNoSync (alloc_size=0x7fc3d4bfe418, bits=10, size=16401, this=...) gc/gcx.d:503
> >>> #5 gc.gcx.GC.malloc() (this=..., size=16401, bits=10, alloc_size=0x7fc3d4bfe418) gc/gcx.d:421
> >>> #6 gc.gc.gc_qalloc (ba=10, sz=<optimized out>) gc/gc.d:203
> >>> #7 gc_qalloc (sz=<optimized out>, ba=10) gc/gc.d:198
> >>> #8 _d_newarrayT (ti=..., length=4096) rt/lifetime.d:807
> >>> #9 sequencer.algorithm.gzip.HuffmanTree.__T6__ctorTG32hZ.__ctor() (this=..., bitLengths=...) sequencer/algorithm/gzip.d:444
> >>>
> >>> Two more threads are alive, but waiting on a condition variable (i.e. in pthread_cond_wait(), though in my own code, not in druntime code). Is there some obvious way I could have dead-locked the GC? Or is there a bug?
> >>
> >> I assume you're running on Linux, which uses signals (SIGUSR1, specifically) to suspend threads for a collection. So I imagine what's happening is that your thread is trying to suspend all the other threads so it can collect, and those threads are ignoring the signal for some reason. I would expect pthread_cond_wait to be interrupted if a signal arrives though. Have you overridden the signal handler for SIGUSR1?
> >
> > No, I have not overridden the signal handler. I'm aware that signals make pthread_cond_wait() return early, and I put my waits in a while loop as one would expect; that is all.
>
> Hrm... Can you trap this in a debugger and post the stack traces of all threads? That stack above is a thread waiting for others to say they're suspended so it can collect.

I could do that (with a little work setting the scenario up again), but it won't help. As I said, the other two threads were paused in pthread_cond_wait() in my own code. There was nothing special about their stack trace.

--
Marco
Copyright © 1999-2021 by the D Language Foundation