Thread overview
SIGUSR1 Makes Program Hang?
Sep 07, 2004
teqDruid
Sep 07, 2004
Ben Hinkle
Sep 07, 2004
antiAlias
Sep 07, 2004
teqDruid
Sep 07, 2004
Ben Hinkle
Sep 07, 2004
teqDruid
Sep 11, 2004
Walter
Sep 07, 2004
Dave
September 07, 2004
I've been having an issue with the XML-RPC stuff I'm doing with Mango. The libraries being used are phobos, mango, and Andy's XML parser, in addition to my XmlRpc code.  I'm using DMD 0.98 on Linux.

Kris and I have been trying to figure out what's going on. (the
discussion is at: http://dsource.org/forums/viewtopic.php?t=338 if
you want to follow along)
We've narrowed it down using some strace output, and we think this is
causing the melt-down:
kill(6228, SIGUSR1) = 0
followed by:
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]--- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
) = -1 EINTR (Interrupted system call)
sigreturn()                             = ? (mask now [RTMIN])
rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
rt_sigsuspend([]+++ killed by SIGKILL +++

The "killed by SIGKILL" is me doing a killall -s KILL, since that's the only way to kill it.  At this point in program execution, a certain number of POST requests have been made to the server, each one sent after the previous reply arrives.  At this point there appear to be three processes.  3228 is the main thread, which launched the mango HTTP server.  It waits for me to hit enter, then dies and takes everything else with it (this is the "correct" behavior).  There's PID 3229, which I think is the listener thread, and then there's 3230, which I think is a worker thread and is where the strace output above comes from.

Once that kill is run by that process (at least we think that's the point it's occurring at), everything seems to hang.  The server won't accept connections, and the main thread won't die when I hit enter.

Here's what I don't know:
What's running the kill?  Why is it being run after a certain number of
requests?  Why does a SIGUSR1 sent to the main process cause everything
to hang?
BTW, the number of requests seems to vary with what code is
running... if I comment out a line or two the number changes, although I
haven't been able to pick out a pattern.  It's been bloody impossible for
me to figure out what code is causing this, because every time I comment
out some code, or put in some debug code, the point at which everything
dies changes.

Does anyone have any suggestions or possible leads?  (Please)

John
September 07, 2004
I just checked and std.thread installs SIGUSR1 and SIGUSR2 handlers for thread pausing and resuming.
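
For readers unfamiliar with the pattern: pausing a thread from a signal handler usually means the handler itself blocks in sigsuspend() until a second signal arrives, which would be consistent with the rt_sigsuspend calls in the strace output. The following is a minimal C sketch of that general mechanism, not the std.thread source. Every name in it (on_pause, on_resume, pause_resume_demo, and so on) is made up for illustration, and it assumes a POSIX threads implementation with C11 atomics.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

/* Hypothetical names throughout; this is not the Phobos implementation. */
static atomic_int  is_paused;   /* 1 while the worker sits in sigsuspend() */
static atomic_int  resume_req;  /* set by the SIGUSR2 handler */
static atomic_int  stop;        /* tells the worker loop to exit */
static atomic_long ticks;       /* progress counter bumped by the worker */

static void sleep_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* SIGUSR2 handler: only records that a resume was requested. */
static void on_resume(int sig) { (void)sig; atomic_store(&resume_req, 1); }

/* SIGUSR1 handler: parks the receiving thread until SIGUSR2 arrives. */
static void on_pause(int sig) {
    (void)sig;
    sigset_t wait_mask;
    sigfillset(&wait_mask);
    sigdelset(&wait_mask, SIGUSR2);      /* only SIGUSR2 can wake us */
    atomic_store(&resume_req, 0);
    atomic_store(&is_paused, 1);
    while (!atomic_load(&resume_req))
        sigsuspend(&wait_mask);          /* the thread "hangs" here */
    atomic_store(&is_paused, 0);
}

static void *worker(void *arg) {
    (void)arg;
    while (!atomic_load(&stop)) {
        atomic_fetch_add(&ticks, 1);
        sleep_ms(1);
    }
    return NULL;
}

/* Returns 0 if the worker made no progress while paused, 1 if it did,
   and -1 on setup failure. */
int pause_resume_demo(void) {
    struct sigaction sa;
    pthread_t t;

    atomic_store(&is_paused, 0);
    atomic_store(&stop, 0);
    atomic_store(&ticks, 0);

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_resume;
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;

    sa.sa_handler = on_pause;
    sigaddset(&sa.sa_mask, SIGUSR2);     /* hold SIGUSR2 until sigsuspend() */
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    if (pthread_create(&t, NULL, worker, NULL) != 0) return -1;

    pthread_kill(t, SIGUSR1);            /* pause the worker */
    while (!atomic_load(&is_paused)) sleep_ms(1);

    long before = atomic_load(&ticks);
    sleep_ms(50);
    long during = atomic_load(&ticks);   /* should be unchanged */

    pthread_kill(t, SIGUSR2);            /* resume the worker */
    while (atomic_load(&ticks) == during) sleep_ms(1);

    atomic_store(&stop, 1);
    pthread_join(t, NULL);
    return before == during ? 0 : 1;
}
```

If the resume signal never arrives, the paused thread stays inside sigsuspend() indefinitely. That is one way a program using this scheme could appear to hang, though nothing here shows that this is what happens in the Mango case.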
September 07, 2004
Bear in mind I don't know anything about Mango and the rest, but in phobos/std/thread.d SIGUSR1 is trapped to pause a thread.

Maybe one of the processes is doing a pthread_kill(tid,SIGUSR1) to pause the main thread as part of the socket polling process?

Just a thought..

- Dave
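
One hedged way to test a hypothesis like this is to find out who sends the signal. With SA_SIGINFO, a handler receives a siginfo_t whose si_pid field names the sender, and on a thread model where each thread shows up with its own PID (as the original post describes) that could point at the responsible thread. The C sketch below is a diagnostic idea only. The names record_sender, install_sender_probe, and probe_self_signal are hypothetical, and installing such a handler would replace whatever SIGUSR1 handler the runtime had installed, so it is not something to leave in a real program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* si_pid of the most recent SIGUSR1, or -1 if none has been seen. */
static volatile sig_atomic_t last_sender = -1;

/* SA_SIGINFO-style handler: remembers which PID sent the signal. */
static void record_sender(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    last_sender = (sig_atomic_t)info->si_pid;
}

/* Installs the probe on SIGUSR1. Returns 0 on success, -1 on failure.
   Note: this overwrites any SIGUSR1 handler that was already in place. */
int install_sender_probe(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = record_sender;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Sends SIGUSR1 to ourselves and reports the sender PID the handler saw. */
long probe_self_signal(void) {
    if (install_sender_probe() != 0) return -1;
    raise(SIGUSR1);   /* raise() returns only after the handler has run */
    return (long)last_sender;
}
```

Here probe_self_signal() should report the caller's own PID, since the process signals itself. In a real investigation the handler would write the PID somewhere async-signal-safe, such as a pipe or a preallocated buffer, rather than call printf.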


September 07, 2004
"Ben Hinkle" <bhinkle4@juno.com>
I just checked and std.thread installs SIGUSR1 and SIGUSR2 handlers for
thread pausing and resuming.

==================

I have a sneaking suspicion that the GC paused that thread. What do you think, Ben?


September 07, 2004
On Mon, 06 Sep 2004 21:02:15 -0700, antiAlias wrote:

> "Ben Hinkle" <bhinkle4@juno.com>
> I just checked and std.thread installs SIGUSR1 and SIGUSR2 handlers for
> thread pausing and resuming.
> 
> ==================
> 
> I have a sneaking suspicion that the GC paused that thread. What do you think, Ben?

I did a:
std.gc.disable();

at the start of the program.  No effect.  I'm assuming that one doesn't have to call this in each thread.
September 07, 2004
teqDruid wrote:

> On Mon, 06 Sep 2004 21:02:15 -0700, antiAlias wrote:
> 
>> "Ben Hinkle" <bhinkle4@juno.com>
>> I just checked and std.thread installs SIGUSR1 and SIGUSR2 handlers for
>> thread pausing and resuming.
>> 
>> ==================
>> 
>> I have a sneaking suspicion that the GC paused that thread. What do you think, Ben?
> 
> I did a:
> std.gc.disable();
> 
> at the start of the program.  No effect.  I'm assuming that one doesn't have to call this in each thread.

Interestingly enough, I looked at internal/gc/gcx.d, and calling disable() doesn't seem to actually disable the GC. It sets the flag, but the flag is never checked.
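
To make the "flag is set but never checked" situation concrete, here is a small toy model in C. It is not the gcx.d code: the ToyGC struct, its fields, and the toy_* functions are all invented for illustration, and the assumption that a collection is triggered by an allocation threshold is mine. It only shows why setting a disable flag has no effect unless the allocation path consults it.

```c
#include <stddef.h>

/* Toy collector state; all names here are hypothetical. */
typedef struct {
    int    disabled;     /* nesting count maintained by disable/enable */
    size_t allocated;    /* bytes handed out since the last collection */
    size_t threshold;    /* collect once this many bytes are allocated */
    int    collections;  /* how many collection cycles have run */
} ToyGC;

void toy_disable(ToyGC *gc) { gc->disabled++; }
void toy_enable(ToyGC *gc)  { if (gc->disabled > 0) gc->disabled--; }

/* Stand-in for a real collection cycle. */
void toy_collect(ToyGC *gc) {
    gc->collections++;
    gc->allocated = 0;
}

/* Allocation path that ignores the flag: disabling changes nothing. */
void toy_alloc_unchecked(ToyGC *gc, size_t n) {
    gc->allocated += n;
    if (gc->allocated >= gc->threshold)
        toy_collect(gc);
}

/* Allocation path that honours the flag: no collection while disabled. */
void toy_alloc_checked(ToyGC *gc, size_t n) {
    gc->allocated += n;
    if (gc->disabled == 0 && gc->allocated >= gc->threshold)
        toy_collect(gc);
}
```

With the unchecked path, a caller who disables the collector and then allocates past the threshold still triggers a collection; with the checked path, the collection is deferred until the collector is enabled again.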
September 07, 2004
OK... so how does one disable the GC?  Is it even possible, Walter?

John


September 11, 2004
It's just not implemented yet :-( The function is there for future use.

"teqDruid" <me@teqdruid.com> wrote in message news:pan.2004.09.07.19.49.52.444734@teqdruid.com...
> OK... so how does one disable the GC?  Is it even possible, Walter?
>
> John
>
> On Tue, 07 Sep 2004 12:30:34 -0400, Ben Hinkle wrote:
>
> > teqDruid wrote:
> >
> >> On Mon, 06 Sep 2004 21:02:15 -0700, antiAlias wrote:
> >>
> >>> "Ben Hinkle" <bhinkle4@juno.com>
> >>> I just checked and std.thread installs SIGUSR1 and SIGUSR2 handlers for
> >>> thread pausing and resuming.
> >>>
> >>> ==================
> >>>
> >>> I have a sneaking suspicion that the GC paused that thread. What do you think, Ben?
> >>
> >> I did a:
> >> std.gc.disable();
> >>
> >> at the start of the program.  No effect.  I'm assuming that one doesn't have to call this in each thread.
> >
> > interestingly enough I looked at internal/gc/gcx.d and calling disable() doesn't seem to actually disable the gc. It sets the flag but the flag is never checked.
>