thread and gc.fullCollect
January 12, 2007
I don't get this.

If I run this program, count is printed about 50 times
and the program takes 5 seconds to complete, as expected.

But if I uncomment line 19, count is printed only twice and the program exits almost immediately...

Why is that?

Ant

private import std.gc;
private import std.thread;
private import std.c.time;
private import std.stdio;

class Foo
{
	int count;
	this()
	{
		Thread ft = new Thread(&foo);

		ft.start();
		std.gc.fullCollect();	// ok
		
		while ( ft.getState() != Thread.TS.TERMINATED )
		{
// line 19		std.gc.fullCollect();
			writefln("Count = %s ", count++);
			std.c.time.usleep(100000);
		}
	}
	int foo()
	{
		std.c.time.sleep(5);
		return true;
	}
}

void main()
{
	new Foo();
}
January 12, 2007
Ant wrote:
> I don't get this.
> 
> If I run this program, count is printed about 50 times
> and the program takes 5 seconds to complete, as expected.
> 
> But if I uncomment line 19, count is printed only twice and the program exits almost immediately...
> 
> Why is that?

I'd guess that a segfault is occurring.  This doesn't seem like intended behavior.


Sean
January 12, 2007
Sean Kelly wrote:
> Ant wrote:
>> I don't get this.
>>
>> If I run this program, count is printed about 50 times
>> and the program takes 5 seconds to complete, as expected.
>>
>> But if I uncomment line 19, count is printed only twice and the program exits almost immediately...
>>
>> Why is that?
> 
> I'd guess that a segfault is occurring.  This doesn't seem like intended behavior.
> 
> 
> Sean

Here is gdb catching a SIGUSR1. Does it make sense?

Ant

$ gdb ./tt
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".

(gdb) r
Starting program: /tt
[Thread debugging using libthread_db enabled]
[New Thread -1210354000 (LWP 10783)]
[New Thread -1211405408 (LWP 10786)]

Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread -1211405408 (LWP 10786)]
0xb7e85508 in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb)
January 12, 2007
Ant wrote:
complete run on gdb:

$ gdb ./tt
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".

(gdb) r
Starting program: /tt
[Thread debugging using libthread_db enabled]
[New Thread -1210247504 (LWP 11551)]
[New Thread -1211298912 (LWP 11554)]

Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread -1211298912 (LWP 11554)]
0xb7e9f508 in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) c
Continuing.

Program received signal SIGUSR2, User defined signal 2.
0xffffe410 in __kernel_vsyscall ()
(gdb) c
Continuing.

Program received signal SIGUSR1, User defined signal 1.
0xb7e9f508 in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) c
Continuing.

Program received signal SIGUSR2, User defined signal 2.
0xffffe410 in __kernel_vsyscall ()
(gdb) c
Continuing.
Count = 0
Count = 1

Program received signal SIGUSR1, User defined signal 1.
0xffffe410 in __kernel_vsyscall ()
(gdb) c
Continuing.

Program received signal SIGUSR2, User defined signal 2.
0xffffe410 in __kernel_vsyscall ()
(gdb) c
Continuing.
[Thread -1211298912 (LWP 11554) exited]

Program exited normally.
(gdb)
January 12, 2007
You need to configure gdb so that it does not stop at SIGUSR1/2. These signals are used by the GC to pause the program.

At the gdb prompt, use these commands:
handle SIGUSR1 nostop noprint
handle SIGUSR2 nostop noprint


then run the program.
January 12, 2007
Frank Benoit (keinfarbton) wrote:
> You need to configure gdb so that it does not stop at SIGUSR1/2.
> These signals are used by the GC to pause the program.
> 
> At the gdb prompt, use these commands:
> handle SIGUSR1 nostop noprint
> handle SIGUSR2 nostop noprint
> 
> 
> then run the program.

ah...

so the program just prints count twice and exits normally.

is this a bug?

Ant


(gdb) handle SIGUSR1 nostop noprint
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb) handle SIGUSR2 nostop noprint
Signal        Stop      Print   Pass to program Description
SIGUSR2       No        No      Yes             User defined signal 2
(gdb) r
Starting program: /tt
[Thread debugging using libthread_db enabled]
[New Thread -1210354000 (LWP 12324)]
[New Thread -1211405408 (LWP 12325)]
Count = 0
Count = 1
[Thread -1211405408 (LWP 12325) exited]

Program exited normally.
(gdb)
January 12, 2007
> Program exited normally.

Now you know there was no segmentation fault. Fine :)

Hm, perhaps your class was collected while running?
January 12, 2007
I don't know any way to check this, but won't the garbage collection interrupt the sleeping thread? I would expect the std.c.time.sleep() call to be interrupted when the garbage collector stops all threads to operate. Is there any way to check whether the sleep call was interrupted?

Frank Benoit (keinfarbton) wrote:
>> Program exited normally.
> 
> Now you know there was no segmentation fault. Fine :)
> 
> Hm, perhaps your class was collected while running?
January 12, 2007
Ant wrote:
> Ant wrote:
> complete run on gdb:
> 
> $ gdb ./tt
> GNU gdb 6.4.90-debian
> Copyright (C) 2006 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i486-linux-gnu"...Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
> 
> (gdb) r
> Starting program: /tt
> [Thread debugging using libthread_db enabled]
> [New Thread -1210247504 (LWP 11551)]
> [New Thread -1211298912 (LWP 11554)]
> 
> Program received signal SIGUSR1, User defined signal 1.
> [Switching to Thread -1211298912 (LWP 11554)]
> 0xb7e9f508 in clone () from /lib/tls/i686/cmov/libc.so.6
> (gdb) c
> Continuing.
> 
> Program received signal SIGUSR2, User defined signal 2.
> 0xffffe410 in __kernel_vsyscall ()
> (gdb) c
> Continuing.
> 
> Program received signal SIGUSR1, User defined signal 1.
> 0xb7e9f508 in clone () from /lib/tls/i686/cmov/libc.so.6
> (gdb) c
> Continuing.
> 
> Program received signal SIGUSR2, User defined signal 2.
> 0xffffe410 in __kernel_vsyscall ()
> (gdb) c
> Continuing.
> Count = 0
> Count = 1
> 
> Program received signal SIGUSR1, User defined signal 1.
> 0xffffe410 in __kernel_vsyscall ()
> (gdb) c
> Continuing.
> 
> Program received signal SIGUSR2, User defined signal 2.
> 0xffffe410 in __kernel_vsyscall ()
> (gdb) c
> Continuing.
> [Thread -1211298912 (LWP 11554) exited]
> 
> Program exited normally.
> (gdb)

I know what's happening.  The sleep function is being interrupted by SIGUSR1, which is sent to indicate the beginning of a collection.  Then when the thread is resumed via SIGUSR2 it simply exits, since sleep was canceled.  The only real alternative on Unix would be to get the system time before sleeping, compare it against the time after sleeping, and resume sleeping if the thread hasn't slept long enough.
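
For illustration, here is a minimal sketch of that compare-the-clock approach. It is written in C rather than D, on the assumption that std.c.time.sleep is a thin binding to the C library's sleep(). The names sleep_at_least and seconds_since are made up for this example, and the use of clock_gettime with CLOCK_MONOTONIC is one possible way to "get the system time", not something taken from the thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <time.h>
#include <unistd.h>

/* Seconds elapsed on the monotonic clock since `start`. */
static double seconds_since(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - start->tv_sec)
         + (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}

/* Hypothetical helper: keep calling sleep() until at least `seconds`
 * have really passed. Returns the total time spent, in seconds. */
double sleep_at_least(unsigned int seconds)
{
    struct timespec start;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        double remaining = (double)seconds - seconds_since(&start);
        if (remaining <= 0.0)
            break;
        /* sleep() only takes whole seconds, so round the rest up. */
        unsigned int whole = (unsigned int)remaining;
        if ((double)whole < remaining)
            whole++;
        sleep(whole);   /* may come back early if a signal handler ran */
    }
    return seconds_since(&start);
}
```

Because the remainder is rounded up to whole seconds, a sleep that is interrupted can overshoot the requested time by up to a second. It should never come back short.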


Sean
January 12, 2007
Bradley Smith wrote:
> I don't know any way to check this, but won't the garbage collection interrupt the sleeping thread? I would expect the std.c.time.sleep() call to be interrupted when the garbage collector stops all threads to operate. Is there any way to check whether the sleep call was interrupted?

Yes.  And it's actually easier than I thought.  Here's the pertinent quote from the POSIX spec:

"If sleep() returns because the requested time has elapsed, the value returned shall be 0. If sleep() returns due to delivery of a signal, the return value shall be the "unslept" amount (the requested time minus the time actually slept) in seconds."

I don't know how I missed this the last time I read the spec--this is an issue I was aware of and wasn't sure how to solve efficiently.  I'm going to fix it in Tango right now :-p
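
A rough sketch of how that return value could be used, again in C on the assumption that std.c.time.sleep simply forwards to the C library's sleep(). This is not the Tango fix. sleep_fully, install_noop_handler and noop_handler are hypothetical names, and the do-nothing handler only stands in for whatever handler the runtime installs for SIGUSR1.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction() */
#endif
#include <signal.h>
#include <unistd.h>

/* Stand-in for a runtime's signal handler (assumption: it just returns). */
static void noop_handler(int signo)
{
    (void)signo;
}

/* Install the stand-in handler so that `signo` interrupts sleep()
 * instead of terminating the process. Returns 0 on success. */
int install_noop_handler(int signo)
{
    struct sigaction sa;
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Sleep for `seconds`, going back to sleep for the "unslept" amount
 * each time a signal cuts the call short. Returns how many times
 * sleep() came back early. */
unsigned int sleep_fully(unsigned int seconds)
{
    unsigned int interruptions = 0;
    unsigned int left = seconds;

    /* sleep() returns 0 once the requested time has elapsed. */
    while ((left = sleep(left)) != 0)
        interruptions++;
    return interruptions;
}
```

In the program from the first post, the thread body would call something like sleep_fully(5) instead of a bare sleep(5). The unslept amount is reported in whole seconds, so the total can still differ from the request by a fraction of a second.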


Sean