December 02, 2012
Need help with debugging Segfault
Greetings

I have code that segfaults when built with the current GitHub dmd snapshot.
It compiles and runs fine with the released versions of DMD. I use a lot of
structs and classes in the code, and I suspect the problem is related to the
other known issues with structs. When I run the compiled binary under
valgrind, the report suggests that the segfault is related to the garbage
collector. I would like to isolate the issue and report it on Bugzilla. Can
somebody help me read the valgrind trace below and point me to where I should
look for the problem? My actual code is thousands of lines long, and I am at
a loss as to how to isolate and report this issue.

Regards
- Puneet

==4453== Invalid read of size 8
==4453==    at 0x44EFF5: _D4nett5mule5Mule3esl6__dtorMFZv (../src/nett/mule.d:115)
==4453==    by 0x471124: _D4nett5mule5Mule9EslDomain11__fieldDtorMFZv (../src/nett/mule.d:7923)
==4453==    by 0x4AD845: rt_finalize2 (in /home/pgoel/mule/examples/test_code)
==4453==    by 0x4ABC92: _D2gc3gcx3Gcx11fullcollectMFZm (in /home/pgoel/mule/examples/test_code)
==4453==    by 0x4A9BBA: _D2gc3gcx2GC18fullCollectNoStackMFZv (in /home/pgoel/mule/examples/test_code)
==4453==    by 0x4A7F2C: gc_term (in /home/pgoel/mule/examples/test_code)
==4453==    by 0x48661B: _D2rt6dmain211_d_run_mainUiPPaPUAAaZiZi6runAllMFZv (in /home/pgoel/mule/examples/test_code)
==4453==    by 0x4860F5: _D2rt6dmain211_d_run_mainUiPPaPUAAaZiZi7tryExecMFMDFZvZv (in /home/pgoel/mule/examples/test_code)
==4453==    by 0x4860B1: _d_run_main (in /home/pgoel/mule/examples/test_code)
==4453==    by 0x485EF2: main (in /home/pgoel/mule/examples/test_code)
==4453==  Address 0x2c0 is not stack'd, malloc'd or (recently) free'd
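
The mangled runtime symbols in a trace like this can be decoded with Phobos' std.demangle, which makes the call chain easier to read. The following is only a minimal sketch: the helper name `pretty` is made up for illustration, and only the two GC symbols from the trace are used. The user-code symbols are left out on purpose, because their `5mule` and `5Mule` length prefixes do not match the four-letter names, so they look hand-edited and may not demangle.

```d
import std.demangle : demangle;
import std.stdio : writeln;

// Hypothetical helper: demangle() returns its argument unchanged when the
// string is not a valid D mangled name, so this is safe on any input.
string pretty(string sym)
{
    return demangle(sym);
}

void main()
{
    // Runtime symbols copied verbatim from the valgrind trace.
    immutable string[] symbols = [
        "_D2gc3gcx3Gcx11fullcollectMFZm",
        "_D2gc3gcx2GC18fullCollectNoStackMFZv",
    ];

    foreach (sym; symbols)
        writeln(sym, "\n    -> ", pretty(sym));
}
```

Read this way, the trace shows a field destructor being run by rt_finalize2 from a full collection inside gc_term, that is, during runtime shutdown.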
December 02, 2012
Re: Need help with debugging Segfault
On 12/01/2012 08:44 PM, d coder wrote:

> ==4453== Invalid read of size 8
> ==4453==    at 0x44EFF5: _D4nett5mule5Mule3esl6__dtorMFZv
> (../src/nett/mule.d:115)

Are you accessing any GC-managed resource in Mule's destructor? If so, it 
is possible that the resource has already been finalized by the time the 
destructor runs.

The destruction order of GC-maintained resources is not deterministic as 
e.g. in C++. It is quite possible that the member of an object is 
destroyed before the object itself.
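
A minimal sketch of the pattern in question, using hypothetical `Parent` and `Child` classes that are not taken from the poster's code. The destructor below reads a GC-managed member. That is valid only when destruction is triggered explicitly (for example with destroy), because then the destructor is not run by the collector and the reference is still valid. If the collector runs the same destructor, `child` may already have been finalized.

```d
import std.stdio : writeln;

// Hypothetical types for illustration only.
class Child
{
    int value = 7;
}

class Parent
{
    Child child;                 // reference to another GC-managed object

    this() { child = new Child; }

    ~this()
    {
        // If the GC runs this destructor, `child` may already have been
        // finalized, so this read is NOT safe in that case. It is valid
        // only when the destructor is triggered explicitly.
        writeln("Parent destroyed, child.value = ", child.value);
    }
}

void main()
{
    auto p = new Parent;

    // Explicit destruction: the destructor runs here, while `child` is
    // still reachable and therefore still valid.
    destroy(p);
}
```

Removing the destroy call leaves the destructor to the collector at shutdown, which is the situation the trace appears to show.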

Ali
December 02, 2012
Re: Need help with debugging Segfault

Besides accessing objects already reclaimed by the GC in class 
destructors, you may also encounter segfaults with structs (see 
http://forum.dlang.org/thread/50B3859D.7060900@webdrake.net  and 
http://forum.dlang.org/thread/mailman.2410.1354281296.5162.digitalmars-d@puremagic.com).
December 02, 2012
Re: Need help with debugging Segfault

You have two pretty powerful tools to help you isolate and debug this; 
they are not advertised enough, IMHO:

1) DustMite, a tool that automatically reduces test cases. It has been 
used successfully several times here.

2) The deterministic replay engine (or rather, debugging environment) 
I mentioned here: 
http://forum.dlang.org/post/ymfxuozenafnsvuipnjr@forum.dlang.org 
To tell the truth, I haven't tried it yet, but it does look pretty 
powerful. I'm really surprised no one seems interested, as it should 
allow one to reproduce a bug exactly, even on another computer, down to 
threading and GC issues.

Maybe these can help.
December 03, 2012
Re: Need help with debugging Segfault
> 1) DustMite a tool which allows to automatically reduce test cases. It has
> been used with success several times here.
>

Awesome!

The tool took about 2 hours to reduce my test case to fewer than 50 lines.

I have filed it as a regression:
http://d.puremagic.com/issues/show_bug.cgi?id=9111
December 03, 2012
Re: Need help with debugging Segfault
On 12/3/2012 12:56 PM, d coder wrote:
>
>     1) DustMite a tool which allows to automatically reduce test cases.
>     It has been used with success several times here.
>
>
> Awesome!
>
> The tool took some 2 hours to reduce my testcase to less than 50 lines.
>
> I have filed a regression.
> http://d.puremagic.com/issues/show_bug.cgi?id=9111

But it's not a bug. Like Ali said:

The destruction order of GC-maintained resources is not deterministic as 
e.g. in C++. It is quite possible that the member of an object is 
destroyed before the object itself.

-- 
Dmitry Olshansky
December 03, 2012
Re: Need help with debugging Segfault
> But it's not a bug. Like Ali said:
>
> The destruction order of GC-maintained resources is not deterministic as
> e.g. in C++. It is quite possible that the member of an object is destroyed
> before the object itself.


Oops. I get it now.

What should be done to avoid this situation? I think I need to add a
destructor to the parent class that makes sure such child objects (the ones
that need the parent to be alive when they are finalized) are destroyed
before the GC kicks in. Would that be sufficient, or would the GC still
finalize these objects together in an indeterminate order? In that case, I
will need to introduce a finalize() function that has to be called
explicitly.

Thanks and Regards
- Puneet
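
A sketch of the explicit-cleanup idea raised above, under stated assumptions: `Parent`, `Child`, `dispose` and `release` are hypothetical names, and this is one possible arrangement rather than a definitive fix. A destructor on the parent does not help when the collector is the one running it, since the child references may already be invalid at that point. A method that the program calls itself runs while both objects are still reachable, so the order is under the program's control.

```d
// Hypothetical names throughout; not the poster's code.
class Child
{
    bool released;
    void release() { released = true; }
}

class Parent
{
    private Child child;
    private bool disposed;

    this() { child = new Child; }

    // Explicit, deterministic cleanup. The caller invokes this while both
    // parent and child are still reachable, so the child is released first
    // and the parent is guaranteed to be alive.
    void dispose()
    {
        if (disposed)
            return;            // safe to call more than once
        child.release();
        disposed = true;
    }

    ~this()
    {
        // Deliberately does not touch `child`: if the GC runs this
        // destructor, `child` may already have been finalized.
    }
}

void main()
{
    auto p = new Parent;
    scope (exit) p.dispose();  // runs when main's scope ends, before the runtime shuts down

    // ... use p ...
}
```

The cost of this design is that every owner has to remember to call dispose(); scope (exit) at the point of creation is one way to make that harder to forget.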