Understanding SIGSEGV issues
January 03, 2019
So I have a D program that used to work. I come back to it, recompile it, and:

|> dub run -- ~/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
Performing "debug" build using /usr/bin/ldc2 for x86_64.
libdvbv5_d 0.1.1: target for configuration "library" is up to date.
dvb-tune ~master: target for configuration "application" is up to date.
To force a rebuild of up-to-date targets, run again with --force.
Running ./bin/dvb-tune /home/users/russel/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
Device: Silicon Labs Si2168, adapter  0, frontend  0
Program exited with code -11

(gdb) r ~/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
Starting program: /home/users/russel/Repositories/Git/Masters/Public/DVBTune/bin/dvb-tune ~/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Device: Silicon Labs Si2168, adapter  0, frontend  0

Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0xa) at malloc.c:3093
3093	malloc.c: No such file or directory.
(gdb)

Can anyone give me any hints as to where to start even getting a glimmer of an understanding of WTF is going on?

-- 
Russel.
===========================================
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk



January 03, 2019
On Thursday, 3 January 2019 at 06:25:46 UTC, Russel Winder wrote:
> So I have a D program that used to work. I come back to it, recompile it, and:
>
> [...]

> __GI___libc_free (mem=0xa) at malloc.c:3093

You've tried to free a pointer that, while not null, was derived from a pointer that was, i.e. an offset to a field of a struct.

A backtrace would help a lot, otherwise it really is just guessing.
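
As a rough illustration of that kind of bug (a sketch only; `Record` and its fields are hypothetical and have nothing to do with libdvbv5), taking the address of a field through a null struct pointer typically yields a small, non-null value equal to the field's offset. Handing such a value to `free` would fault in much the same way as the trace above. This is not behaviour to rely on; it only shows where a tiny address like `0xa` can come from.

```d
import std.stdio : writefln;

// Hypothetical struct, for illustration only.
struct Record
{
    long id;      // first field
    void* buffer; // lives at a small, non-zero offset
}

// Address of `buffer` inside a Record that was never allocated.
size_t addressThroughNull()
{
    Record* r = null;
    // Pure address arithmetic: nothing is loaded through `r` here.
    return cast(size_t) &r.buffer;
}

void main()
{
    // The "pointer" is non-null but is really just the field offset.
    assert(addressThroughNull() == Record.buffer.offsetof);
    writefln("field address via null pointer: 0x%x", addressThroughNull());
}
```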
January 03, 2019
On Thu, 2019-01-03 at 07:52 +0000, Nicholas Wilson via Digitalmars-d-learn wrote:
> On Thursday, 3 January 2019 at 06:25:46 UTC, Russel Winder wrote:
> > So I have a D program that used to work. I come back to it, recompile it, and:
> > 
> > [...]
> > __GI___libc_free (mem=0xa) at malloc.c:3093
> 
> You've tried to free a pointer that, while not null, was derived from a pointer that was, i.e. an offset to a field of a struct.
> 
> A backtrace would help a lot, otherwise it really is just guessing.


Sorry about that, fairly obvious that the backtrace is needed in hindsight. :- )

#0  __GI___libc_free (mem=0xa) at malloc.c:3093
#1  0x000055555558f174 in dvb_file_free (dvb_file=0x5555555a1320) at dvb_file.d:282
#2  0x000055555558edcc in types.File_Ptr.~this() (this=...) at types.d:83
#3  0x0000555555574809 in channels.TransmitterData.__fieldDtor() (this=<error reading variable: Cannot access memory at address 0xa>) at channels.d:144
#4  0x000055555556aeda in channels.TransmitterData.__aggrDtor() (this=...) at channels.d:144
#5  0x000055555556ab53 in D main (args=...) at main.d:33

Which indicates that the destructor is being called before the instance has been constructed. Which is a real WTF.


-- 
Russel.



January 03, 2019
On Thursday, 3 January 2019 at 08:35:17 UTC, Russel Winder wrote:
> Sorry about that, fairly obvious that the backtrace is needed in hindsight. :- )
>
> #0  __GI___libc_free (mem=0xa) at malloc.c:3093
> #1  0x000055555558f174 in dvb_file_free (dvb_file=0x5555555a1320) at dvb_file.d:282
> #2  0x000055555558edcc in types.File_Ptr.~this() (this=...) at types.d:83
> #3  0x0000555555574809 in channels.TransmitterData.__fieldDtor() (this=<error reading variable: Cannot access memory at address 0xa>) at channels.d:144
> #4  0x000055555556aeda in channels.TransmitterData.__aggrDtor() (this=...) at channels.d:144
> #5  0x000055555556ab53 in D main (args=...) at main.d:33
>
> Which indicates that the destructor is being called before the instance has been constructed. Which is a real WTF.

Not quite. This occurs as a TransmitterData object goes out of scope at the end of main (stick an fflush'ed printf there to see):

TransmitterData is a struct that has no destructor defined but has a field of type File_Ptr that does. The compiler generates a destructor, __aggrDtor, which calls the fields that have destructors, __fieldDtor (e.g. the File_Ptr) which in turn calls its destructor, File_Ptr.~this().

As you can see from the stack trace #3, the File_Ptr is null. The solution to this is to either ensure it is initialised in the constructor of TransmitterData, or account for it possibly being null by defining a destructor for TransmitterData.
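
A minimal sketch of the second option, assuming a wrapper whose release function must never see null. The names `Handle`, `Owner`, and `releaseResource` are made up for the example and are not the libdvbv5 or DVBTune API. `Owner` has no destructor of its own, so the compiler generates one that runs the destructor of its `Handle` field, even when that field was never initialised.

```d
import std.stdio : writeln;

int releases; // counts calls to the stand-in release function

// Stand-in for a C release function that must not be given null
// (hypothetical; not the libdvbv5 API).
void releaseResource(int* p)
{
    assert(p !is null, "release called with null");
    ++releases;
}

struct Handle
{
    int* ptr;

    ~this()
    {
        // A default-initialised Handle has ptr == null, so guard first.
        if (ptr !is null)
        {
            releaseResource(ptr);
            ptr = null;
        }
    }
}

// No destructor written here: the compiler generates one that runs
// the destructor of every field that has one.
struct Owner
{
    Handle handle;
}

void main()
{
    int resource;
    {
        Owner empty; // handle.ptr is null
    }                // generated destructor runs, the guard skips the release
    assert(releases == 0);
    {
        Owner full;
        full.handle.ptr = &resource;
    }                // the resource is released exactly once
    assert(releases == 1);
    writeln("releases: ", releases);
}
```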
January 05, 2019
On Thu, 2019-01-03 at 11:23 +0000, Nicholas Wilson via Digitalmars-d-learn wrote:
> On Thursday, 3 January 2019 at 08:35:17 UTC, Russel Winder wrote:
> > Sorry about that, fairly obvious that the backtrace is needed in hindsight. :- )
> > 
> > #0  __GI___libc_free (mem=0xa) at malloc.c:3093
> > #1  0x000055555558f174 in dvb_file_free (dvb_file=0x5555555a1320) at dvb_file.d:282
> > #2  0x000055555558edcc in types.File_Ptr.~this() (this=...) at types.d:83
> > #3  0x0000555555574809 in channels.TransmitterData.__fieldDtor() (this=<error reading variable: Cannot access memory at address 0xa>) at channels.d:144
> > #4  0x000055555556aeda in channels.TransmitterData.__aggrDtor() (this=...) at channels.d:144
> > #5  0x000055555556ab53 in D main (args=...) at main.d:33
> > 
> > Which indicates that the destructor is being called before the instance has been constructed. Which is a real WTF.
> 
> Not quite. This occurs as a TransmitterData object goes out of scope at the end of main (stick an fflush'ed printf there to see):

I am not sure this analysis is correct. The code never reaches the end of main.

> TransmitterData is a struct that has no destructor defined but has a field of type File_Ptr that does. The compiler generates a destructor, __aggrDtor, which calls the fields that have destructors, __fieldDtor (e.g. the File_Ptr) which in turn calls its destructor, File_Ptr.~this().

TransmitterData has a destructor defined but with no code in it. This used to work fine – but I cannot be certain which version of LDC that was.

The problem does seem to be in the construction of the TransmitterData object, because a destructor is being called on the File_Ptr field as part of the TransmitterData constructor.

> As you can see from the stack trace #3, the File_Ptr is null. The solution to this is to either ensure it is initialised in the constructor of TransmitterData, or account for it possibly being null by defining a destructor for TransmitterData.

For some reason it seems File_Ptr.~this() is being called before
File_Ptr.this() in the TransmitterData.this(). This is totally weird.

Having added some writeln statements:

(gdb) bt
#0  0x00005555555932e0 in dvb_file_free (dvb_file=0x0) at dvb_file.d:276
#1  0x0000555555592fbc in types.File_Ptr.~this() (this=...) at types.d:83
#2  0x000055555558cdf6 in _D3std6format__T14formattedWriteTSQBg5stdio4File17LockingTextWriterTaTS5types8File_PtrZQCtFKQChxAaQBcZk (w=..., fmt=..., _param_2=...) at /usr/lib/ldc/x86_64-linux-gnu/include/d/std/format.d:472
#3  0x000055555558c6a8 in _D3std5stdio4File__T5writeTAyaTS5types8File_PtrTaZQBeMFQBcQBbaZv (this=..., _param_0=..., _param_1=..., _param_2=10 '\n') at channels.d:1586
#4  0x00005555555749ce in _D3std5stdio__T7writelnTAyaTS5types8File_PtrZQBeFQzQxZv (_param_0=..., _param_1=...) at channels.d:3917
#5  0x000055555556af49 in _D8channels15TransmitterData6__ctorMFNcxAyaxkxE10libdvbv5_d8dvb_file16dvb_file_formatsZSQDiQDc (this=..., path=..., delsys=0, format=libdvbv5_d.dvb_file.dvb_file_formats.FILE_DVBV5) at channels.d:143
#6  0x000055555556aa9c in D main (args=...) at main.d:34


-- 
Russel.



January 05, 2019
On Saturday, 5 January 2019 at 07:34:17 UTC, Russel Winder wrote:
> TransmitterData has a destructor defined but with no code in it. This used to work fine – but I cannot be certain which version of LDC that was.
>
> The problem does seem to be in the construction of the TransmitterData object, because a destructor is being called on the File_Ptr field as part of the TransmitterData constructor.
>
>> As you can see from the stack trace #3, the File_Ptr is null. The solution to this is to either ensure it is initialised in the constructor of TransmitterData, or account for it possibly being null by defining a destructor for TransmitterData.
>
> For some reason it seems File_Ptr.~this() is being called before
> File_Ptr.this() in the TransmitterData.this(). This is totally weird.
>
> Having added some writeln statements:
>
> (gdb) bt
> #0  0x00005555555932e0 in dvb_file_free (dvb_file=0x0) at dvb_file.d:276
> #1  0x0000555555592fbc in types.File_Ptr.~this() (this=...) at types.d:83
> #2  0x000055555558cdf6 in _D3std6format__T14formattedWriteTSQBg5stdio4File17LockingTextWriterTaTS5types8File_PtrZQCtFKQChxAaQBcZk (w=..., fmt=..., _param_2=...) at /usr/lib/ldc/x86_64-linux-gnu/include/d/std/format.d:472

Maybe it is a problem with copying a File_Ptr (e.g. missing an increase of the reference count)? Like, `auto a = File_Ptr(); { auto b = a; }` and b calls the destructor on scope exit.
That would be consistent with having problems copying the object to pass to writeln.
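
A small sketch of that failure mode, under the assumption that the wrapper has a destructor but neither a postblit nor a reference count. `NaivePtr` and `show` are hypothetical names, and the counter stands in for the real C free call. Passing the wrapper by value, as a call to writeln would, creates a copy whose destructor runs when the call returns, so the same resource is released twice.

```d
import std.stdio : writeln;

int frees; // how many times the stand-in resource was "freed"

// Hypothetical wrapper with a destructor but no postblit or reference count.
struct NaivePtr
{
    int* resource;

    ~this()
    {
        if (resource !is null)
            ++frees; // a real wrapper would call the C free function here
    }
}

// By-value parameter: the copy is destroyed when the call returns.
void show(NaivePtr p) {}

int countFrees()
{
    frees = 0;
    int target;
    {
        auto a = NaivePtr(&target);
        show(a); // the copy is destroyed after the call: first "free"
    }            // a is destroyed: second "free" of the same resource
    return frees;
}

void main()
{
    assert(countFrees() == 2);
    writeln("frees for one resource: ", countFrees());
}
```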
January 05, 2019
On Sat, 2019-01-05 at 10:31 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
[…]
> 
> Maybe it is a problem with copying a File_Ptr (e.g. missing an
> increase of the reference count)? Like, `auto a = File_Ptr(); {
> auto b = a; }` and b calls the destructor on scope exit.
> That would be consistent with having problems copying the object
> to pass to writeln.

I found the problem and then two minutes later read your email and bingo we have found the problem.

Previously I had used File_Ptr* and on this occasion I was using File_Ptr and there was no copy constructor because I have @disable this(this). Except that clearly copying a value is not copying a value in this case. Clearly this situation is what is causing the destructor to be called on an unconstructed value. But I have no idea why.

The question now, of course, is should I have been using File_Ptr instead of File_Ptr* in the first place. I am beginning to think I should have been. More thinking needed.
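
For reference, a sketch of what `@disable this(this)` normally enforces, using a hypothetical `UniquePtr` type rather than the real File_Ptr: copies of a named value are rejected at compile time, while passing by `ref` and explicit transfer with `std.algorithm.mutation.move` remain available. It does not explain the behaviour described above; it only shows the usual rules.

```d
import std.algorithm.mutation : move;

// Hypothetical non-copyable wrapper.
struct UniquePtr
{
    int* resource;

    @disable this(this); // forbid copies

    ~this() {}
}

void byValue(UniquePtr p) {}
void byRef(ref UniquePtr p) {}

void main()
{
    int target;
    auto a = UniquePtr(&target);

    // Copying a named value is rejected at compile time.
    static assert(!__traits(compiles, { auto b = a; }));
    static assert(!__traits(compiles, byValue(a)));

    // Passing by reference makes no copy, so it is accepted.
    static assert(__traits(compiles, byRef(a)));

    // Explicit transfer: the source is reset to UniquePtr.init.
    auto c = move(a);
    assert(a.resource is null);
    assert(c.resource is &target);
}
```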

-- 
Russel.



January 05, 2019
On Saturday, 5 January 2019 at 10:52:48 UTC, Russel Winder wrote:
> I found the problem and then two minutes later read your email and bingo we have found the problem.

Well done.

> Previously I had used File_Ptr* and on this occasion I was using File_Ptr and there was no copy constructor because I have @disable this(this). Except that clearly copying a value is not copying a value in this case. Clearly this situation is what is causing the destructor to be called on an unconstructed value. But I have no idea why.

Could you post a minimised example? It's a bit hard to guess without one.

> The question now, of course, is should I have been using File_Ptr instead of File_Ptr* in the first place. I am beginning to think I should have been. More thinking needed.

From the name, File_Ptr sounds like it is wrapping a reference to a resource. So compare with C's FILE and D's File, which is a reference-counted wrapper of a FILE*. Would you ever use a File* (or a FILE**)? Probably not, I never have.
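
A hand-rolled sketch of the reference-counting idea, assuming the wrapped resource should be released once the last copy goes away. `SharedHandle` and its `Control` block are hypothetical and are not how std.stdio.File or File_Ptr are actually implemented. The postblit bumps the count on every copy, and the destructor releases the resource only when the count reaches zero.

```d
import core.stdc.stdlib : calloc, free;
import std.stdio : writeln;

int releases; // counts releases of the underlying resource

struct SharedHandle
{
    private static struct Control
    {
        int* resource;
        size_t refs;
    }

    private Control* ctl;

    this(int* resource)
    {
        ctl = cast(Control*) calloc(1, Control.sizeof);
        assert(ctl !is null, "out of memory");
        ctl.resource = resource;
        ctl.refs = 1;
    }

    // Runs after each copy: one more owner of the same control block.
    this(this)
    {
        if (ctl !is null)
            ++ctl.refs;
    }

    ~this()
    {
        if (ctl is null)
            return; // default-initialised handle: nothing to release
        if (--ctl.refs == 0)
        {
            ++releases; // real code would call the C release function here
            free(ctl);
        }
        ctl = null;
    }
}

int releasesAfterCopies()
{
    releases = 0;
    int target;
    {
        auto a = SharedHandle(&target);
        {
            auto b = a; // count goes to 2
        }               // b destroyed, count back to 1
        assert(releases == 0);
    }                   // a destroyed, count 0: released once
    return releases;
}

void main()
{
    assert(releasesAfterCopies() == 1);
    writeln("releases: ", releasesAfterCopies());
}
```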

January 05, 2019
On Sat, 2019-01-05 at 10:52 +0000, Russel Winder wrote:
> On Sat, 2019-01-05 at 10:31 +0000, Nicholas Wilson via Digitalmars-d-learn
> wrote:
> […]
> > Maybe it is a problem with copying a File_Ptr (e.g. missing an
> > increase of the reference count)? Like, `auto a = File_Ptr(); {
> > auto b = a; }` and b calls the destructor on scope exit.
> > That would be consistent with having problems copying the object
> > to pass to writeln.
> 
> I found the problem and then two minutes later read your email and bingo we have found the problem.
> 
> Previously I had used File_Ptr* and on this occasion I was using File_Ptr and there was no copy constructor because I have @disable this(this). Except that clearly copying a value is not copying a value in this case. Clearly this situation is what is causing the destructor to be called on an unconstructed value. But I have no idea why.
> 
> The question now, of course, is should I have been using File_Ptr instead of File_Ptr* in the first place. I am beginning to think I should have been. More thinking needed.

Switching to using File_Ptr* I now get the SIGSEGV at the end of main as you were thinking before. Oh f###.

This code used to work. :-(

-- 
Russel.



January 05, 2019
On Sat, 2019-01-05 at 11:30 +0000, Nicholas Wilson via Digitalmars-d-learn wrote:
> 
[…]
> Could you post a minimised example? It's a bit hard to guess without one.

Indeed. I should do that to see if I can reproduce the problem to submit a proper bug report.

[…]
>  From the name, File_Ptr sounds like it is wrapping a reference to
> a resource. So compare with C's FILE/ D's File which is a
> reference counted wrapper of a FILE*. Would you ever use a File*
> (or a FILE**)? Probably not, I never have.

File_Ptr is wrapping a dvb_file * from libdvbv5 to try and make things a bit nicer for D and to ensure RAII. libdvbv5 is a C API with the classic C approach to handling objects and data structures.

My binding (generated with DStep, then edited by hand) is at https://github.com/russel/libdvbv5_d and the application using it (which is causing the problems) is at https://github.com/russel/DVBTune

I have a feeling that I am really not doing things in a D idiomatic way.

-- 
Russel.


