September 03, 2012 Trouble understanding crash when class is returned by value from C++
I'm trying to add at least *some* type of pass-by-value support for C++ classes when wrapping C++ libraries for D. I figured I could fake a value class by using a D struct with a thunk field which matches the size of the C++ object.

Returning a C++ object by value works in this plain C++ example (using g++ on win32):

test.cpp: http://codepad.org/55pttk3I

$ g++ -m32 -g test.cpp -o main.exe -lstdc++
$ main.exe

If I take the same code but remove main and instead use a D driver app like so:

test.cpp: http://codepad.org/ZqieSXrb
main.d: http://codepad.org/6E5sbc7e

I compile it:

$ g++ -m32 -g -c ./test.cpp -o test.obj
$ gdc -m32 -g main.d test.obj -o main.exe -lstdc++
$ main.exe

and then I get a crash:

The instruction at "0x6fc8ea39" referenced memory at "0x006f6f62". The memory could not be "read".

GDB tells me:

Program received signal SIGSEGV, Segmentation fault.
0x6fc8ea39 in libstdc++-6!_ZNSsC1ERKSs () from C:\MinGW\bin\libstdc++-6.dll

If I replace the std::string field with an ordinary 'char*' the crash is gone, so my wild guess is the crash happens in one of std::string's special member functions (ctor/dtor/etc.).

C++ sizeof() tells me FileName is 4 bytes long, so I've matched that in the fake D struct. If I increase the 'thunk' field to 9 bytes the crash disappears. I have a hunch stack corruption might be to blame.

I can notice some difference in the ASM listings:

C++ plain sample: http://pastebin.com/xw3BhwwR
D driver sample: http://pastebin.com/TLa8k5A3

The suspicious thing there is the missing LEA instruction in the D listing. If I change the thunk field to 9 bytes the LEA instruction appears again (and this is when the crash disappears). My ASM-fu is really weak though, so I don't know what any of this means.

Anyone know what's going on?
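To make the setup easier to follow, here is a minimal sketch of the two types involved. The linked test.cpp is not quoted in the thread, so the shape of `FileName` (a single `std::string` member), the helper `copy_of`, and the demo function are assumptions made for illustration only. The sketch shows that a byte-array stand-in can match the class in size while still being a different kind of type as far as the compiler is concerned.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Assumed shape of the wrapped class: the thread only says it holds a std::string.
class FileName {
public:
    explicit FileName(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// C++ mirror of the D-side stand-in: raw bytes of the same size as FileName.
struct Fake {
    char thunk[sizeof(FileName)];
};

// Same size, but the two types are classified differently.
static_assert(sizeof(Fake) == sizeof(FileName), "sizes match by construction");
static_assert(std::is_trivially_copyable<Fake>::value, "Fake is plain bytes");
static_assert(!std::is_trivially_copyable<FileName>::value,
              "FileName has a non-trivial copy constructor and destructor");

// Hypothetical helper: returning by value runs std::string's copy constructor.
FileName copy_of(const FileName& source) { return source; }

// Small self-check of the by-value copy inside pure C++.
void copy_demo() {
    FileName original("example.txt");
    FileName duplicate = copy_of(original);
    assert(duplicate.name() == original.name());
}
```

Matching `sizeof` alone therefore says nothing about whether both compilers will agree on how the value is returned.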

September 03, 2012 Re: Trouble understanding crash when class is returned by value from C++
Posted in reply to Andrej Mitrovic

My best guess is that the issue is related to the struct being 4 bytes. A similar segfault occurs if you attempt to access it in a similar manner using C++.

A 4-byte struct will fit into a single register, making pointers unnecessary and slower, and it's likely some part of the ABI has taken this into consideration and the compiler is optimizing access to it.

However, I would imagine that such optimizations would not be allowed with C++, so a class requires a pointer type and not the optimized struct.

The following returns the value 4 when I inspect the variable refValue, instead of the correct address, and segfaults.

http://codepad.org/eepFTfbX

September 03, 2012 Re: Trouble understanding crash when class is returned by value from C++
Posted in reply to Daniel Green

On 3 September 2012 16:52, Daniel Green <venix1@gmail.com> wrote:
> My best guess, is the issue is related to the struct being 4 bytes.
> A similar segfault occurs if you attempt to access in a similar manner using
> c++.
>
> A 4 byte struct will fit into a single register making pointers unnecessary/slower and it's likely some part of the ABI has taken this into consideration and the compiler is optimizing access to this.
>
> However, I would imagine that such optimizations would not be allowed with C++ and so by using a class it requires a pointer type and not the optimized struct.
>
> The following returns the value of 4 when I inspect the variable refValue instead of the correct address and segfaults.
>
> http://codepad.org/eepFTfbX

Indeed, C++ classes are always passed in memory by design, whereas pointers can be passed in registers. The difference in ABI handling between void* and FileName* matters a lot here. This is one reason why you need to ensure that function signatures match in both D and C/C++ code.

extern "C"
FileName value_FileName(void* refVal)
{
    return *(FileName*)refVal;
}

By the way, why extern "C" when extern (C++) works just fine? :-)

Regards
--
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
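The "passed in memory" behaviour can also be spelled out by hand in the glue layer, so that neither side depends on how the compiler chooses to return the class. The following is only a sketch of that idea, not something proposed in the thread: the `FileName` definition and the function names `copy_FileName_into`, `FileName_c_str`, `destroy_FileName`, and `round_trip_demo` are all hypothetical. The caller supplies the storage and the glue function copy-constructs into it.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical stand-in for the wrapped class; the thread's test.cpp is not quoted.
class FileName {
public:
    explicit FileName(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Copy-construct *refVal into caller-provided storage and return the new object.
// The storage must hold at least sizeof(FileName) bytes and be suitably aligned.
extern "C" void* copy_FileName_into(const void* refVal, void* storage) {
    return new (storage) FileName(*static_cast<const FileName*>(refVal));
}

// Read-only accessor so a caller can inspect the copy without knowing its layout.
extern "C" const char* FileName_c_str(const void* obj) {
    return static_cast<const FileName*>(obj)->name().c_str();
}

// Run the destructor in place; the caller still owns the raw storage.
extern "C" void destroy_FileName(void* obj) {
    static_cast<FileName*>(obj)->~FileName();
}

// Round trip, written the way a foreign caller holding a byte buffer might do it.
void round_trip_demo() {
    FileName original("example.txt");
    alignas(FileName) unsigned char storage[sizeof(FileName)];
    void* copy = copy_FileName_into(&original, storage);
    assert(std::string(FileName_c_str(copy)) == "example.txt");
    destroy_FileName(copy);
}
```

Every exported signature here uses only pointers, which is the case where the D and C++ declarations are easy to keep identical.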

September 03, 2012 Re: Trouble understanding crash when class is returned by value from C++
On 9/3/12, Iain Buclaw <ibuclaw@ubuntu.com> wrote:
> Indeed, C++ classes are always passed in memory by design. Whereas
> pointers could be passed in registers.

That's cool. I learn something new every day. :)

> And this is one
> reason why you need to ensure that function signatures match in both D
> and C/C++ code.

Yeah, that's doable when the type is a POD, but when it's a class returned by value there is no equivalent in D since D classes are always references, so I can't match the D function signature to the C one.

> extern "C"
> FileName value_FileName(void* refVal)
> {
>     return *(FileName*)refVal;
> }

That won't work either, since FileName is still in the return type and I can't match the function signature on the D side (it still crashes). The only thing I can think of is to match the C++ function signature to the D side via something like:

C++:
class FileName { ... }  // same as before
struct Fake { char __thunk[4]; };

Fake value_FileName(void* refVal)
{
    return *(Fake*)(&(*(FileName*)refVal));
}

It's ugly, but it does seem to work and matches the D function signature. It would be a lot simpler if the return type was castable to (char[4]), but C/C++ doesn't support returning arrays by value. :)

> By the way, why extern "C" when extern (C++) works just fine? :-)

I'm working on a code generator which uses C as the glue language, similar to SWIG. But the plan is to support more features than SWIG and have a faster and less memory-intensive cross-language virtual method invocation mechanism. Unlike SWIG I support passing PODs by value, but passing non-POD classes by value was problematic, and I can see now why.

Thanks for your help, guys!
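One property of the byte-struct workaround is worth seeing in isolation: it copies the object's bytes without running the class's copy constructor. The sketch below is a variation written for illustration, not the poster's exact code. It swaps the pointer cast for `std::memcpy`, sizes the array with `sizeof(FileName)` instead of a hard-coded 4, and adds a hypothetical `copies` counter and `bitwise_copy_demo` function purely to make the effect observable.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for the wrapped class, with a counter added for the demo.
class FileName {
public:
    explicit FileName(const std::string& name) : name_(name) {}
    FileName(const FileName& other) : name_(other.name_) { ++copies; }
    const std::string& name() const { return name_; }
    static int copies;  // number of copy-constructor calls so far
private:
    std::string name_;
};
int FileName::copies = 0;

// Plain-bytes type whose signature is easy to mirror as a D struct.
struct Fake {
    char thunk[sizeof(FileName)];
};

// Returns the raw bytes of the object; no FileName constructor is involved.
extern "C" Fake value_FileName(void* refVal) {
    Fake result;
    std::memcpy(result.thunk, refVal, sizeof(FileName));
    return result;
}

// Shows that the returned bytes equal the original's and that nothing was copy-constructed.
void bitwise_copy_demo() {
    FileName original("example.txt");
    Fake fake = value_FileName(&original);
    assert(FileName::copies == 0);
    assert(std::memcmp(fake.thunk, &original, sizeof(FileName)) == 0);
}
```

Because the result is a bitwise duplicate, any pointers inside it still refer to memory owned by the original object, so the receiving side should treat it as a borrowed view rather than an independent copy.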

September 03, 2012 Re: Trouble understanding crash when class is returned by value from C++
On 3 September 2012 18:15, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> On 9/3/12, Iain Buclaw <ibuclaw@ubuntu.com> wrote:
>> Indeed, C++ classes are always passed in memory by design. Whereas
>> pointers could be passed in registers.
>
> That's cool. I learn something new every day. :)
>
>> And this is one
>> reason why you need to ensure that function signatures match in both D
>> and C/C++ code.
>
> Yeah that's doable when the type is a POD but when it's a class returned by value there is no equivalent in D since D classes are always references, so I can't match the D function signature to the C one.
>
>> extern "C"
>> FileName value_FileName(void* refVal)
>> {
>> return *(FileName*)refVal;
>> }
>
> That won't work either since FileName is still in the return type and I can't match the function signature on the D side (it still crashes). The only thing I can think of is to match the C++ function signature to the D side via something like:
>

Ah, sorry, my bad. I was testing marking D structs as addressable (meaning they are always passed in memory) whilst in the middle of looking at the difference between D and C++ codegen. I must have left that turned on in my copy of gdc. ;-)

--
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';