September 03, 2012
Trouble understanding crash when class is returned by value from C++
I'm trying to add at least *some* type of pass-by-value support for
C++ classes when wrapping C++ libraries for D. I figured I could fake a
value class by using a D struct with a 'thunk' field (an opaque byte
array) that matches the size of the C++ object.

Returning a C++ object by value works in this plain C++ example (using
g++ on win32):
test.cpp: http://codepad.org/55pttk3I
$ g++ -m32 -g test.cpp -o main.exe -lstdc++
$ main.exe

If I take the same code but remove main and instead use a D driver app like so:
test.cpp: http://codepad.org/ZqieSXrb
main.d: http://codepad.org/6E5sbc7e

I compile it:
$ g++ -m32 -g -c ./test.cpp -o test.obj
$ gdc -m32 -g main.d test.obj -o main.exe -lstdc++
$ main.exe

and then I get a crash:
The instruction at "0x6fc8ea39" referenced memory at "0x006f6f62". The
memory could not be "read".

GDB tells me:
Program received signal SIGSEGV, Segmentation fault.
0x6fc8ea39 in libstdc++-6!_ZNSsC1ERKSs ()
  from C:\MinGW\bin\libstdc++-6.dll

If I replace the std::string field with an ordinary 'char*' the crash
is gone, so my wild guess is the crash happens in one of std::string's
special member functions (ctor/dtor/etc..).

C++ sizeof() tells me FileName is 4 bytes long, so I've matched that
in the fake D struct. If I increase the 'thunk' field to 9 bytes the
crash disappears. I have a hunch stack corruption might be to blame.
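
As an aside, one way to keep the thunk size from drifting out of sync is to let the C++ side report the real size and alignment, so the D driver can compare them with its struct at startup. This is only a sketch: the FileName below is a stand-in for the class in the linked test.cpp, and sizeof_FileName / alignof_FileName are hypothetical helper names. Note that a matching size says nothing about how the value is passed or returned.

```cpp
#include <cstddef>
#include <string>

// Stand-in for the class in test.cpp (assumed layout: one std::string field).
class FileName
{
public:
    explicit FileName(const std::string& s) : str(s) {}
    std::string str;
};

// Hypothetical helpers: expose the real size and alignment through the C glue
// so the D side can compare them against its fake struct.
extern "C" std::size_t sizeof_FileName()
{
    return sizeof(FileName);
}

extern "C" std::size_t alignof_FileName()
{
    return alignof(FileName);
}
```

On the D side these could be declared as extern (C) functions returning size_t and checked against the fake struct's .sizeof and .alignof before any value is exchanged.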

I can notice some difference in the ASM listings:
C++ plain sample: http://pastebin.com/xw3BhwwR
D driver sample: http://pastebin.com/TLa8k5A3

The suspicious thing there is the missing LEA instruction in the D
listing. If I change the thunk field to 9 bytes the LEA instruction
appears again (and this is when the crash disappears).

My ASM-fu is really weak though, so I don't know what any of this
means. Does anyone know what's going on?
September 03, 2012
Re: Trouble understanding crash when class is returned by value from C++
My best guess is that the issue is related to the struct being 4 bytes.
A similar segfault occurs if you attempt to access it in a similar manner
using C++.

A 4-byte struct fits into a single register, which makes passing it
through a pointer unnecessary and slower. It's likely that some part of
the ABI takes this into account and that the compiler optimizes access
accordingly.

However, I would imagine that such optimizations are not allowed for
C++ classes, so a class has to go through a pointer rather than the
optimized struct path.

In the following, refValue holds the value 4 instead of the correct
address when I inspect it, and the program segfaults.

http://codepad.org/eepFTfbX
September 03, 2012
Re: Trouble understanding crash when class is returned by value from C++


Indeed, C++ classes are always passed in memory by design. Whereas
pointers could be passed in registers. The difference between the ABI
handling of void* and FileName* here matters a lot. And this is one
reason why you need to ensure that function signatures match in both D
and C/C++ code.

extern "C"
FileName value_FileName(void* refVal)
{
   return *(FileName*)refVal;
}

By the way, why extern "C" when extern (C++) works just fine? :-)
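
To make the register-versus-memory distinction a bit more concrete, here is a minimal sketch (assuming a C++11 compiler) that asks the compiler whether a type can be copied as plain bytes. The FileName below is a stand-in for the class in the linked test.cpp, and is_bitwise_copyable is a hypothetical helper name. The trait is only a rough approximation of the rule the ABI actually applies, which looks at the copy/move constructors and the destructor.

```cpp
#include <string>
#include <type_traits>

// Stand-in for the class in test.cpp (assumed: it holds a std::string).
class FileName
{
public:
    explicit FileName(const std::string& s) : str(s) {}
    std::string str;
};

// The placeholder type as the D side sees it: just raw bytes.
struct Fake
{
    char thunk[4];
};

// Hypothetical helper: true when copying T is a plain byte copy, which is
// roughly the kind of type a compiler may return in registers.
template <typename T>
constexpr bool is_bitwise_copyable()
{
    return std::is_trivially_copyable<T>::value;
}

// Fake is plain data; FileName is not, because std::string has a
// user-provided copy constructor and destructor.
static_assert(is_bitwise_copyable<Fake>(), "Fake should be plain data");
static_assert(!is_bitwise_copyable<FileName>(), "FileName is not plain data");
```

Two types can have the same size and still fall on opposite sides of this check, which is why a size-matched struct on the D side does not guarantee a matching calling convention.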


Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
September 03, 2012
Re: Trouble understanding crash when class is returned by value from C++
On 9/3/12, Iain Buclaw <ibuclaw@ubuntu.com> wrote:
> Indeed,  C++ classes are always passed in memory by design.  Whereas
> pointers could be passed in registers.

That's cool. I learn something new every day. :)

> And this is one
> reason why you need to ensure that function signatures match in both D
> and C/C++ code.

Yeah that's doable when the type is a POD but when it's a class
returned by value there is no equivalent in D since D classes are
always references, so I can't match the D function signature to the C
one.

> extern "C"
> FileName value_FileName(void* refVal)
> {
>     return *(FileName*)refVal;
> }

That won't work either since FileName is still in the return type and
I can't match the function signature on the D side (it still crashes).
The only thing I can think of is to match the C++ function signature
to the D side via something like:

C++:
class FileName { ... }; // same as before

struct Fake
{
   char __thunk[4];
};

Fake value_FileName(void* refVal)
{
   return *(Fake*)(&(*(FileName*)refVal));
}

It's ugly but it does seem to work and matches the D function
signature. It would be a lot simpler if the return type was castable
to (char[4]), but C/C++ doesn't support returning arrays by value. :)
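
One caveat with the cast above, as far as I can tell: it copies the raw bytes of the FileName without running its copy constructor, so the value on the D side shares whatever the original std::string points to. A common alternative in C glue layers is to copy onto the heap and hand back an opaque pointer. The following is only a sketch of that idea, not what my generator does; copy_FileName, destroy_FileName and cstr_FileName are hypothetical names, and FileName is again a stand-in for the class in test.cpp.

```cpp
#include <string>

// Stand-in for the class in test.cpp (assumed: it holds a std::string).
class FileName
{
public:
    explicit FileName(const std::string& s) : str(s) {}
    std::string str;
};

extern "C" {

// Runs the real copy constructor and returns an owning opaque pointer.
// Only pointers cross the language boundary, so the signature is easy
// to match on the D side.
void* copy_FileName(const void* refVal)
{
    return new FileName(*static_cast<const FileName*>(refVal));
}

// The caller must release every pointer obtained from copy_FileName.
void destroy_FileName(void* obj)
{
    delete static_cast<FileName*>(obj);
}

// Accessor used here only to inspect the copy; valid until destroy_FileName.
const char* cstr_FileName(const void* obj)
{
    return static_cast<const FileName*>(obj)->str.c_str();
}

} // extern "C"
```

The trade-off is an allocation per returned value and manual lifetime management on the D side, in exchange for not depending on how either compiler returns small aggregates.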

> By the way, why extern "C" when extern (C++) works just fine? :-)

I'm working on a code generator which uses C as the glue language,
similar to SWIG. But the plan is to support more features than SWIG
and have a faster and less memory-intensive cross-language virtual
method invocation mechanism. Unlike SWIG, I support passing PODs by
value, but passing non-POD classes by value was problematic, and I can
see now why.

Thanks for your help guys!
September 03, 2012
Re: Trouble understanding crash when class is returned by value from C++

Ah, sorry, my bad. I was testing marking D structs as addressable
(meaning they are always passed in memory) whilst in the middle of
looking at the difference between D and C++ codegen. Must have left
that turned on still in my copy of gdc. ;-)


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';