[Issue 23814] [Codegen] Calling member function of extern(C++) class with multiple inheritance doesn't preserve the EBX register in some cases
March 29, 2023
https://issues.dlang.org/show_bug.cgi?id=23814

--- Comment #1 from naydef <naydef@abv.bg> ---
Corrected the code (the previous version does not generate the bad code for the call):

extern(C++) interface BaseInterface1
{
public:
    const(char)* func1();
    const(char)* func2();
}

extern(C++) abstract class BaseInterface2
{
public:
    const(char)* func3() {return "func3";}
    const(char)* func4() {return "func4";}
}

extern(C++) class MainClass : BaseInterface2, BaseInterface1
{
    override const(char)* func1() {return "func1_overriden";}
    override const(char)* func2() {return "func2_overriden";}
    override const(char)* func3() {return "func3_overriden";}
    override const(char)* func4() {return "func4_overriden";}
}


void main()
{
    BaseInterface1 cls = new MainClass();

    import core.stdc.stdio;
    printf("We'll now call func4");

    cls.func1();
}


Assembly code

The assembly of the callee (IDA):
-----------------------------------------------------------
.text:00034790 _THUNK0         proc near               ; DATA XREF: .data:off_87304↓o
.text:00034790
.text:00034790 arg_0           = dword ptr  4
.text:00034790
.text:00034790                 sub     [esp+arg_0], 4
.text:00034795                 call    $+5
.text:0003479A
.text:0003479A loc_3479A:                              ; DATA XREF: _THUNK0+B↓o
.text:0003479A                 pop     ebx
.text:0003479B                 add     ebx, (offset _GLOBAL_OFFSET_TABLE_ - offset loc_3479A)
.text:000347A1                 jmp     _ZN9MainClass5func1Ev ; MainClass::func1(void)
.text:000347A1 _THUNK0         endp

-----------------------------------------------------------

Code of the caller:

-----------------------------------------------------------
.text:000348C5                 lea     ecx, (aWeLlNowCallFun - 86FF4h)[eax] ; "We'll now call func4"
.text:000348CB                 push    ecx
.text:000348CC                 mov     ebx, [ebp+_LOCALGOT6]
.text:000348CF                 call    _printf
.text:000348D4                 add     esp, 10h
.text:000348D7                 sub     esp, 0Ch
.text:000348DA                 push    [ebp+cls]
.text:000348DD                 mov     ebx, [ebp+_LOCALGOT6]
.text:000348E0                 mov     edx, [ebp+cls]
.text:000348E3                 mov     eax, [edx]
.text:000348E5                 call    ds:(_GLOBAL_OFFSET_TABLE_ - 86FF4h)[eax] ; Call to cls.func1
.text:000348E7                 add     esp, 10h

-----------------------------------------------------------


The issue appears with DMD 2.102.2 when compiling on Linux with the dub parameter --arch=x86.

--
March 31, 2023
https://issues.dlang.org/show_bug.cgi?id=23814

Dlang Bot <dlang-bot@dlang.rocks> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull

--- Comment #2 from Dlang Bot <dlang-bot@dlang.rocks> ---
@naydef created dlang/dmd pull request #15063 "Fix Issue 23814 - [Codegen] Calling member function of extern(C++) cl…" fixing this issue:

- Fix Issue 23814 - [Codegen] Calling member function of extern(C++) class
with...

  ... multiple inheritance doesn't preserve the EBX register in some cases

https://github.com/dlang/dmd/pull/15063

--
April 03, 2023
https://issues.dlang.org/show_bug.cgi?id=23814

RazvanN <razvan.nitu1305@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |razvan.nitu1305@gmail.com

--- Comment #3 from RazvanN <razvan.nitu1305@gmail.com> ---
What command line options are you using? I cannot reproduce this for either 32 or 64 bit.

32 bit output:

00000000 <_ZN9MainClass5func4Ev>:
   0:   55                      push   ebp
   1:   8b ec                   mov    ebp,esp
   3:   83 ec 08                sub    esp,0x8
   6:   e8 00 00 00 00          call   b <_ZN9MainClass5func4Ev+0xb>
   b:   58                      pop    eax
   c:   05 02 00 00 00          add    eax,0x2
  11:   89 45 fc                mov    DWORD PTR [ebp-0x4],eax
  14:   8b 4d fc                mov    ecx,DWORD PTR [ebp-0x4]
  17:   8d 81 40 00 00 00       lea    eax,[ecx+0x40]
  1d:   c9                      leave
  1e:   c3                      ret


Does not use ebx.

64 bit output:

0000000000000000 <_ZN9MainClass5func4Ev>:
   0:   48 8d 05 00 00 00 00    lea    rax,[rip+0x0]        # 7 <_ZN9MainClass5func4Ev+0x7>
   7:   c3                      ret

Does not use rbx.

I don't think this issue is valid.

--
April 03, 2023
https://issues.dlang.org/show_bug.cgi?id=23814

--- Comment #4 from naydef <naydef@abv.bg> ---
I'm using DMD64 D Compiler v2.102.2 on Linux. Compiling with: "dmd app.d -m32"

I see you checked func4, while my example code calls func1.
Use the second example code. If you break in a debugger where func1 is
called in _Dmain and step into the call, you'll see the EBX register being
used in a function called _THUNK0, which ends by jumping to
_ZN9MainClass5func1Ev.
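
For context on what _THUNK0 is for: with multiple inheritance, a pointer to the second base (here BaseInterface1) does not point at the start of the object, so the vtable entry reached through that pointer is a small thunk that adjusts `this` and then jumps to the real function. The C++ sketch below is illustrative only and is not DMD's generated code. It mirrors the class layout from this thread; the helper names `interface_offset` and `call_func1_via_interface` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct BaseInterface1 {
    virtual int func1() = 0;
    virtual int func2() = 0;
};

struct BaseInterface2 {
    virtual int func3() { return 3; }
    virtual int func4() { return 4; }
};

// Same base order as in the thread: BaseInterface2 first, BaseInterface1 second.
struct MainClass : BaseInterface2, BaseInterface1 {
    int func1() override { return 1; }
    int func2() override { return 2; }
};

// Byte distance from the start of the object to its BaseInterface1 subobject.
// A this-adjusting thunk has to subtract this distance before jumping to
// MainClass::func1.
std::ptrdiff_t interface_offset(MainClass& obj) {
    const char* whole = reinterpret_cast<const char*>(&obj);
    const char* iface =
        reinterpret_cast<const char*>(static_cast<BaseInterface1*>(&obj));
    assert(iface >= whole); // the subobject lies inside the complete object
    return iface - whole;
}

// Calls func1 through the interface pointer, as the caller in the thread does.
int call_func1_via_interface(MainClass& obj) {
    BaseInterface1* iface = &obj; // implicit conversion applies the offset
    return iface->func1();        // dispatch goes through the adjusting entry
}
```

With GCC or Clang targeting 32-bit x86, `interface_offset` is expected to be 4 (one vtable pointer), which would match the `sub [esp+arg_0], 4` at the start of the thunk listing; other ABIs may lay the object out differently.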

--
April 03, 2023
https://issues.dlang.org/show_bug.cgi?id=23814

--- Comment #5 from naydef <naydef@abv.bg> ---
I've made the following example (without the patch the generated executable
crashes):

app.d
--------------------------------------
extern(C++) interface BaseInterface1
{
public:
    int func1();
    int func2();
}

extern(C++) abstract class BaseInterface2
{
public:
    int func3() {return 3;}
    int func4() {return 4;}
}

extern(C++) class MainClass : BaseInterface2, BaseInterface1
{
    override int func1() {return 1;}
    override int func2() {return 2;}
}

extern(C++) void cppFunc1(BaseInterface1 obj);


void main()
{
    BaseInterface1 cls = new MainClass();
    cppFunc1(cls);
}
--------------------------------------

app2.cpp
--------------------------------------
class BaseInterface1
{
public:
    virtual int func1();
    virtual int func2();
};

class BaseInterface2
{
public:
    virtual int func3();
    virtual int func4();
};

class MainClass : BaseInterface2, BaseInterface1
{
    virtual int func1();
    virtual int func2();
};

void cppFunc1(BaseInterface1* obj)
{
    int a = obj->func1();
    int b = obj->func2();
}
--------------------------------------

The executable is generated with the following command:
gcc -m32 -O -c app2.cpp -o app2.o;dmd app.d app2.o -m32

Feel free to comment on the code.

--
June 16, 2023
https://issues.dlang.org/show_bug.cgi?id=23814

Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla@digitalmars.com

--- Comment #6 from Walter Bright <bugzilla@digitalmars.com> ---
(In reply to RazvanN from comment #3)
>    6:   e8 00 00 00 00          call   b <_ZN9MainClass5func4Ev+0xb>
>    b:   58                      pop    eax
>    c:   05 02 00 00 00          add    eax,0x2

What this code is doing (regardless of whether it is EAX or EBX) is:

1. CALL: to the next instruction. This has the effect of pushing the address of the next instruction on the stack

2. POP reg: puts that address into reg

3. ADD reg,xxxx: reg is now pointing to data that is relative to the code section, likely a virtual function or a "thunk" to one

Can you try it with D classes rather than C++ classes?
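
The three steps above can be traced with a toy model. The C++ sketch below is not an emulator and not DMD code: the `Cpu` struct and the function names are hypothetical, and the instruction lengths (5, 1 and 6 bytes) are taken from the address gaps in the _THUNK0 listing from comment #1. In that listing the added constant is `offset _GLOBAL_OFFSET_TABLE_ - offset loc_3479A`, so the register ends up holding the GOT address. The value 0x86FF4 used in the usage note is an assumption read off the `- 86FF4h` displacements in the caller listing.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal machine state: instruction pointer, one register, and a stack.
struct Cpu {
    std::uint32_t eip = 0;
    std::uint32_t ebx = 0;
    std::vector<std::uint32_t> stack;
};

// Step 1: "call $+5" pushes the address of the next instruction and
// transfers control to that same address (the CALL is 5 bytes long).
void call_next(Cpu& cpu) {
    const std::uint32_t return_address = cpu.eip + 5;
    cpu.stack.push_back(return_address);
    cpu.eip = return_address;
}

// Step 2: "pop ebx" moves the pushed address into the register (1 byte).
void pop_ebx(Cpu& cpu) {
    assert(!cpu.stack.empty());
    cpu.ebx = cpu.stack.back();
    cpu.stack.pop_back();
    cpu.eip += 1;
}

// Step 3: "add ebx, imm32" turns the code address into a data address (6 bytes).
void add_ebx(Cpu& cpu, std::uint32_t delta) {
    cpu.ebx += delta;
    cpu.eip += 6;
}

// Runs the three steps starting at the address of the CALL instruction.
Cpu run_pic_sequence(std::uint32_t call_address, std::uint32_t got_address) {
    Cpu cpu;
    cpu.eip = call_address;
    call_next(cpu);
    const std::uint32_t label = cpu.eip; // loc_3479A in the listing
    pop_ebx(cpu);
    add_ebx(cpu, got_address - label);   // the link-time constant
    return cpu;
}
```

Calling `run_pic_sequence(0x34795, 0x86FF4)` leaves `ebx` equal to 0x86FF4 and `eip` at 0x347A1, the address of the final `jmp` in the thunk. Whatever the caller had in EBX before the sequence is gone at that point.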

--
June 16, 2023
https://issues.dlang.org/show_bug.cgi?id=23814

--- Comment #7 from naydef <naydef@abv.bg> ---
Hm, I don't know how to try it with a D class. The reproduction code relies on a C++ compiler (GCC uses the EBX register right after calling the D virtual function; DMD doesn't seem to do that). Also, the content of EBX is not used in this thunk, so nothing depends on it.

I don't understand what "register clobbering" in the question refers to. I assume Walter means the correct fix is to preserve the register on thunk entry and restore it on exit. If that's what is meant, I don't know how to achieve it (I'm not familiar with DMD). Also, as the example shows, I see no code relying on the content of EBX; instead there's a JMP to a regular function, so this CALL + POP + ADD sequence seems redundant.
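
To make the failure mode concrete: in the i386 System V calling convention EBX is a callee-saved register, so a caller such as the GCC-compiled `cppFunc1` may keep a live value in it across the call. The C++ sketch below is a toy model under that assumption, not real register access and not the actual fix from the pull request. `Regs`, `thunk_clobbering`, `thunk_preserving` and `caller_value_survives` are hypothetical names, and the constants are arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Toy register file holding only the register this issue is about.
struct Regs {
    std::uint32_t ebx = 0;
};

// Models the thunk from the listing: it overwrites EBX and never restores it.
void thunk_clobbering(Regs& regs) {
    regs.ebx = 0x86FF4u; // stands in for the CALL + POP + ADD result
}

// Models a thunk that only adjusts `this` and jumps, leaving EBX untouched.
void thunk_preserving(Regs&) {}

// Models a caller that keeps a value in EBX across the virtual call and
// expects to find it there afterwards, as the calling convention allows.
bool caller_value_survives(void (*thunk)(Regs&)) {
    assert(thunk != nullptr);
    Regs regs;
    regs.ebx = 0x12345678u; // value the caller still needs after the call
    thunk(regs);
    return regs.ebx == 0x12345678u;
}
```

In this model `caller_value_survives(thunk_clobbering)` is false and `caller_value_survives(thunk_preserving)` is true, which is the difference between the crashing and the working executable described in comment #5.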

Yeah, I'd welcome a better fix...

--