July 10, 2009
[SO] question on trivial function inlining
Anyone care to add more details?

http://stackoverflow.com/questions/1109995/do-getters-and-setters-impact-performance-in-c-d-java/1110324#1110324
July 10, 2009
Re: [SO] question on trivial function inlining
Benjamin Shropshire:
> Anyone care to add more details?
> http://stackoverflow.com/questions/1109995/do-getters-and-setters-impact-performance-in-c-d-java/1110324#1110324

I think DMD is currently unable to devirtualize virtual getters and setters. Virtual calls are a bit slower by themselves, but they also prevent inlining, so the standard optimizations that would follow inlining can't be done. So if access to the attribute goes through a virtual call and this happens in a "hot" part of the code, it may slow down your code significantly. (If it happens in non-hot parts of the code it usually has no effect. That's why Java HotSpot doesn't need to optimize all your code to produce a very fast program anyway.)
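A minimal sketch of the point above, not taken from the Stack Overflow answer: in D, class methods are virtual by default, and marking a method `final` (when it does not override anything) lets the compiler call it directly, which is what makes inlining possible. The class and function names here (VirtualPoint, FinalPoint, sumVirtual, sumFinal) are made up for illustration.

```d
// Hypothetical classes, for illustration only.
class VirtualPoint {
    private int _x;
    int x() { return _x; }         // virtual by default: dispatched through the vtable
    void x(int v) { _x = v; }
}

class FinalPoint {
    private int _x;
    final int x() { return _x; }   // final: direct call, a candidate for inlining
    final void x(int v) { _x = v; }
}

// "Hot" loop: one indirect call per iteration unless the compiler devirtualizes p.x
int sumVirtual(VirtualPoint p, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += p.x;
    return total;
}

// Same loop, but the getter can be inlined down to a plain field load
int sumFinal(FinalPoint p, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += p.x;
    return total;
}
```

Both functions compute the same result; only the generated code for the call to `x` differs, and how much that matters depends on the compiler and on how hot the loop is.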

I have encouraged Frits van Bommel to improve the devirtualization capabilities of LDC:
http://www.dsource.org/projects/ldc/changeset/1506%3A76936858d1c6
Now LDC is able to do that in a few very simple situations, but most of the time the situation is unchanged compared to DMD. Eventually LLVM will improve, so this may get better by itself. But the front-end may do something about this too.
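To make "very simple situations" concrete, here is a hedged sketch of three cases. It is an assumption about where devirtualization is possible in principle, not a description of what the LDC changeset actually handles. All names (Base, Derived, Leaf, known, unknown, viaLeaf) are invented for the example.

```d
// Hypothetical hierarchy, for illustration only.
class Base {
    int value() { return 1; }
}

class Derived : Base {
    override int value() { return 2; }
}

// A final class cannot be subclassed, so the front-end already knows
// that nothing can override Leaf.value.
final class Leaf : Base {
    override int value() { return 3; }
}

// Case 1: the object is created right here, so its dynamic type is
// provably Derived; an optimizer can replace the vtable lookup with a
// direct call (and then inline it).
int known() {
    auto d = new Derived;
    return d.value;
}

// Case 2: b may be a Base or any subclass; without whole-program
// knowledge the call has to stay indirect.
int unknown(Base b) {
    return b.value;
}

// Case 3: the static type is a final class, so the call can be bound
// statically without any data-flow analysis.
int viaLeaf(Leaf l) {
    return l.value;
}
```

Case 3 is the kind of thing a front-end can decide from the declarations alone, while case 1 needs the optimizer to track where the object came from.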

Some documentation about this topic. Something older:
http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=B26C4304DB1DA05ECBD67CA7D9313511?doi=10.1.1.7.7766&rep=rep1&type=pdf

Something more recent:
http://ols.fedoraproject.org/GCC/Reprints-2006/namolaru-reprint.pdf

Bye,
bearophile
July 10, 2009
Re: [SO] question on trivial function inlining
Reply to bearophile,

> Benjamin Shropshire:
> 
>> Anyone care to add more details?
>> http://stackoverflow.com/questions/1109995/do-getters-and-setters-impact-performance-in-c-d-java/1110324#1110324
[...]
> Bye,
> bearophile

Mind if I echo that back to SO?
July 11, 2009
Re: [SO] question on trivial function inlining
BCS:
> Mind if I echo that back to SO?

OK, if you want you can copy it on the StackOverflow site (even if it's bad advertising for the D language, and even if there are some small grammar errors in what I have written; I wrote it quickly, and English isn't my first language) :-)

But I'd also like to show that answer of mine to D developers, because (an improvement in) the front-end may help in devirtualizing some class methods.

Bye,
bearophile
July 11, 2009
Re: [SO] question on trivial function inlining
Reply to bearophile,

> BCS:
> 
> OK, if you want you can copy it on the StackOverflow

done there

> But I'd also like to show that answer of mine to D developers,

done here

:)
July 11, 2009
Re: [SO] question on trivial function inlining
BCS:
> > But I'd also like to show that answer of mine to D developers,
> done here
> :)

Do D devs take a look at this D.learn newsgroup once in a while? :-)

Anyway, just to be sure, I have done four more little tests that show an ideal situation. Real-world situations are probably worse.

version(Tango) import tango.stdc.stdio: printf;

class Test1 {
    int x;
}

class Test2 {
    int _x;
    int x() { return this._x; }
    void x(int xx) { this._x = xx; }
}

void main() {
   auto t = new Test1;
   t.x = 4;
   printf("%d\n", t.x);
}

/*
DMD ASM:

_D16getters_setters15Test21xMFZi    comdat
       mov EAX,8[EAX]
       ret

_D16getters_setters15Test21xMFiZv   comdat
       mov ECX,4[ESP]
       mov 8[EAX],ECX
       ret 4

main:
       push    EAX
       mov EAX,offset FLAT:_D16getters_setters15Test17__ClassZ
       push    EAX
       call    near ptr __d_newclass
       mov ECX,offset FLAT:_D16getters_setters15Test26__vtblZ[020h]
       mov dword ptr 8[EAX],4
       push    dword ptr 8[EAX]
       push    ECX
       call    near ptr _printf
       add ESP,0Ch
       xor EAX,EAX
       pop ECX
       ret

--------------------------

LDC ASM:

_D7getset15Test21xMFZi:
   movl    8(%eax), %eax
   ret

_D7getset15Test21xMFiZv:
   movl    4(%esp), %ecx
   movl    %ecx, 8(%eax)
   ret $4

main:
   subl    $12, %esp
   movl    $4, 4(%esp)
   movl    $.str2, (%esp)
   call    printf
   xorl    %eax, %eax
   addl    $12, %esp
   ret $8
*/

Here the LDC code is almost optimal, thanks to Frits van Bommel and others.

==========================

Now with a (virtual) getter and setter:

version(Tango) import tango.stdc.stdio: printf;

class Test1 {
    int x;
}

class Test2 {
    int _x;
    int x() { return this._x; }
    void x(int xx) { this._x = xx; }
}

void main() {
   auto t = new Test2;
   t.x = 4;
   printf("%d\n", t.x);
}

/*
DMD ASM:

_D16getters_setters25Test21xMFZi    comdat
       mov EAX,8[EAX]
       ret

_D16getters_setters25Test21xMFiZv   comdat
       mov ECX,4[ESP]
       mov 8[EAX],ECX
       ret 4

main:
       push    EAX
       mov EAX,offset FLAT:_D16getters_setters25Test27__ClassZ
       push    EBX
       push    ESI
       push    4
       push    EAX
       call    near ptr __d_newclass
       add ESP,4
       mov ECX,[EAX]
       mov EBX,EAX
       call    dword ptr 01Ch[ECX]
       mov EDX,[EBX]
       mov EAX,EBX
       call    dword ptr 018h[EDX]
       mov ESI,offset FLAT:_D16getters_setters25Test26__vtblZ[020h]
       push    EAX
       push    ESI
       call    near ptr _printf
       add ESP,8
       xor EAX,EAX
       pop ESI
       pop EBX
       pop ECX
       ret

---------------------

LDC ASM:

_D7getset25Test21xMFZi:
   movl    8(%eax), %eax
   ret

_D7getset25Test21xMFiZv:
   movl    4(%esp), %ecx
   movl    %ecx, 8(%eax)
   ret $4

main:
   subl    $12, %esp
   movl    $_D7getset25Test27__ClassZ, (%esp)
   call    _d_allocclass
   movl    $_D7getset25Test26__vtblZ, (%eax)
   movl    $0, 4(%eax)
   movl    $4, 8(%eax)
   movl    $4, 4(%esp)
   movl    $.str2, (%esp)
   call    printf
   xorl    %eax, %eax
   addl    $12, %esp
   ret $8
*/

I like LDC and LDC developers (and LLVM) :-)

If 't' is defined as a scoped class, then LDC produces this main:

main:
   subl    $20, %esp
   movl    $_D8getset2b5Test26__vtblZ, 8(%esp)
   movl    $0, 12(%esp)
   movl    $4, 16(%esp)
   movl    $4, 4(%esp)
   movl    $.str2, (%esp)
   call    printf
   leal    8(%esp), %eax
   movl    %eax, (%esp)
   call    _d_callfinalizer
   xorl    %eax, %eax
   addl    $20, %esp
   ret $8

That's better. Recently I have also asked ChristianK to improve this some more, by removing part of the work done by that call to _d_callfinalizer and leaving only the call to the monitor management:
http://www.dsource.org/projects/ldc/ticket/339
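The source for the scoped variant isn't shown above. A minimal sketch, assuming "scoped class" means declaring the local variable with the `scope` storage class so that the instance lives in the stack frame and is finalized at scope exit (the helper name scopedRoundTrip is made up):

```d
class Test2 {
    int _x;
    int x() { return this._x; }
    void x(int xx) { this._x = xx; }
}

// Hypothetical helper: same setter/getter round trip as in main above,
// but with a scope-allocated instance instead of a heap-allocated one.
int scopedRoundTrip(int v) {
    scope t = new Test2;   // no heap allocation; destroyed when the function returns
    t.x = v;               // setter
    return t.x;            // getter
}
```

This would match the shape of the asm above: no call to _d_allocclass, the vtable pointer and fields stored relative to %esp, and a finalizer call before returning.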

Bye,
bearophile