February 13, 2013
Inexplicable LDC vs GDC speed difference
Following some code posted in d.learn, I've observed a bizarre and (to me) 
inexplicable difference in code speed depending on whether LDC or GDC is used as 
the compiler:
http://forum.dlang.org/post/mailman.1239.1360764028.22503.digitalmars-d-learn@puremagic.com

I'm using latest-from-GitHub versions of both compilers, compiled as release 
versions.

Anyone have any idea what could be the source of the speed difference?
February 14, 2013
Re: Inexplicable LDC vs GDC speed difference
On Wednesday, 13 February 2013 at 15:07:04 UTC, Joseph Rushton 
Wakeling wrote:
> Following some code posted in d.learn, I've observed a bizarre 
> and (to me) inexplicable difference in code speed depending on 
> whether LDC or GDC is used as the compiler:
> http://forum.dlang.org/post/mailman.1239.1360764028.22503.digitalmars-d-learn@puremagic.com
>
> I'm using latest-from-GitHub versions of both compilers, 
> compiled as release versions.
>
> Anyone have any idea what could be the source of the speed 
> difference?

The speed difference is partly caused by the fact that GDC 
doesn't inline juliaFunction and squarePlusMag. I have a build of 
GDC with the always_inline attribute enabled (I just copied a few 
lines of code from some old version of GDC to d-builtins.c of a 
recent version), so I tried adding pragma(attribute, 
always_inline) to those functions. It seems that GDC is unable to 
inline them for some reason.

When I added always_inline to juliaFunction, I got this error:

error: inlining failed in call to always_inline 
'main.Julia!(float).juliaFunction': function body can be 
overwritten at link time

Here's reduced code that gives the same error when always_inline 
is added to bar:

int bar()(int x)
{
    if (x)
        return 0;

    return 1;
}

int foo(int a)
{
    return bar(a);
}

bar can be inlined if I remove the first pair of parentheses (so 
that it isn't a template).
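
To make the distinction concrete, here is a minimal sketch of the two variants side by side. The names barTemplate and barPlain are hypothetical (they are not in the original code), and the sketch only checks that the empty parameter list really does turn the function into a template; it does not try to reproduce the inlining failure itself:

```d
// Template variant: the empty first parameter list makes this a
// function template, instantiated by the code that calls it.
int barTemplate()(int x)
{
    if (x)
        return 0;

    return 1;
}

// Plain variant: same body, without the first pair of parentheses.
int barPlain(int x)
{
    if (x)
        return 0;

    return 1;
}

// Compile-time check of which of the two is a template.
static assert(__traits(isTemplate, barTemplate));
static assert(!__traits(isTemplate, barPlain));

int foo(int a)
{
    // Both calls look identical at the call site.
    return barTemplate(a) + barPlain(a);
}

void main()
{
    assert(foo(0) == 2);
    assert(foo(5) == 0);
}
```

Both versions behave the same at run time; the only difference is how the compiler emits them, which is presumably where the "can be overwritten at link time" complaint comes from.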



When I add always_inline to squarePlusMag I get:

error: inlining failed in call to always_inline 
'main.Julia!(float).ComplexStruct.squarePlusMag': mismatched 
arguments

Reduced code that gives the same error when always_inline is 
added to bar:

struct S
{
    int bar(const S s)
    {
        return 0;
    }
}

int foo()
{
    S s;
    return s.bar(s);
}

bar can be inlined if I remove const.
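
For reference, a minimal sketch of the variant without const, i.e. the form that inlines in my build. This is just the reduced example above with the qualifier dropped; whether other GDC versions behave the same way is an assumption:

```d
struct S
{
    // Parameter taken as plain S rather than const S.
    int bar(S s)
    {
        return 0;
    }
}

int foo()
{
    S s;
    return s.bar(s);
}

void main()
{
    assert(foo() == 0);
}
```

Dropping const changes nothing about what the code computes here, so it can serve as a workaround until the compiler handles the const case.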

I have compiled all the samples with -c -O3 -finline-functions 
-frelease.
February 14, 2013
Re: Inexplicable LDC vs GDC speed difference
On Thu, 14 Feb 2013 01:18:21 +0100,
"jerro" <a@a.com> wrote:

> 
> When I added always_inline to juliaFunction, I got this error:
> 
> error: inlining failed in call to always_inline 
> 'main.Julia!(float).juliaFunction': function body can be 
> overwritten at link time
> 
> Here's reduced code that gives the same error when always_inline 
> is added to bar:
> 
> int bar()(int x)
> {
>      if (x)
>          return 0;
> 
>      return 1;
> }
> 
> int foo(int a)
> {
>      return bar(a);
> }
> 
> bar can be inlined if I remove the first pair of parentheses (so 
> that it isn't a template).
> 

I'll have a look at this soon. I already have an idea what could be
wrong.

> 
> When I add always_inline to squarePlusMag I get:
> 
> error: inlining failed in call to always_inline 
> 'main.Julia!(float).ComplexStruct.squarePlusMag': mismatched 
> arguments
> 
> Reduced code that gives the same error when always_inline is 
> added to bar:
> 
> struct S
> {
>      int bar(const S s)
>      {
>          return 0;
>      }
> }
> 
> int foo()
> {
>      S s;
>      return s.bar(s);
> }
> 
> bar can be inlined if I remove const.
> 
> I have compiled all the samples with -c -O3 -finline-functions 
> -frelease.
February 19, 2013
Re: Inexplicable LDC vs GDC speed difference
On Thu, 14 Feb 2013 01:18:21 +0100,
"jerro" <a@a.com> wrote:

> On Wednesday, 13 February 2013 at 15:07:04 UTC, Joseph Rushton 
> Wakeling wrote:
> 
> Here's reduced code that gives the same error when always_inline 
> is added to bar:
> 
> int bar()(int x)
> {
>      if (x)
>          return 0;
> 
>      return 1;
> }
> 
> int foo(int a)
> {
>      return bar(a);
> }
> 
> bar can be inlined if I remove the first pair of parentheses (so 
> that it isn't a template).

https://github.com/D-Programming-GDC/GDC/pull/50


> 
> struct S
> {
>      int bar(const S s)
>      {
>          return 0;
>      }
> }
> 
> int foo()
> {
>      S s;
>      return s.bar(s);
> }
> 
> bar can be inlined if I remove const.
> 
> I have compiled all the samples with -c -O3 -finline-functions 
> -frelease.

I posted this to our bugzilla, but I'm not sure if I'll have the time to
look at this one.

http://gdcproject.org/bugzilla/show_bug.cgi?id=37
February 22, 2013
Re: Inexplicable LDC vs GDC speed difference
On 02/19/2013 09:34 AM, Johannes Pfau wrote:
> I posted this to our bugzilla, I'm not sure if I'll have the time to
> look at this one.
>
> http://gdcproject.org/bugzilla/show_bug.cgi?id=37

Just to note: with your pull request #50 now included in GDC, things speed up 
very slightly.  With the const also removed from the Julia value code, results 
become comparable to those of the C++ implementation compiled with g++ (i.e. 
about 4.3 s for the 'double' case).

LDC still produces a faster executable for this particular code, but then, 
clang++ also produces a faster executable than g++.
February 22, 2013
Re: Inexplicable LDC vs GDC speed difference
On 22 February 2013 18:59, Joseph Rushton Wakeling <
joseph.wakeling@webdrake.net> wrote:

> On 02/19/2013 09:34 AM, Johannes Pfau wrote:
>
>> I posted this to our bugzilla, I'm not sure if I'll have the time to
>> look at this one.
>>
>> http://gdcproject.org/bugzilla/show_bug.cgi?id=37
>>
>
> Just to note -- with your pull request #50 now included in GDC, things
> speed up very slightly.  Removing the const from the Julia value code,
> results become comparable to g++ and the C++ implementation (i.e. about 4.3
> s for the 'double' case).
>
> LDC still produces a faster executable for this particular code, but then,
> clang++ also produces a faster executable than g++.
>


Cool, cheers for checking.


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';