Inline Functions
February 24, 2009
Hello,

I'm looking for ways to optimize Blaze, the D port of Box2D, and running into some frustrations.  In fact, the Java port of the same library (http://www.jbox2d.org/v2demos/) is currently running circles around Blaze, performance-wise...

I have a sneaking suspicion that this is the result of the many thousands of vector math operations that are performed each cycle during my stress test.

Is there a way to force inline function calls?  I'm compiling my code with '-release -O -inline', but this seems not to have much of an effect on performance.  When I remove -inline there doesn't seem to be much of a difference in execution speed.

FYI, I'm using DMD v1.035 on Windows x32.

Thanks,
Mason
February 24, 2009
Mason Green:

>I'm looking for ways to optimize Blaze, the D port of Box2D, and running into some frustrations.  In fact, the same Java port (http://www.jbox2d.org/v2demos/) is currently running circles around Blaze, performance wise....<

In the beginning the Java code used to run very slowly, but today Java (and C# on dotnet) is getting closer to well compiled C++ code (and it's much simpler to write than C++).

A JavaVM like HotSpot is more refined than the backend of DMD, its GC is much more refined and more efficient, it's much better in inlining virtual methods, its data structures are usually better performance-tuned, etc. The D language is newer than Java, and it has enjoyed far less money, developers and users.

Have you profiled your D code? What has the profiling told you? Have you seen where you allocate memory, to move such allocations away from inner loops, or just reduce their number?
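
To illustrate the point about moving allocations out of inner loops, here is a minimal sketch. It is not taken from Blaze; `sumSquares` and the scratch buffer are hypothetical names, and the code assumes a current D2 compiler. The buffer is allocated once by the caller and reused on every step, so the loop itself never touches the GC.

```d
import std.stdio;

// Hypothetical helper: writes the squares into a caller-supplied scratch
// buffer and returns their sum. It performs no allocation of its own.
float sumSquares(const(float)[] input, float[] scratch)
{
    assert(scratch.length >= input.length);
    float total = 0;
    foreach (i, v; input)
    {
        scratch[i] = v * v;
        total += scratch[i];
    }
    return total;
}

void main()
{
    float[] input = [1.0f, 2.0f, 3.0f];
    auto scratch = new float[input.length]; // allocated once, outside the loop
    float total = 0;
    foreach (step; 0 .. 1000)
        total += sumSquares(input, scratch); // no allocation per step
    writeln(total);
}
```

The alternative, calling `new float[...]` inside the loop body, allocates on every step; that is the pattern to look for in the profile.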

Bye,
bearophile
February 24, 2009
bearophile:

Thanks for the reply.

> A JavaVM like HotSpot is more refined than the backend of DMD, its GC is much more refined and more efficient, it's much better in inlining virtual methods, its data structures are usually better performance-tuned, etc. The D language is newer than Java, and it has enjoyed far less money, developers and users.>

Very well put! But, do you know if there is a way to force inlining where I want it?  Someone mentioned to me that template mixins may work...?  I would rather not inline all the code by hand, as I would like to trust the compiler.

> Have you profiled your D code? What has the profiling told you? Have you seen where you allocate memory, to move such allocations away from inner loops, or just reduce their number? >

No, I have not profiled the D code other than using an FPS counter... :-) To be honest, I'm fairly light on experience when it comes to profiling. Do you have any suggestions on how to make it happen?

Bye,
Mason

February 24, 2009
On Tue, 24 Feb 2009 22:08:26 +0300, Mason Green <mason.green@gmail.com> wrote:

> bearophile:
>
> Thanks for the reply.
>
>> A JavaVM like HotSpot is more refined than the backend of DMD, its GC is much more refined and more efficient, it's much better in inlining virtual methods, its data structures are usually better performance-tuned, etc. The D language is newer than Java, and it has enjoyed far less money, developers and users.>
>
> Very well put! But, do you know if there is a way to force inlining where I want it?  Someone mentioned to me that template mixins may work...?  I would rather not inline all the code by hand, as I would like to trust the compiler.
>
>> Have you profiled your D code? What has the profiling told you? Have you seen where you allocate memory, to move such allocations away from inner loops, or just reduce their number? >
>
> No, I have not profiled the D code other than using an FPS counter... :-) To be honest, I'm fairly light on experience when it comes to profiling. Do you have any suggestions on how to make it happen?
>
> Bye,
> Mason
>

DMD has profiling built-in. Just recompile your code with -profile flag, run once and analyze output.
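
As a rough sketch of that workflow (the file name `app.d`, the `Vec2` struct and its functions are stand-ins, not Blaze code, and a current D2 compiler is assumed), the comments at the top show the build step and the file to look at afterwards:

```d
// Build and run with profiling (file name is an assumption):
//   dmd -profile app.d
//   app
// After the program exits, DMD's profiler writes trace.log with
// per-function call counts and timings.

struct Vec2
{
    float x, y;
}

// Small vector helpers of the kind that dominate a physics step.
Vec2 add(Vec2 a, Vec2 b)
{
    return Vec2(a.x + b.x, a.y + b.y);
}

float dot(Vec2 a, Vec2 b)
{
    return a.x * b.x + a.y * b.y;
}

int main()
{
    auto acc = Vec2(0, 0);
    auto step = Vec2(1, 2);
    float sum = 0;
    foreach (i; 0 .. 100_000)
    {
        acc = add(acc, step);
        sum += dot(acc, step);
    }
    // Use the result so the loop is not trivially dead code.
    return sum > 0 ? 0 : 1;
}
```

If `add` and `dot` show up in trace.log with call counts matching the loop, they were called rather than inlined in that build.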
February 24, 2009
Mason Green wrote:

> bearophile:
> 
> Thanks for the reply.
> 
>> A JavaVM like HotSpot is more refined than the backend of DMD, its GC is much more refined and more efficient, it's much better in inlining virtual methods, its data structures are usually better performance-tuned, etc. The D language is newer than Java, and it has enjoyed far less money, developers and users.>
> 
> Very well put! But, do you know if there is a way to force inlining where I want it?  Someone mentioned to me that template mixins may work...?  I would rather not inline all the code by hand, as I would like to trust the compiler.

You could use mixins, but that won't lead to pretty code. It's useful to
know which kinds of code can get inlined by dmd. I don't have much knowledge
of this, but the most common things that won't get inlined are loops,
delegates and virtual functions iirc.

>> Have you profiled your D code? What has the profiling told you? Have you seen where you allocate memory, to move such allocations away from inner loops, or just reduce their number? >
> 
> No, I have not profiled the D code other than using an FPS counter... :-) To be honest, I'm fairly light on experience when it comes to profiling. Do you have any suggestions on how to make it happen?

dmd's built-in profiler can be useful. Some time ago I wrote a small utility to help make its output more readable: http://www.dsource.org/projects/scrapple/wiki/PtraceUtility



February 24, 2009
I seem to remember from a previous discussion about optimizing a ray-tracer that DMD will not inline functions that take reference parameters. Can anyone else confirm this?

--bb
February 24, 2009
== Quote from Bill Baxter (wbaxter@gmail.com)'s article
> I seem to remember from a previous discussion about  optimizing a
> ray-tracer that DMD will not inline functions that take reference
> parameters.   Can anyone else confirm this?
> --bb

Here's a test program I wrote and the relevant parts of the disassembly.  It was compiled w/ -O -inline -release.  I think you're right, strange as it seems.  I wonder why ref is never inlined.

void main() {
    uint foo;
    inc(foo);
}

void inc(ref uint num) {
    num++;
}

__Dmain PROC NEAR
;  COMDEF __Dmain
        push    eax
        lea     eax, [esp]
        mov     dword ptr [esp], 0
        call    _D4test3incFKkZv
        xor     eax, eax
        pop     ecx
        ret
__Dmain ENDP

_text$__Dmain ENDS

_text$_D4test3incFKkZv SEGMENT DWORD PUBLIC 'CODE'

_D4test3incFKkZv PROC NEAR
;  COMDEF _D4test3incFKkZv
        inc     dword ptr [eax]
        ret
_D4test3incFKkZv ENDP
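
For comparison, a by-value variant of the same test can be built with the same flags and disassembled the same way. This is only a sketch: `incByValue` and `incByRef` are names made up here, and whether DMD actually inlines the by-value call is an assumption to be checked in the listing, not a result. Returning `foo` from `main` keeps an optimizer from discarding the whole computation.

```d
// By-value version: takes a copy and returns the incremented result.
uint incByValue(uint num)
{
    return num + 1;
}

// By-ref version, equivalent to inc() in the test above.
void incByRef(ref uint num)
{
    num++;
}

int main()
{
    uint foo;
    incByRef(foo);         // call through a ref parameter
    foo = incByValue(foo); // same effect without a ref parameter
    return cast(int) foo;  // result is used, so the calls cannot be dropped
}
```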
February 24, 2009
dsimcha:

>I think you're right, strange as it seems.  I wonder why ref is never inlined.<

Do you want something like a forced_inline attribute in D? :-)

Bye,
bearophile
February 24, 2009
== Quote from bearophile (bearophileHUGS@lycos.com)'s article
> dsimcha:
> >I think you're right, strange as it seems.  I wonder why ref is never inlined.<
> Do you want something like a forced_inline attribute in D? :-)
> Bye,
> bearophile

No, actually, I like the idea of leaving these small micro-optimizations to the compiler.  It's just that I can't figure out what's special about functions that take ref parameters.  Maybe there is a good reason for this behavior.  I don't know.  It's just that if there is a good reason, I can't think of it.

Also, if you really, really, _really_ want to force a function to be inlined, you can probably simulate this with templates or mixins or something.  IMHO wanting to absolutely insist that something be inlined is too much of an edge case to have pretty syntax and special language constructs for.
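
A minimal sketch of the string-mixin approach, assuming a current D2 compiler (the `Vec2` struct and the `addInPlace` helper are hypothetical): the helper runs at compile time and returns source text, which `mixin` pastes at the use site, so no function call exists at run time.

```d
struct Vec2
{
    float x, y;
}

// Evaluated at compile time; returns D source that adds b to a, field by field.
// a and b are the *names* of variables visible at the mixin site.
string addInPlace(string a, string b)
{
    return a ~ ".x += " ~ b ~ ".x; " ~ a ~ ".y += " ~ b ~ ".y;";
}

void main()
{
    auto p = Vec2(1.0f, 2.0f);
    auto v = Vec2(0.5f, 0.25f);

    // Expands to: p.x += v.x; p.y += v.y;
    mixin(addInPlace("p", "v"));

    assert(p.x == 1.5f && p.y == 2.25f);
}
```

The cost is readability: the operands are passed as strings, and errors inside the generated code are reported at the mixin site.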
February 25, 2009
Both LDC and GDC inline the function. (LDC actually reduces your code to nothing, so I had to change it a bit to see if the call was really inlined.)