D slower than C++ by a factor of _two_ for simple raytracer (gdc)
February 15, 2008
My platform is GDC 4.1.2 vs G++ 4.1.1.

I played around with the simple ray tracer code I'd ported to D a while back, still being dissatisfied with the timings of 21s (D) vs 16s (C++).

During this, I found a nice optimization that brought my D code down to 17s, within less than a second of C++!

"Glee" I thought!

Then I applied the same optimization to the C++ source and it dropped to 8s.

I haven't been able to get the D code even close to this new speed level.

The outputs of both programs are identical save for off-by-one differences.

The source code for the C++ version is http://paste.dprogramming.com/dpvpm7jv

D version is http://paste.dprogramming.com/dpzal0jd

Before you ask, yes I've tried turning the structs into classes, the classes into structs and the refs into pointers. That usually made no difference, or worsened it.

Both programs were built with -O3 -ffast-math, the D version additionally with -frelease.
Both compilers were built with roughly similar configure flags. The GDC used is the latest available in SVN, and based on DMD 1.022.

Does anybody know how to bring the D results in line with, or at least closer to, the C++ version?

Ideas appreciated,

 --downs
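
For anyone who wants to check an "identical save for off-by-one differences" claim on their own renders, here is a minimal C++ sketch. It assumes both programs' pixel data has already been loaded into byte buffers, one grey level per pixel; the names `max_abs_diff` and `same_within` are made up for this example and do not come from the pasted sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Largest per-pixel difference between two greyscale images of equal size.
// Assumption: one byte per pixel; how the buffers get loaded is left out.
int max_abs_diff(const std::vector<unsigned char>& a,
                 const std::vector<unsigned char>& b) {
    assert(a.size() == b.size());  // both renders must have the same dimensions
    int worst = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (d > worst) worst = d;
    }
    return worst;
}

// True when the two renders agree to within `tolerance` grey levels per pixel.
bool same_within(const std::vector<unsigned char>& a,
                 const std::vector<unsigned char>& b, int tolerance = 1) {
    return a.size() == b.size() && max_abs_diff(a, b) <= tolerance;
}
```

With a tolerance of 1 this accepts rounding differences between the two compilers while still flagging any real divergence in the rendered image.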
February 15, 2008
downs Wrote:

> My platform is GDC 4.1.2 vs G++ 4.1.1.
> 
> I played around with the simple ray tracer code I'd ported to D a while back, still being dissatisfied with the timings of 21s (D) vs 16s (C++).
> 
> During this, I found a nice optimization that brought my D code down to 17s, within less than a second of C++!
> 
> "Glee" I thought!
> 
> Then I applied the same optimization to the C++ source and it dropped to 8s.
> 
> I haven't been able to get the D code even close to this new speed level.
> 
> The outputs of both programs are identical save for off-by-one differences.
> 
> The source code for the C++ version is http://paste.dprogramming.com/dpvpm7jv
> 
> D version is http://paste.dprogramming.com/dpzal0jd
> 
> Before you ask, yes I've tried turning the structs into classes, the classes into structs and the refs into pointers. That usually made no difference, or worsened it.
> 
> Both programs were built with -O3 -ffast-math, the D version additionally with -frelease.
> Both compilers were built with roughly similar configure flags. The GDC used is the latest available in SVN, and based on DMD 1.022.
> 
> Does anybody know how to bring the D results in line with, or at least closer to, the C++ version?
> 
> Ideas appreciated,
> 
>  --downs

Don't have a GC, or statically load all of Phobos just to do simple raytracing?
February 15, 2008
Well, I'm on Windows, but comparing DMC and DMD on that code, DMD is slightly faster.  I know that gdc isn't really optimizing everything yet....

That said, cl (v15) beats both dmc and dmd, taking about 60% of the time, but this has less to do with the language itself.

I wonder how gcc and dmd compare here...

-[Unknown]


downs wrote:
> My platform is GDC 4.1.2 vs G++ 4.1.1.
> 
> I played around with the simple ray tracer code I'd ported to D a while back, still being dissatisfied with the timings of 21s (D) vs 16s (C++).
> 
> During this, I found a nice optimization that brought my D code down to 17s, within less than a second of C++!
> 
> "Glee" I thought!
> 
> Then I applied the same optimization to the C++ source and it dropped to 8s.
> 
> I haven't been able to get the D code even close to this new speed level.
> 
> The outputs of both programs are identical save for off-by-one differences.
> 
> The source code for the C++ version is http://paste.dprogramming.com/dpvpm7jv
> 
> D version is http://paste.dprogramming.com/dpzal0jd
> 
> Before you ask, yes I've tried turning the structs into classes, the classes into structs and the refs into pointers. That usually made no difference, or worsened it.
> 
> Both programs were built with -O3 -ffast-math, the D version additionally with -frelease.
> Both compilers were built with roughly similar configure flags. The GDC used is the latest available in SVN, and based on DMD 1.022.
> 
> Does anybody know how to bring the D results in line with, or at least closer to, the C++ version?
> 
> Ideas appreciated,
> 
>  --downs
February 15, 2008
downs:
> My platform is GDC 4.1.2 vs G++ 4.1.1.

DMD doesn't optimize much for speed, and programs compiled with GDC aren't that far from DMD ones, I don't know why. I'd like GDC to emit C++ code (later to be compiled by GCC) so I can see the spots where it emits slow-looking C++ code.
DMD isn't much good at inlining, etc, so probably your methods are all function calls, struct methods too.

If you translate your D raytracer to Java6 with HotSpot you will probably find that your D code is 20-50% slower than the Java one, despite the Java one being a bit higher level :-) (Thanks to HotSpot and the GC).

If you can stand the ugliness, you can probably reduce your running time by 10-15% using my TinyVector structs instead of your Vec struct. You can find them in my D libs (V.2.70 at the moment, their development is going well): http://www.fantascienza.net/leonardo/so/libs_d.zip . TinyVector is the result of extensive testing of mine. You will probably need 10-20 minutes to adapt your raytracer to TinyVector, but it's not too difficult. The result will be ugly...

Bye,
bearophile
February 15, 2008
bearophile>If you can stand the ugliness, you can probably reduce your running time by 10-15% using my TinyVector structs instead of your Vec struct,<

Note that I expect such a speedup on DMD, where I developed them. I don't know what the outcome is on GDC (which you are using).

Bye,
bearophile
February 15, 2008
bearophile wrote:
> downs:
>> My platform is GDC 4.1.2 vs G++ 4.1.1.
> 
> DMD doesn't optimize much for speed, and programs compiled with GDC aren't that far from DMD ones, I don't know why. I'd like GDC to emit C++ code (later to be compiled by GCC) so I can see the spots where it emits slow-looking C++ code.

In my experience GDC code is faster than DMD code (in some cases significantly faster).

> DMD isn't much good at inlining, etc, so probably your methods are all function calls, struct methods too.
> 
> If you translate your D raytracer to Java6 with HotSpot you will probably find that your D code is 20-50% slower than the Java one, despite the Java one being a bit higher level :-) (Thanks to HotSpot and the GC).
> 
> If you can stand the ugliness, you can probably reduce your running time by 10-15% using my TinyVector structs instead of your Vec struct. You can find them in my D libs (V.2.70 at the moment, their development is going well): http://www.fantascienza.net/leonardo/so/libs_d.zip . TinyVector is the result of extensive testing of mine. You will probably need 10-20 minutes to adapt your raytracer to TinyVector, but it's not too difficult. The result will be ugly...
> 
> Bye,
> bearophile
February 15, 2008
Marius Muja Wrote:
> In my experience GDC code is faster than DMD code (in some cases significantly faster).

My experience is similar to the results you can see here, that is, about the same on average, better for some things, worse for others:
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=all&lang=gdc
(I was using GDC based on MinGW based on GCC 3.2. You can find a good newer MinGW here: http://nuwen.net/mingw.html but I don't know if it works with GDC).

Note for downs: have you tried -fprofile-generate/-fprofile-use flags for the C++ code? They improve the C++ raytracer speed some.

Bye,
bearophile
February 15, 2008
bearophile wrote:
> downs:
>> My platform is GDC 4.1.2 vs G++ 4.1.1.
> 
> DMD doesn't optimize much for speed, and programs compiled with GDC aren't that far from DMD ones, I don't know why. I'd like GDC to emit C++ code (later to be compiled by GCC) so I can see the spots where it emits slow-looking C++ code.
> DMD isn't much good at inlining, etc, so probably your methods are all function calls, struct methods too.
> 
> If you translate your D raytracer to Java6 with HotSpot you will probably find that your D code is 20-50% slower than the Java one, despite the Java one being a bit higher level :-) (Thanks to HotSpot and the GC).
> 
> If you can stand the ugliness, you can probably reduce your running time by 10-15% using my TinyVector structs instead of your Vec struct. You can find them in my D libs (V.2.70 at the moment, their development is going well): http://www.fantascienza.net/leonardo/so/libs_d.zip . TinyVector is the result of extensive testing of mine. You will probably need 10-20 minutes to adapt your raytracer to TinyVector, but it's not too difficult. The result will be ugly...
> 
> Bye,
> bearophile

The weird thing is:
even if I inline the one spot where gdc ignores its opportunity to inline a function, so that I have the _same_ call-counts as G++ (as measured with -g -pg), even then, the D code is slower.
So it doesn't depend on missing inlining opportunities. Or am I missing something?

 --downs

PS: for reference, the missing bit is GDC not always inlining Sphere::ray_sphere. If you look, it's only ever called for cases where the final type is obvious.
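
A rough C++ sketch of the situation described in the PS, for readers without the pasted sources at hand. Only `Sphere`, `Vec` and `ray_sphere` are names mentioned in this thread; the `Scene` base, the `nearest_hit` method and the body of `ray_sphere` (a textbook ray/sphere test that assumes a unit-length direction) are assumptions made for illustration, not the code from the paste.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

struct Vec {
    double x, y, z;
};

inline Vec operator-(const Vec& a, const Vec& b) {
    return Vec{a.x - b.x, a.y - b.y, a.z - b.z};
}

inline double dot(const Vec& a, const Vec& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical scene interface; the real one lives in the pasted source.
struct Scene {
    virtual ~Scene() {}
    virtual double nearest_hit(const Vec& orig, const Vec& dir) const = 0;
};

struct Sphere : Scene {
    Vec center;
    double radius;

    Sphere(const Vec& c, double r) : center(c), radius(r) { assert(r > 0.0); }

    // Non-virtual helper: distance along `dir` (assumed unit length) to the
    // first intersection in front of the origin, or +infinity on a miss.
    double ray_sphere(const Vec& orig, const Vec& dir) const {
        const double inf = std::numeric_limits<double>::infinity();
        Vec v = center - orig;
        double b = dot(v, dir);
        double disc = b * b - dot(v, v) + radius * radius;
        if (disc < 0.0) return inf;   // ray misses the sphere entirely
        double d = std::sqrt(disc);
        double t2 = b + d;
        if (t2 < 0.0) return inf;     // sphere lies behind the ray origin
        double t1 = b - d;
        return t1 > 0.0 ? t1 : t2;    // nearest hit in front of the origin
    }

    // Virtual entry point. Inside this body the receiver is statically a
    // Sphere and ray_sphere is non-virtual, so the call needs no dispatch
    // and the compiler is free to inline it.
    double nearest_hit(const Vec& orig, const Vec& dir) const {
        return ray_sphere(orig, dir);
    }
};
```

In C++ a member function is non-virtual unless declared otherwise, so the inner call is an ordinary direct call. In D, class methods are virtual by default, which may be one reason the D compiler is more hesitant here; marking the method `final` in the D version would be the closest equivalent, if `Sphere` is a class there.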
February 15, 2008
bearophile wrote:
> Note for downs: have you tried -fprofile-generate/-fprofile-use flags for the C++ code? They improve the C++ raytracer speed some.
> 
> Bye,
> bearophile

My point is not in making GDC's crushing defeat even crushinger :)

But thanks for the advice, anyway.

 --downs
February 15, 2008
downs wrote:
> bearophile wrote:
>>
>> If you can stand the ugliness, you can probably reduce your running time by 10-15% using my TinyVector structs instead of your Vec struct. You can find them in my D libs (V.2.70 at the moment, their development is going well): http://www.fantascienza.net/leonardo/so/libs_d.zip . TinyVector is the result of extensive testing of mine. You will probably need 10-20 minutes to adapt your raytracer to TinyVector, but it's not too difficult. The result will be ugly...
>>
>> Bye,
>> bearophile

To clarify: I know I can get the D code to be as fast as the C++ code if I optimize it more, or use custom structs, etc.

That's not the point. The point is getting a comparison of C++ and D using equivalent code.

But, again, thanks for the advice.