January 17, 2007 Why is this D code slower than C++?

Jacco Bikker wrote several raytracing articles on DevMaster.net. I took his third article and ported it to D. I was surprised to find that the D code is approx. 4 times slower than C++.

The raytracer_d renders in approx. 21 sec and the raytracer_cpp renders in approx. 5 sec. I am using the DMD and DMC compilers on Windows.

How can the D code be made to run faster?

Thanks,
Bradley

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to Bradley Smith
Bradley Smith wrote:
> Jacco Bikker wrote several raytracing articles on DevMaster.net. I took his third article and ported it to D. I was surprised to find that the D code is approx. 4 times slower than C++.
>
> The raytracer_d renders in approx. 21 sec and the raytracer_cpp renders in approx. 5 sec. I am using the DMD and DMC compilers on Windows.
>
> How can the D code be made to run faster?
>
> Thanks,
> Bradley
>
Isn't your build_d.bat missing the -release flag? I don't know how much it will gain, though.
L.

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to Bradley Smith
Bradley Smith wrote:
> Jacco Bikker wrote several raytracing articles on DevMaster.net. I took his third article and ported it to D. I was surprised to find that the D code is approx. 4 times slower than C++.
>
> The raytracer_d renders in approx. 21 sec and the raytracer_cpp renders in approx. 5 sec. I am using the DMD and DMC compilers on Windows.
>
> How can the D code be made to run faster?
>
> Thanks,
> Bradley
>
That is pretty weird.
I noticed that it doesn't work properly with -release added to the compiler flags. If I do add it, I just get a lot of flashing of my desktop icons when I run it, rather than a window popping up with a raytracer inside. Any idea why?
Anyway, after some tweaking of the D version I got it down to 15 sec, vs 10 sec for C++ version on my machine.
Mainly the kinds of things I did were to make more things inout parameters so they don't get passed by value. Also, it looks like your template math functions like DOT and LENGTH may not be getting inlined. Replacing those with the inline code in hotspots like the sphere intersect function sped things up.
Here is the version of Sphere.Intersect I ended up with:
    int Intersect( inout Ray a_Ray, inout float a_Dist ) {
        vector3 v = a_Ray.origin;
        v -= m_Centre;
        //float b = -DOT!(float, vector3) ( v, a_Ray.direction );
        vector3 dir = a_Ray.direction;
        float b = -(v.x * dir.x +
                    v.y * dir.y +
                    v.z * dir.z);
        float det = (b * b) - (v.x*v.x+v.y*v.y+v.z*v.z) + m_SqRadius;
        int retval = MISS;
        if (det > 0) {
            det = sqrt( det );
            float i2 = b + det;
            if (i2 > 0) {
                float i1 = b - det;
                if (i1 < 0) {
                    if (i2 < a_Dist) {
                        a_Dist = i2;
                        return INPRIM;
                    }
                } else {
                    if (i1 < a_Dist) {
                        a_Dist = i1;
                        return HIT;
                    }
                }
            }
        }
        return retval;
    }
The inout on the Ray parameter and the other changes to this function alone change my D runtime from 22 sec to 15 sec.
I also tried making similar changes to the C++ version, but they didn't seem to affect the runtime at all.
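For anyone reading this with a current compiler: the parameter class written as inout in 2007 is spelled ref in today's D. Below is a minimal sketch of the by-value versus by-reference difference described above. The Vector3 and Ray types and both function names are hypothetical stand-ins, not the raytracer's actual code, and whether the copy matters in practice depends on what the optimizer does with it.

```d
import std.stdio : writeln;

// Hypothetical stand-ins for the raytracer's types; field names are assumptions.
struct Vector3 { float x = 0, y = 0, z = 0; }
struct Ray { Vector3 origin, direction; }

// By value: the whole Ray (two Vector3s) is copied for every call
// unless the optimizer removes the copy.
float dotByValue(Ray r, Vector3 v)
{
    return v.x * r.direction.x + v.y * r.direction.y + v.z * r.direction.z;
}

// By reference: only an address is passed. `ref` is the current spelling
// of what this thread writes as `inout`; `const` documents that the
// function does not modify its arguments.
float dotByRef(ref const Ray r, ref const Vector3 v)
{
    return v.x * r.direction.x + v.y * r.direction.y + v.z * r.direction.z;
}

void main()
{
    auto ray = Ray(Vector3(0, 0, 0), Vector3(0, 0, 1));
    auto v = Vector3(1, 2, 3);
    writeln(dotByValue(ray, v));
    writeln(dotByRef(ray, v));
}
```

Both functions compute the same dot product; only the calling convention differs, which is the part the timings above are sensitive to.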
--bb

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to Bill Baxter
== Quote from Bill Baxter (dnewsgroup@billbaxter.com)'s article
> I noticed that it doesn't work properly with -release add to the compiler flags.

That is because in testapp.d the call of RegisterClass is put into an assertion.

On my machine the -release flag brings another 25%.

> The inout on the Ray parameter and the other changes to this function alone change my D runtime from 22 sec to 15 sec.

The compiler should be smart enough to detect that the Ray parameter is not used as an lvalue and thus can be replaced by a reference.
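To make the assertion problem concrete, here is a minimal sketch. The function registerWindowClass and the flag registered are hypothetical stand-ins for the RegisterClass call; this is not the actual testapp.d code. The point is that -release compiles ordinary asserts out entirely, so any call placed inside one never runs.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a call with a side effect, such as RegisterClass.
bool registered = false;

bool registerWindowClass()
{
    registered = true;
    return true;
}

void main()
{
    // Fragile: with -release the whole assert, including the call inside it,
    // is compiled out, so the registration would never happen.
    // assert(registerWindowClass());

    // Safer: make the call unconditionally, then check the result.
    immutable ok = registerWindowClass();
    assert(ok, "window class registration failed");

    writeln("registered: ", registered);
}
```

With the call moved out of the assert, the side effect survives a -release build and only the check itself is dropped.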

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to Lionello Lunesu
dmd -O -inline -release: 23.2 secs
dmc -o+speed: 7.6 secs

Averaged over 3 runs. This is without Bill's "inout" optimization, but with RegisterClass fixed.

L.

January 17, 2007 Re: Why is this D code slower than C++? (real results)
Posted in reply to Lionello Lunesu
OK, ignore my previous post (it was with a debug build of Phobos).

dmd -O -inline -release: 17.7 secs
dmc -o+speed: 7.6 secs

Averaged over 3 runs. This is without Bill's "inout" optimization, but with RegisterClass fixed.

I've also included a std.gc.disable() and replaced a "long" with "int", but these changes did not have any effect.

L.

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to %u
%u wrote:
> == Quote from Bill Baxter (dnewsgroup@billbaxter.com)'s article
>> I noticed that it doesn't work properly with -release add to the
>> compiler flags.
> That is because in testapp.d the call of RegisterClass is put into
> an assertion.
>
> On my machine the -release flag brings another 25%.
>
>> The inout on the Ray parameter and the other changes to this
>> function alone change my D runtime from 22 sec to 15 sec.
>
> The compiler should be smart enough to detect, that the Ray
> parameter is not used as an lvalue and thus can be replaced by a
> reference.
In that respect I'd like to see 'byref' be a synonym for 'inout' as well, so we can tweak those things w/o relying on the compiler, or by using a keyword (inout) that doesn't really fit the situation in which it's being used.

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to Bradley Smith
Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
Here are the changes I've made. Attached is the new code.
Call RegisterClass outside of assert. (Broken if -release used)
Apply -release option. (Increases speed in an unknown way)
Converted templates to regular functions. (Templates not being inlined)
Manually inlined DOT function. (Function not being inlined)
Any other suggestions?
Thanks,
Bradley
Bradley Smith wrote:
> Jacco Bikker wrote several raytracing articles on DevMaster.net. I took his third article and ported it to D. I was surprised to find that the D code is approx. 4 times slower than C++.
>
> The raytracer_d renders in approx. 21 sec and the raytracer_cpp renders in approx. 5 sec. I am using the DMD and DMC compilers on Windows.
>
> How can the D code be made to run faster?
>
> Thanks,
> Bradley
>

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to Bradley Smith
I really hope you'll get it faster than the C++ variant.
Might -profile shed some light?
Or maybe I lurk here in learn for a reason :D
> Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>
> Here are the changes I've made. Attached is the new code.
>
> Call RegisterClass outside of assert. (Broken if -release used)
> Apply -release option. (Increases speed in an unknown way)
> Converted templates to regular functions. (Templates not being inlined)
> Manually inlined DOT function. (Function not being inlined)
>
>
> Any other suggestions?

January 17, 2007 Re: Why is this D code slower than C++?
Posted in reply to Bradley Smith
On Wed, 17 Jan 2007 11:18:10 -0800, Bradley Smith <digitalmars-com@baysmith.com> wrote:

>Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
...
>
>Any other suggestions?

I haven't actually looked at the code, but I'll take a guess anyway.

Raytracing is heavy on the floating point math. As Walter Bright acknowledges, the DMD compiler does not handle the optimisation of float arithmetic as well as some C++ compilers.

You could try the GNU D compiler - GDC. Since it is using the standard GNU compiler suite backend code generator, it will probably handle the optimisation better.

A second option is to split out some key inner-loop calculations and handle them in C, using D for the less performance-sensitive code. Calling C code from D is easy enough, though calling C++ is more of a hassle. This hack could be considered temporary, as the D float performance will no doubt be improved in time.

Alternatively, if you don't mind losing portability, you could try using inline assembler for those key inner-loop calculations. If you're a real speed freak, you might even try using SIMD instructions to get 4 float calculations per instruction (and IIRC most SIMD instructions complete in a single clock cycle these days). The down side to that would be lower floating point precision, but for raytracing I wouldn't expect that to be a big deal.

--
Remove 'wants' and 'nospam' from e-mail.
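As a rough illustration of how little ceremony calling C from D needs, here is a minimal sketch using current D. It calls the C library's sqrtf through the core.stdc.math binding, which here stands in for a hand-written C inner-loop routine; a separately compiled C function would be declared with extern (C) and linked in the same way. The helper name length3 is hypothetical.

```d
import std.stdio : writeln;

// core.stdc.math is the D binding to the C library's <math.h>;
// sqrtf here is the C function itself, not a D reimplementation.
import core.stdc.math : sqrtf;

// Hypothetical helper: vector length computed by calling into C.
float length3(float x, float y, float z)
{
    return sqrtf(x * x + y * y + z * z);
}

void main()
{
    writeln(length3(3, 4, 0));
}
```

Whether moving a hot loop into C pays off depends on the call overhead relative to the work done per call, so it makes sense to move a whole loop rather than a single arithmetic operation.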
Copyright © 1999-2021 by the D Language Foundation