January 17, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to nobody_ | nobody_ wrote:
> I really hope you'll get it faster than the C++ variant.
>
> Might -profile shed some light?
> Or maybe I lurk here in learn for a reason :D
>
>>Thanks for all the suggestions. It helps, but not enough to make the D
>>code faster than the C++. It is now 2.6 times slower. The render times
>>are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>>
>>Here are the changes I've made. Attached is the new code.
>>
>> Call RegisterClass outside of assert. (Broken if -release used)
>> Apply -release option. (Increases speed in an unknown way)
>> Converted templates to regular functions. (Templates not being inlined)
>> Manually inlined DOT function. (Function not being inlined)
>>
>>Any other suggestions?
>
I ran it with -profile and it takes about 25 min.
here's the log
http://www.webpages.uidaho.edu/~shro8822/trace.log
|
January 17, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steve Horne | On Wed, 17 Jan 2007 22:34:31 +0000, Steve Horne <stephenwantshornenospam100@aol.com> wrote:
>On Wed, 17 Jan 2007 11:18:10 -0800, Bradley Smith <digitalmars-com@baysmith.com> wrote:
>
>>Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>
>...
>
>>
>>Any other suggestions?
>
>I haven't actually looked at the code, but I'll take a guess anyway.
>
>Raytracing is heavy on the floating point math. As Walter Bright acknowledges, the DMD compiler does not handle the optimisation of float arithmetic as well as some C++ compilers.
On second thoughts, if you're comparing with the DMC compiler for C++, floating point math performance seems a less likely issue. It seems odd that there's such a difference between the DMD and DMC compilers. You'd think the DMD compiler would use much the same back-end code generation that DMC does.
--
Remove 'wants' and 'nospam' from e-mail.
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to BCS | BCS Wrote:
> here's the log
>
> http://www.webpages.uidaho.edu/~shro8822/trace.log
That looks like the use of foreach is dragging the performance down. Maybe it's due to the numerous delegate calls.
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to %u | %u wrote:
> BCS Wrote:
>> here's the log
>>
>> http://www.webpages.uidaho.edu/~shro8822/trace.log
>
> That looks like the use of foreach is dragging the performance down. Maybe it's due to the numerous delegate calls.
No, it shows foreach there because a lot of stuff got inlined and it's only seen by the profiler as the foreach's body. In my experience, more meaningful results can be obtained if -profile is used without -inline.
--
Tomasz Stachowiak
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to BCS Attachments: |
> I ran it with -profile and it takes about 25 min.
Talk about overhead :) cpp took about 7 minutes (log attached)
> here's the log
>
> http://www.webpages.uidaho.edu/~shro8822/trace.log
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bradley Smith | Bradley Smith wrote:
> Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>
> Here are the changes I've made. Attached is the new code.
>
> Call RegisterClass outside of assert. (Broken if -release used)
> Apply -release option. (Increases speed in an unknown way)
> Converted templates to regular functions. (Templates not being inlined)
> Manually inlined DOT function. (Function not being inlined)
You left out changing Intersect's Ray argument to be inout. And generally all Ray (and possibly vector3) parameters should be inout, to avoid the cost of copying them on the stack.
Also converting vector expressions like
vector3 v = a_Ray.origin - m_Centre;
to
vector3 v = a_Ray.origin;
v -= m_Centre;
makes a difference. Changing that one line in the Sphere.Intersect routine changes my runtime from 14.3 to 12.2 sec.
Interestingly, the same sort of transformation to the C++ code didn't seem to make much difference. It could be related in part to the C++ vector parameters on the operators all taking const vector& (references) vs the D ones being just plain vector3. Changing all the operators in the D version to inout may help speed too.
With those changes on my Intel Xeon 3.6GHz CPU the run times are about 10.1 sec (C++) vs 12.2 sec (D). D is still not as fast as the C++, but close.
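To make the two ideas above concrete, here is a minimal C++ sketch, not the raytracer's actual code. The names Vec3, dot, offset_expr and offset_inplace are made up for illustration. It shows operators taking const references (roughly what inout buys you on the D side) and the two ways of writing the subtraction.

```cpp
// Hypothetical 3-component vector, standing in for the raytracer's vector3.
struct Vec3 {
    float x, y, z;

    // In-place subtraction: updates this vector, no temporary is created.
    Vec3& operator-=(const Vec3& rhs) {
        x -= rhs.x; y -= rhs.y; z -= rhs.z;
        return *this;
    }
};

// Binary subtraction with both operands passed by const reference.
// The operands are not copied, but the result is still returned by value.
inline Vec3 operator-(const Vec3& a, const Vec3& b) {
    Vec3 r = a;
    r -= b;
    return r;
}

// Dot product with by-reference parameters: no copies of a or b are made.
inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Form 1: a single expression that goes through operator-.
inline Vec3 offset_expr(const Vec3& origin, const Vec3& centre) {
    Vec3 v = origin - centre;
    return v;
}

// Form 2: copy first, then subtract in place with operator-=.
inline Vec3 offset_inplace(const Vec3& origin, const Vec3& centre) {
    Vec3 v = origin;
    v -= centre;
    return v;
}
```

Both forms compute the same vector; whether one is faster depends entirely on what the compiler manages to inline, which is what the timings above are probing.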
--bb
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bill Baxter | Bill Baxter wrote:
> Bradley Smith wrote:
>> Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>>
>> Here are the changes I've made. Attached is the new code.
>>
>> Call RegisterClass outside of assert. (Broken if -release used)
>> Apply -release option. (Increases speed in an unknown way)
>> Converted templates to regular functions. (Templates not being inlined)
>> Manually inlined DOT function. (Function not being inlined)
>
> You left out changing Intersect's Ray argument to be inout. And generally all Ray (and possibly vector3) parameters should be inout, to avoid the cost of copying them on the stack.
>
> Also converting vector expressions like
> vector3 v = a_Ray.origin - m_Centre;
> to
> vector3 v = a_Ray.origin;
> v -= m_Centre;
>
> makes a difference. Changing that one line in the Sphere.Intersect routine changes my runtime from 14.3 to 12.2 sec.
>
> Interestingly, the same sort of transformation to the C++ code didn't seem to make much difference. It could be related in part to the C++ vector parameters on the operators all taking const vector& (references) vs the D ones being just plain vector3. Changing all the operators in the D version to inout may help speed too.
>
> With those changes on my Intel Xeon 3.6GHz CPU the run times are about 10.1 sec (C++) vs 12.2 sec (D). D is still not as fast as the C++, but close.
>
> --bb
One more thing to try (now that auto classes are allocated on the stack) is to convert the structs to classes and pass those around. Of course you can't return those from things like opSub(), so you'd have to always use opXxxAssign(), etc. I haven't gone over the code in detail, so maybe this is not really feasible but maybe worth a shot?
IIRC, one of the problems with using 'inout' for function params is that functions with such params are excluded from consideration for inlining by the current D compiler front-end.
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bradley Smith | Bradley Smith wrote:
> Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>
> Here are the changes I've made. Attached is the new code.
>
> Call RegisterClass outside of assert. (Broken if -release used)
> Apply -release option. (Increases speed in an unknown way)
> Converted templates to regular functions. (Templates not being inlined)
Are you sure? I know templates can be/are inlined and I guess I haven't noticed anywhere they aren't where I'd expect a regularly defined function to be inlined.
> Manually inlined DOT function. (Function not being inlined)
>
>
> Any other suggestions?
>
> Thanks,
> Bradley
>
> Bradley Smith wrote:
>> Jacco Bikker wrote several raytracing articles on DevMaster.net. I took his third article and ported it to D. I was surprised to find that the D code is approx. 4 times slower than C++.
>>
>> The raytracer_d renders in approx. 21 sec and the raytracer_cpp renders in approx. 5 sec. I am using the DMD and DMC compilers on Windows.
>>
>> How can the D code be made to run faster?
>>
>> Thanks,
>> Bradley
>>
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dave | Dave wrote:
> Bradley Smith wrote:
>> Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>>
>> Here are the changes I've made. Attached is the new code.
>>
>> Call RegisterClass outside of assert. (Broken if -release used)
>> Apply -release option. (Increases speed in an unknown way)
>> Converted templates to regular functions. (Templates not being inlined)
>
> Are you sure? I know templates can be/are inlined and I guess I haven't noticed anywhere they aren't where I'd expect a regularly defined function to be inlined.
I changed a bunch of parameters to inout after discovering that it made a difference for the Intersect method. It could be that I had the template parameters as inout at the time when getting rid of the templates seemed to make a difference.
That's evil that inout disables inlining.
Seems like inout params would be easier to inline than regular parameters, but I guess not.
--bb
|
January 18, 2007 Re: Why is this D code slower than C++? | ||||
---|---|---|---|---|
| ||||
Posted in reply to %u | %u wrote:
> == Quote from Bill Baxter (dnewsgroup@billbaxter.com)'s article
>> I noticed that it doesn't work properly with -release added to the
>> compiler flags.
> That is because in testapp.d the call of RegisterClass is put into
> an assertion.
>
> On my machine the -release flag brings another 25%.
>
>> The inout on the Ray parameter and the other changes to this
>> function alone change my D runtime from 22 sec to 15 sec.
>
> The compiler should be smart enough to detect that the Ray
> parameter is not used as an lvalue and thus can be replaced by a
> reference.
No, it can't.. Passing a struct by ref will result in unexpected behavior if it changes in some other thread. As always, the default should be safe no matter what, and that means copying the struct's contents.
I guess a new modifier like "byref" is the only option..
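A rough C++ sketch of the hazard, for illustration only. The Ray struct and both function names here are hypothetical, and a second reference to the same object stands in for "some other thread" so the effect is deterministic:

```cpp
// Hypothetical ray type, standing in for the raytracer's Ray struct.
struct Ray {
    float origin;     // one component is enough for the illustration
    float direction;
};

// By value: the callee works on its own private copy of the ray.
inline float advance_by_value(Ray ray, Ray& scene_ray) {
    scene_ray.origin = 100.0f;          // someone else changes the original
    return ray.origin + ray.direction;  // the copy is unaffected
}

// By reference: the callee sees every change made to the original.
inline float advance_by_ref(const Ray& ray, Ray& scene_ray) {
    scene_ray.origin = 100.0f;          // same write as above
    return ray.origin + ray.direction;  // reads 100 if ray aliases scene_ray
}
```

If both arguments refer to the same Ray, the by-value version still computes with the old origin while the by-reference version picks up the new one, so the compiler cannot silently swap one for the other even when the parameter is never written to inside the function.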
L.
|
Copyright © 1999-2021 by the D Language Foundation