January 18, 2007
Try this version. In MSVC C++, float -> double, funcf -> func for the floating funcs (sqrtf, expf). It improves the time from 8.6 to 5.7 seconds on my computer. The same process makes the D version slower.
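
For readers without the attachment, the kind of change described above looks roughly like this. It is a minimal sketch only: `length_f`, `length_d`, `falloff_f` and `falloff_d` are hypothetical helper names, not functions from the posted raytracer source.

```cpp
#include <cassert>
#include <math.h>

// Single-precision variant: float storage plus the "f"-suffixed C math
// functions (sqrtf, expf), which take and return float.
inline float length_f(float x, float y, float z)
{
    return sqrtf(x * x + y * y + z * z);
}

inline float falloff_f(float distance)
{
    return expf(-distance);
}

// Double-precision variant: the same arithmetic with double storage and
// the unsuffixed functions (sqrt, exp), which take and return double.
inline double length_d(double x, double y, double z)
{
    return sqrt(x * x + y * y + z * z);
}

inline double falloff_d(double distance)
{
    return exp(-distance);
}
```

Which variant is faster depends on the compiler and its floating-point code generation, which is consistent with the change helping the MSVC build and hurting the D build here.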

Bradley Smith wrote:
> 
> Bradley Smith wrote:
>> As Bill Baxter pointed out, I missed an optimization on version 2. The pass by reference optimization using the inout on the Intersect's Ray argument. I had applied inout only to the Raytrace's Ray argument.
>>
>> The further optimization brings the following approx. timings:
>>
>>         time     factor
>>   dmc    5 sec    1.0
>>   dmd    9 sec    1.8
>>   gdc    13 sec   2.6
>     gdc    10 sec   2.0      <-- correction
>>   msvc   5 sec    1.0
>>   g++    doesn't compile
>>
> 
> Here is a correction to the gdc results. The wrong optimization flag was used. The build_d_gdc.bat should have "-O3" rather than "-O".



January 19, 2007
%u wrote:
> Bradley Smith Wrote:
>> What technical documentation would be proper? What would it
>> contain?
> 
> As always such depends on the requirements of the presumed readers.
> 
> If you are able to change your position from the view of the porter to the view of a verifier or freshly introduced maintainer of the port, then you will have an impression of what you would want to look at first.
> 
> It is a pity, as it stands, that the question of the content of the technical documentation arises at all.

Dude, it's a toy raytracer ported from some free code someone posted to a website somewhere.  Why should it come with gobs of documentation?

But anyway, the original code was part of a series of tutorials.  I think the version Bradley posted was probably from this installment:
   http://www.devmaster.net/articles/raytracing_series/part3.php
As the series goes on, the author adds more and more fancy features to the raytracer.  Anyway, the tutorials are already far more documentation than you'll find for most free code out in the wild.

--bb
January 19, 2007
Yes, I see that behavior too. Using doubles, here is what I get.

   dmc    6 sec
   dmd    19 sec
   gdc    17 sec
   msvc   4 sec

It is also interesting that msvc gets better where dmc gets worse. I wouldn't stake too much on it though, since these are approximate timings.

Thanks,
  Bradley

Daniel Giddings wrote:
> Try this version. In MSVC C++, float -> double, funcf -> func for the floating funcs (sqrtf, expf). It improves the time from 8.6 to 5.7 seconds on my computer. The same process makes the D version slower.
> 
> Bradley Smith wrote:
>>
>> Bradley Smith wrote:
>>> As Bill Baxter pointed out, I missed an optimization on version 2. The pass by reference optimization using the inout on the Intersect's Ray argument. I had applied inout only to the Raytrace's Ray argument.
>>>
>>> The further optimization brings the following approx. timings:
>>>
>>>         time     factor
>>>   dmc    5 sec    1.0
>>>   dmd    9 sec    1.8
>>>   gdc    13 sec   2.6
>>     gdc    10 sec   2.0      <-- correction
>>>   msvc   5 sec    1.0
>>>   g++    doesn't compile
>>>
>>
>> Here is a correction to the gdc results. The wrong optimization flag was used. The build_d_gdc.bat should have "-O3" rather than "-O".
> 
January 19, 2007
You must have made a mistake somewhere, because the rendered images from D and C++ are not the same!

The image from the D exe has a lone white pixel (also present in the 'float' versions, both D and cpp), but that white pixel is gone in the cpp version (both dmc and msvc).

L.
January 19, 2007
Lionello Lunesu wrote:
> You must have made a mistake somewhere, because the rendered images from D and C++ are not the same!
> 
> The image from the D exe has a lone white pixel (also present in the 'float' versions, both D and cpp), but that white pixel is gone in the cpp version (both dmc and msvc).
> 
> L.

Sorry, I thought the .d files were also using 'double', but they're not. This explains the different outcome.

L.
January 19, 2007
Bill Baxter wrote:
> Dave wrote:
>> Bradley Smith wrote:
>>> Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>>>
>>> Here are the changes I've made. Attached is the new code.
>>>
>>>   Call RegisterClass outside of assert. (Broken if -release used)
>>>   Apply -release option. (Increases speed in an unknown way)
>>>   Converted templates to regular functions. (Templates not being inlined)
>>
>> Are you sure? I know templates can be/are inlined and I guess I haven't noticed anywhere they aren't where I'd expect a regularly defined function to be inlined.
> 
> I changed a bunch of parameters to inout after discovering that it made a difference for the Intersect method.  It could be that I had the template parameters as inout at the time when getting rid of the templates seemed to make a difference.
> 
> That's evil that inout disables inlining.
> Seems like inout params would be easier to inline than regular parameters, but I guess not.

I agree and have been wondering about that for some time - my guess is that it caused some type of bug early on and Walter didn't have the time to loop back and fix.

> 
> --bb
January 19, 2007
nobody_ wrote:
> I think this thread is worth posting as a (D-performance) tutorial or something.
> A lot of interesting performance issues have come up, of which most were unknown to me :)
> 

Hopefully the need for a tutorial on performance will soon be deprecated by better optimizations and a faster GC <g>

> What do you think? 
> 
> 
January 20, 2007

Dave wrote:
> Bradley Smith wrote:
>> Thanks for all the suggestions. It helps, but not enough to make the D code faster than the C++. It is now 2.6 times slower. The render times are now approx. 13 sec for raytracer_d and approx. 5 sec for raytracer_cpp.
>>
>> Here are the changes I've made. Attached is the new code.
>>
>>   Call RegisterClass outside of assert. (Broken if -release used)
>>   Apply -release option. (Increases speed in an unknown way)
>>   Converted templates to regular functions. (Templates not being inlined)
> 
> Are you sure? I know templates can be/are inlined and I guess I haven't noticed anywhere they aren't where I'd expect a regularly defined function to be inlined.
> 

You are correct. I have confirmed that the templates and regular functions are inlined. However, the way they are inlined appears to move much more data around than manual inlining does. Perhaps the extra data movement is the cause of the performance degradation when using the function or template.

I can also confirm that using inout on the function parameters will cause it to not be inlined.
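
For readers more at home in C++, the by-value versus by-reference distinction discussed in this thread can be sketched as below. This is an illustration under stated assumptions: `Vec3`, `Ray` and both functions are hypothetical stand-ins, not the raytracer's actual types. D's `inout` passes the argument by reference, which is closest to a non-const `Ray&`; `const Ray&` is used here only because the function does not modify the ray.

```cpp
#include <cassert>

// Hypothetical stand-ins for the raytracer's types; the layout is an assumption.
struct Vec3 { float x, y, z; };
struct Ray  { Vec3 origin, direction; };

// By value: the whole Ray (six floats in this sketch) is copied per call.
inline float dir_dot_by_value(Ray ray, const Vec3& n)
{
    return ray.direction.x * n.x + ray.direction.y * n.y + ray.direction.z * n.z;
}

// By reference: only the address of the caller's Ray is passed, no copy is made.
inline float dir_dot_by_ref(const Ray& ray, const Vec3& n)
{
    return ray.direction.x * n.x + ray.direction.y * n.y + ray.direction.z * n.z;
}
```

Both functions return the same result; only the calling convention differs. The sketch says nothing about how DMD decides what to inline, which is the separate issue noted above.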

Thanks,
  Bradley
January 22, 2007
The Java implementation is also faster than the D (dmd) version, at least when run with -server.

         time     factor   memory
   dmc    5 sec    1.0      5 MB
   java   8 sec    1.6     72 MB  (Java 1.6.0 -server)
   dmd    9 sec    1.8      5 MB
   java  19 sec    3.8     19 MB  (Java 1.6.0 -client)

However, Java uses much more memory.
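
The factor column in these tables appears to be each time divided by the fastest time (dmc, 5 sec). A minimal sketch of that arithmetic, with `slowdown_factor` as a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Relative slowdown of one run versus the fastest run, both in seconds.
inline double slowdown_factor(double seconds, double fastest_seconds)
{
    assert(fastest_seconds > 0.0);  // guard against dividing by zero
    return seconds / fastest_seconds;
}
```

With the timings above, 8 / 5 gives 1.6 for Java -server, 9 / 5 gives 1.8 for dmd, and 19 / 5 gives 3.8 for Java -client.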

All three implementations are in the attached zip.

Thanks,
   Bradley


