Slow performance compared to C++, ideas?
May 31, 2013
finalpatch
Recently I ported a simple ray tracer I wrote in C++11 to D. Thanks to the similarity between D and C++ it was almost a line-by-line translation; in other words, very, very close. However, the D version runs much slower than the C++11 version. On Windows, with MinGW GCC and GDC, the C++ version is twice as fast as the D version. On OSX, I used Clang++ and LDC, and the C++11 version was 4x faster than the D version. Since the comparisons were between compilers that share the same codegen backends, I suppose that's a relatively fair comparison. (Flags used for GDC: -O3 -fno-bounds-check -frelease; flags used for LDC: -O3 -release)

I really like the features offered by D, but it's the raw performance that's worrying me. From what I read, D should offer similar performance when doing similar things, but my own test results are not consistent with this claim. I want to know whether this slowness is inherent to the language or whether it's something I was not doing right (very possible, because I have only a few days of experience with D).

Below are the links to the D and C++ code, in case anyone is interested in having a look.

https://dl.dropboxusercontent.com/u/974356/raytracer.d
https://dl.dropboxusercontent.com/u/974356/raytracer.cpp
May 31, 2013
finalpatch:

> I really like the features offered by D but it's the raw performance that's worrying me.

From my experience if you know what you are doing, you are able to write that kind of numerical D code that LDC compiles with a performance very close to C++, and sometimes higher. But you need to be careful about some things.

Don't do this:
foreach (y; (iota(height)))

Use this, because those abstractions are not for free:
foreach (y;  0 .. height)

Be careful with foreach on arrays of structs, because it performs copies that are slow if the structs aren't very small.
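A minimal sketch of the copy issue, assuming a hypothetical Sphere struct (not taken from the linked raytracer.d). Iterating by value copies each element and discards any writes, while `ref` works on the array element in place:

```d
// Hypothetical struct, large enough that a per-iteration copy is not free.
struct Sphere
{
    double[3] center = 0;
    double radius = 0;
    double[3] color = 0;
}

Sphere[] makeSpheres(size_t n)
{
    auto spheres = new Sphere[](n);
    foreach (i, ref s; spheres)   // ref: writes go to the array element
        s.radius = i + 1.0;
    return spheres;
}

// By value: every iteration copies a whole Sphere, and the write is lost.
void doubleByValue(Sphere[] spheres)
{
    foreach (s; spheres)
        s.radius *= 2;
}

// By reference: no copy, and the write is visible to the caller.
void doubleByRef(Sphere[] spheres)
{
    foreach (ref s; spheres)
        s.radius *= 2;
}

unittest
{
    auto spheres = makeSpheres(3);
    doubleByValue(spheres);
    assert(spheres[2].radius == 3.0); // unchanged
    doubleByRef(spheres);
    assert(spheres[2].radius == 6.0); // updated in place
}
```

For read-only loops, `foreach (const ref s; spheres)` avoids the copy without allowing modification. Whether the copy matters in practice depends on the struct size and on how much the optimizer can elide.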

Be careful with classes, because by default their methods are virtual. Sometimes in D you want to use structs for performance reasons.
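As a rough illustration (the Shape, Circle and Vec3 names are hypothetical, not from the posted code), the three cases look like this. Marking a class or method `final` lets the compiler bind calls statically when it knows the static type, and struct methods are never virtual:

```d
import std.math : PI;

class Shape
{
    // Virtual by default: normally dispatched through the vtable.
    double area() const { return 0; }
}

// A final class cannot be subclassed, so a call through a Circle
// reference does not need virtual dispatch.
final class Circle : Shape
{
    double r;
    this(double r) { this.r = r; }
    override double area() const { return PI * r * r; }
}

// Struct methods are never virtual and are easy candidates for inlining.
struct Vec3
{
    double x, y, z;
    double dot(const Vec3 o) const { return x * o.x + y * o.y + z * o.z; }
}

unittest
{
    auto c = new Circle(2.0);
    Shape s = c;                   // still usable polymorphically
    assert(s.area() == c.area());
    assert(Vec3(1, 2, 3).dot(Vec3(4, 5, 6)) == 32.0);
}
```

How much this helps is compiler-dependent; it is worth measuring before and after rather than assuming a gain.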

Sometimes in inner loops it's better to use a classic for instead of a foreach.

LDC needs far more flags to compile a raytracer well. LDC even supports link-time optimization, but you need even more obscure flags.

Also the ending brace of classes and structs doesn't need a semicolon in D.

Bye,
bearophile
May 31, 2013
Hi bearophile,

Thanks for the reply. I changed it to 0..height and it has no measurable effect on the runtime.

The reason I used iota(height) was to test std.parallelism.parallel. On Windows, if I do foreach (y; parallel(iota(height))) I do get almost a 4x speedup on a quad-core computer. However, on OSX, parallel() either does nothing (LDC) or makes it slower than single-threaded (DMD).
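For readers unfamiliar with the idiom, here is a small sketch of the parallel scanline loop. The renderParallel function and the per-pixel formula are placeholders for illustration, not the actual ray tracer:

```d
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical stand-in for the renderer: each pixel is just x + y.
double[] renderParallel(size_t width, size_t height)
{
    auto image = new double[](width * height);

    // parallel() hands ranges of rows to worker threads from the default
    // task pool. Each iteration writes only to its own row, so no locking
    // is needed.
    foreach (y; parallel(iota(height)))
    {
        auto row = image[y * width .. (y + 1) * width];
        foreach (x, ref px; row)
            px = cast(double)(x + y);
    }
    return image;
}

unittest
{
    auto img = renderParallel(4, 3);
    assert(img.length == 12);
    assert(img[5] == 2.0); // row 1, column 1
}
```

The serial version is the same loop with `0 .. height` in place of `parallel(iota(height))`, which makes it easy to time both.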

On Friday, 31 May 2013 at 01:42:53 UTC, bearophile wrote:
> Don't do this:
> foreach (y; (iota(height)))
>
> Use this, because those abstractions are not for free:
> foreach (y;  0 .. height)
May 31, 2013
finalpatch:

> Thanks for the reply. I changed it to 0..height and it has no measurable effect on the runtime.

Have you also fixed all the other things? :-) Probably you have to keep fixing potentially slow spots until you find the truly slow ones.

Bye,
bearophile
May 31, 2013
I don't know if this is the case with the code in question (I have not looked at it), but sometimes there will be a significant effect on performance caused by the use of the garbage collector. This is an area in need of radical improvements.

You have to minimize situations where there are a lot of allocations going on while the GC is enabled, because that will fire up the GC more often than is required and it can slow down your app significantly; a 2x or greater performance penalty is certainly possible. It can also make performance unpredictable, with large delays at inappropriate points in the execution.
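One way to act on this, sketched with hypothetical render and shadeRow functions: allocate the output buffer once, outside the hot loop, and suspend automatic collections for the duration of the loop using core.memory.GC:

```d
import core.memory : GC;

// Hypothetical per-row work: fills a caller-provided slice, so the hot
// loop itself performs no allocations.
void shadeRow(double[] row, size_t y)
{
    foreach (x, ref px; row)
        px = cast(double)(x + y);
}

double[] render(size_t width, size_t height)
{
    // Single allocation up front.
    auto image = new double[](width * height);

    GC.disable();              // suspend automatic collections
    scope (exit) GC.enable();  // re-enabled on every exit path

    foreach (y; 0 .. height)
        shadeRow(image[y * width .. (y + 1) * width], y);
    return image;
}

unittest
{
    auto img = render(4, 3);
    assert(img.length == 12);
    assert(img[5] == 2.0); // row 1, column 1
}
```

Disabling the collector only helps if the loop actually allocates; if timings do not change, the GC is probably not the bottleneck.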

BTW, you should post questions like this into d.learn rather than in the general discussion area.

--rt
May 31, 2013
Hi Rob,

I have tried putting GC.disable() and GC.enable() around the rendering call, and it made no difference.

On Friday, 31 May 2013 at 02:13:36 UTC, Rob T wrote:
> I don't know if this is the case with the code in question (I have not looked at it), but sometimes there will be a significant effect on performance caused by the use of the garbage collector. This is an area in need of radical improvements.
>
> You have to minimize situations where there are a lot of allocations going on while the GC is enabled, because that will fire up the GC more often than is required and it can slow down your app significantly; a 2x or greater performance penalty is certainly possible. It can also make performance unpredictable, with large delays at inappropriate points in the execution.
>
> BTW, you should post questions like this into d.learn rather than in the general discussion area.
>
> --rt

May 31, 2013
On 5/30/2013 6:26 PM, finalpatch wrote:
> Recently I ported a simple ray tracer I wrote in C++11 to D. Thanks to the
> similarity between D and C++ it was almost a line by line translation, in other
> words, very very close. However, the D version runs much slower than the C++11
> version. On Windows, with MinGW GCC and GDC, the C++ version is twice as fast as
> the D version. On OSX, I used Clang++ and LDC, and the C++11 version was 4x
> faster than the D version.  Since the comparisons were between compilers that share
> the same codegen backends I suppose that's a relatively fair comparison.  (flags
> used for GDC: -O3 -fno-bounds-check -frelease,  flags used for LDC: -O3 -release)

For max speed using dmd, use the flags:

   -O -release -inline -noboundscheck

The -inline is especially important.


> I really like the features offered by D but it's the raw performance that's
> worrying me. From what I read D should offer similar performance when doing
> similar things but my own test results are not consistent with this claim. I want
> to know whether this slowness is inherent to the language or it's something I
> was not doing right (very possible because I have only a few days of experience
> with D).
>
> Below is the link to the D and C++ code, in case anyone is interested to have a
> look.
>
> https://dl.dropboxusercontent.com/u/974356/raytracer.d
> https://dl.dropboxusercontent.com/u/974356/raytracer.cpp

May 31, 2013
Hi Walter,

Thanks for the reply. I have already tried these flags. However, DMD's codegen is lagging behind GCC and LLVM at the moment, so even with these flags, the runtime is ~10x longer than the C++ version compiled with clang++ (2sec with DMD, 200ms with clang++ on a Core2 Mac Pro). I know this is comparing apples to oranges though, that's why I was comparing GDC vs G++ and LDC vs Clang++.

On Friday, 31 May 2013 at 02:19:40 UTC, Walter Bright wrote:
> For max speed using dmd, use the flags:
>
>    -O -release -inline -noboundscheck
>
> The -inline is especially important.
May 31, 2013
On 05/30/2013 11:31 PM, finalpatch wrote:
> Hi Walter,
> 
> Thanks for the reply. I have already tried these flags. However, DMD's codegen is lagging behind GCC and LLVM at the moment, so even with these flags, the runtime is ~10x longer than the C++ version compiled with clang++ (2sec with DMD, 200ms with clang++ on a Core2 Mac Pro). I know this is comparing apples to oranges though, that's why I was comparing GDC vs G++ and LDC vs Clang++.
> 
> On Friday, 31 May 2013 at 02:19:40 UTC, Walter Bright wrote:
>> For max speed using dmd, use the flags:
>>
>>    -O -release -inline -noboundscheck
>>
>> The -inline is especially important.


Have you tried:

     dmd -profile

It compiles in trace generation, so when you run the program you get a .log file that lists the slowest functions and other info.

Please note that code compiled with -profile runs slower because it is instrumented.

--jm


May 31, 2013
On 5/30/13 9:26 PM, finalpatch wrote:
> https://dl.dropboxusercontent.com/u/974356/raytracer.d
> https://dl.dropboxusercontent.com/u/974356/raytracer.cpp

Manu's gonna love this one: make all methods final.

Andrei