Naive node.js faster than naive LDC2?
August 21, 2020
Code: https://gist.github.com/CrazyPython/364f11465dab90d611ecc81490682680

LDC 1.23.0 (Installed from dlang.org)

ldc2 -release -mcpu=native -O3 -ffast-math --fp-contract=fast

Node v14.40 (V8 8.1.307.31)

Dlang trials: 2957 2560 2048 Average: 2521
Node.JS trials: 1988 2567 1863 Average: 2139

Notes:

 - I had to reinstall Dlang from the install script
 - I was initially confused why -mtune=native didn't work, and had to read documentation. Would have been nice if the compiler told me -mcpu=native was what I needed.
 - I skipped -march=native. Did not find information on the wiki https://wiki.dlang.org/Using_LDC
 - Node.js compiles faster and uses a compilation cache

Mandatory citation: https://github.com/brion/mandelbrot-shootout
August 21, 2020
On Friday, 21 August 2020 at 23:10:53 UTC, James Lu wrote:
> Code: https://gist.github.com/CrazyPython/364f11465dab90d611ecc81490682680
>
> LDC 1.23.0 (Installed from dlang.org)
>
> ldc2 -release -mcpu=native -O3 -ffast-math --fp-contract=fast
>
> Node v14.40 (V8 8.1.307.31)
>
> Dlang trials: 2957 2560 2048 Average: 2521
> Node.JS trials: 1988 2567 1863 Average: 2139
>
> Notes:
>
>  - I had to reinstall Dlang from the install script
>  - I was initially confused why -mtune=native didn't work, and had to read documentation. Would have been nice if the compiler told me -mcpu=native was what I needed.
>  - I skipped -march=native. Did not find information on the wiki https://wiki.dlang.org/Using_LDC
>  - Node.js compiles faster and uses a compilation cache
>
> Mandatory citation: https://github.com/brion/mandelbrot-shootout

With the double type:

Node: 2211 2574 2306
Dlang: 2520 1891 1676

August 21, 2020
On Friday, 21 August 2020 at 23:14:12 UTC, James Lu wrote:
> On Friday, 21 August 2020 at 23:10:53 UTC, James Lu wrote:
>> Code: https://gist.github.com/CrazyPython/364f11465dab90d611ecc81490682680
>>
>> LDC 1.23.0 (Installed from dlang.org)
>>
>> ldc2 -release -mcpu=native -O3 -ffast-math --fp-contract=fast
>>
>> Node v14.40 (V8 8.1.307.31)
>>
>> Dlang trials: 2957 2560 2048 Average: 2521
>> Node.JS trials: 1988 2567 1863 Average: 2139
>>
>> Notes:
>>
>>  - I had to reinstall Dlang from the install script
>>  - I was initially confused why -mtune=native didn't work, and had to read documentation. Would have been nice if the compiler told me -mcpu=native was what I needed.
>>  - I skipped -march=native. Did not find information on the wiki https://wiki.dlang.org/Using_LDC
>>  - Node.js compiles faster and uses a compilation cache
>>
>> Mandatory citation: https://github.com/brion/mandelbrot-shootout
>
> With the double type:
>
> Node: 2211 2574 2306
> Dlang: 2520 1891 1676

Bonus: Direct translation of Dlang to Node.js, Node.js faster https://gist.github.com/CrazyPython/8bafd16837ec8ad4c5a638b9d305fc96

Dlang: 4076 3622 2934 (3544 average)
Node.js: 2624 2334 2316 (2424 average)

LDC2 is 46% slower!
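
For anyone checking the arithmetic, here is a minimal sketch of how the averages and the 46% figure can be reproduced. The helper names `mean` and `percentSlower` are made up for illustration and are not from the gist; the slowdown is measured relative to the faster (Node.js) average, and the timings are in whatever unit the benchmark printed.

```d
import std.algorithm.iteration : sum;
import std.stdio : writefln;

// Arithmetic mean of a set of trial timings (hypothetical helper).
double mean(const(double)[] trials)
{
    return trials.sum / trials.length;
}

// How much longer `slow` takes than `fast`, as a percentage of `fast`
// (hypothetical helper; the baseline is the faster run).
double percentSlower(double slow, double fast)
{
    return 100.0 * (slow - fast) / fast;
}

void main()
{
    // Trial timings copied from the post above.
    immutable double[] dlang = [4076.0, 3622.0, 2934.0];
    immutable double[] node  = [2624.0, 2334.0, 2316.0];

    writefln("D mean: %.0f, Node.js mean: %.0f", mean(dlang), mean(node));
    writefln("D is about %.0f%% slower", percentSlower(mean(dlang), mean(node)));
}
```

Note that the baseline matters: dividing by the slower time instead gives a smaller percentage for the same pair of numbers.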
August 21, 2020
On Fri, Aug 21, 2020 at 11:22:27PM +0000, James Lu via Digitalmars-d wrote: [...]
> Bonus: Direct translation of Dlang to Node.js, Node.js faster https://gist.github.com/CrazyPython/8bafd16837ec8ad4c5a638b9d305fc96
> 
> Dlang: 4076 3622 2934 (3544 average)
> Node.js: 2624 2334 2316 (2424 average)
> 
> LDC2 is 46% slower!

Using a class for Complex (and a non-final one at that!!) introduces tons of allocation overhead per iteration, plus virtual function call overhead.  You should be using a struct instead.  I betcha this one change will make a big difference in performance.
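
As a rough sketch of the suggestion (not the code from the gist), a struct-based Complex could look like the following. The names `Complex`, `sqAbs` and `escapeTime` are assumptions chosen for illustration. Because a struct is a value type, the arithmetic below does no GC allocation and no virtual dispatch.

```d
import std.stdio : writeln;

// Hypothetical value-type complex number; the gist's type may differ.
struct Complex
{
    double re = 0.0;
    double im = 0.0;

    // Addition returns a new value on the stack, with no heap allocation.
    Complex opBinary(string op : "+")(Complex rhs) const
    {
        return Complex(re + rhs.re, im + rhs.im);
    }

    // Complex multiplication: (a+bi)(c+di) = (ac-bd) + (ad+bc)i.
    Complex opBinary(string op : "*")(Complex rhs) const
    {
        return Complex(re * rhs.re - im * rhs.im,
                       re * rhs.im + im * rhs.re);
    }

    // Squared magnitude, which avoids a sqrt in the escape test.
    double sqAbs() const
    {
        return re * re + im * im;
    }
}

// Counts iterations of z = z*z + c until |z| > 2 or maxIter is reached.
int escapeTime(Complex c, int maxIter)
{
    auto z = Complex(0.0, 0.0);
    foreach (i; 0 .. maxIter)
    {
        if (z.sqAbs() > 4.0)
            return i;
        z = z * z + c;
    }
    return maxIter;
}

void main()
{
    writeln(escapeTime(Complex(0.0, 0.0), 100)); // never escapes
    writeln(escapeTime(Complex(2.0, 2.0), 100)); // escapes almost immediately
}
```

If a class has to be kept for some reason, marking it `final` at least lets the compiler skip virtual dispatch, though each `new` would still allocate.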

Also, what's the command you're using to compile the program?  If you're doing performance comparison, you should specify -O2 or -O3.


T

-- 
Knowledge is that area of ignorance that we arrange and classify. -- Ambrose Bierce
August 22, 2020
On Friday, 21 August 2020 at 23:49:44 UTC, H. S. Teoh wrote:
> You should be using a struct instead.

Maybe try `creal`?
August 21, 2020
On Fri, Aug 21, 2020 at 04:49:44PM -0700, H. S. Teoh via Digitalmars-d wrote: [...]
> Using a class for Complex (and a non-final one at that!!) introduces tons of allocation overhead per iteration, plus virtual function call overhead.  You should be using a struct instead.  I betcha this one change will make a big difference in performance.
[...]

OK, so I copied the code and changed the class to struct, and compared the results. Both versions are compiled with ldc2 -O3.

	class version:
	7 secs, 125 ms, 608 μs, and 9 hnsecs
	7 secs, 155 ms, 328 μs, and 6 hnsecs
	7 secs, 158 ms, 966 μs, and 4 hnsecs

	struct version:
	6 secs, 55 ms, 140 μs, and 4 hnsecs
	6 secs, 125 ms, 974 μs, and 5 hnsecs
	6 secs, 126 ms, 945 μs, and 4 hnsecs

For performance comparisons, take the best of n (because the others are merely measuring more system noise).  This represents about a 15% performance increase in switching to struct instead of class.
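
To make the best-of-n approach concrete, here is a small sketch using Phobos' `StopWatch`. The helpers `bestOf` and `percentFaster` are hypothetical names, not part of the benchmark; the percentage is measured relative to the slower time, which is how the 15% figure above comes out.

```d
import core.time : Duration, hnsecs, msecs, seconds, usecs;
import std.algorithm.comparison : min;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Runs `work` n times and returns the shortest wall-clock duration
// (hypothetical helper).
Duration bestOf(int n, void delegate() work)
{
    Duration best = Duration.max;
    foreach (_; 0 .. n)
    {
        auto sw = StopWatch(AutoStart.yes);
        work();
        sw.stop();
        best = min(best, sw.peek());
    }
    return best;
}

// Reduction in run time going from `slow` to `fast`, as a percentage
// of `slow` (hypothetical helper).
double percentFaster(Duration slow, Duration fast)
{
    return 100.0 * (slow - fast).total!"hnsecs" / slow.total!"hnsecs";
}

void main()
{
    // Time a stand-in workload three times and keep the best run.
    double sink = 0;
    auto best = bestOf(3, delegate() {
        foreach (i; 0 .. 1_000_000)
            sink += i * 0.5;
    });
    writeln("best of 3: ", best, " (sink = ", sink, ")");

    // Best class and struct timings reported above for ldc2 -O3.
    auto classBest  = 7.seconds + 125.msecs + 608.usecs + 9.hnsecs;
    auto structBest = 6.seconds + 55.msecs + 140.usecs + 4.hnsecs;
    writeln(percentFaster(classBest, structBest)); // roughly 15
}
```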

I thought it might make a difference to optimize for my CPU with -mcpu=native, so here are the numbers:

	class version:
	7 secs, 100 ms, 602 μs, and 6 hnsecs
	7 secs, 100 ms, 437 μs, and 7 hnsecs
	7 secs, 121 ms, 594 μs, and 4 hnsecs

	struct version:
	6 secs, 73 ms, 534 μs, and 3 hnsecs
	5 secs, 662 ms, 626 μs, and 5 hnsecs
	6 secs, 103 ms, 871 μs, and 2 hnsecs

Again taking the best of 3, that's about a 20% performance increase between changing from class to struct.

//

Just for laughs, I tested with dmd -O -inline:

	class version:
	7 secs, 255 ms, 748 μs, and 5 hnsecs
	7 secs, 249 ms, 683 μs, and 9 hnsecs
	7 secs, 593 ms, 847 μs, and 8 hnsecs

	struct version:
	7 secs, 646 ms, 685 μs, and 5 hnsecs
	7 secs, 618 ms, 642 μs, and 7 hnsecs
	7 secs, 606 ms, 85 μs, and 4 hnsecs

Surprisingly, the class version does *better* than the struct version when compiled with dmd.  (Wow, is dmd codegen *that* bad that it outweighs even class allocation overhead?? :-D)  But both are worse than even the class version with ldc2 -O3 (even without -mcpu=native).

So yeah.  I wouldn't trust dmd with a 10-foot pole when it comes to runtime performance.  The struct version compiled with `ldc2 -O3 -mcpu=native` beats the struct version compiled with dmd by a 26% margin.  That's pretty sad.


T

-- 
An imaginary friend squared is a real enemy.
August 22, 2020
On Friday, 21 August 2020 at 23:49:44 UTC, H. S. Teoh wrote:
> On Fri, Aug 21, 2020 at 11:22:27PM +0000, James Lu via Digitalmars-d wrote: [...]
>> Bonus: Direct translation of Dlang to Node.js, Node.js faster https://gist.github.com/CrazyPython/8bafd16837ec8ad4c5a638b9d305fc96
>> 
>> Dlang: 4076 3622 2934 (3544 average)
>> Node.js: 2624 2334 2316 (2424 average)
>> 
>> LDC2 is 46% slower!
>
> Using a class for Complex (and a non-final one at that!!) introduces tons of allocation overhead per iteration, plus virtual function call overhead.  You should be using a struct instead.  I betcha this one change will make a big difference in performance.
>
> Also, what's the command you're using to compile the program?  If you're doing performance comparison, you should specify -O2 or -O3.
>
>
> T

I showed with and without class. V8's analyzer might be superior to LDC's in removing the allocation overhead. I used the same compilation flags as the original:

ldc2 -release -mcpu=native -O3 -ffast-math --fp-contract=fast
August 22, 2020
On Friday, 21 August 2020 at 23:49:44 UTC, H. S. Teoh wrote:
> On Fri, Aug 21, 2020 at 11:22:27PM +0000, James Lu via Digitalmars-d wrote: [...]
>> Bonus: Direct translation of Dlang to Node.js, Node.js faster https://gist.github.com/CrazyPython/8bafd16837ec8ad4c5a638b9d305fc96
>> 
>> Dlang: 4076 3622 2934 (3544 average)
>> Node.js: 2624 2334 2316 (2424 average)
>> 
>> LDC2 is 46% slower!
>
> Using a class for Complex (and a non-final one at that!!) introduces tons of allocation overhead per iteration, plus virtual function call overhead.  You should be using a struct instead.  I betcha this one change will make a big difference in performance.
>
> Also, what's the command you're using to compile the program?  If you're doing performance comparison, you should specify -O2 or -O3.
>

He mentioned this in his first post.

LDC 1.23.0 (Installed from dlang.org)

ldc2 -release -mcpu=native -O3 -ffast-math --fp-contract=fast
August 22, 2020
On Friday, 21 August 2020 at 23:10:53 UTC, James Lu wrote:
> Code: https://gist.github.com/CrazyPython/364f11465dab90d611ecc81490682680
>
> LDC 1.23.0 (Installed from dlang.org)
>
> ldc2 -release -mcpu=native -O3 -ffast-math --fp-contract=fast
>
> Node v14.40 (V8 8.1.307.31)
>
> Dlang trials: 2957 2560 2048 Average: 2521
> Node.JS trials: 1988 2567 1863 Average: 2139
>
> Notes:
>
>  - I had to reinstall Dlang from the install script
>  - I was initially confused why -mtune=native didn't work, and had to read documentation. Would have been nice if the compiler told me -mcpu=native was what I needed.
>  - I skipped -march=native. Did not find information on the wiki https://wiki.dlang.org/Using_LDC
>  - Node.js compiles faster and uses a compilation cache
>
> Mandatory citation: https://github.com/brion/mandelbrot-shootout

I have no desire to dig into it myself, but I'll just note that if you check the CLBG (Computer Language Benchmarks Game), you'll see that it's not hard to write C and C++ programs for this benchmark that are many times slower than Node.js. The worst of them takes seven times longer to run.
August 22, 2020
On Saturday, 22 August 2020 at 00:10:43 UTC, H. S. Teoh wrote:
> On Fri, Aug 21, 2020 at 04:49:44PM -0700, H. S. Teoh via Digitalmars-d wrote: [...]
>> 
> Surprisingly, the class version does *better* than the struct version when compiled with dmd.  (Wow, is dmd codegen *that* bad that it outweighs even class allocation overhead?? :-D)

Or maybe DMD is not trying to win any performance contest... just focusing on fast compilation for quick prototyping. Something you wouldn't get otherwise without DMD.