March 05, 2018 Speed of math function atan: comparison D and C++
I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc). I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions. I can't get the D to run faster than about half the speed of C++. Are there benchmarks for such scientific functions published somewhere?
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to J-S Caux
On 05/03/2018 6:35 PM, J-S Caux wrote:
> I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc).
>
> I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions.
>
> I can't get the D to run faster than about half the speed of C++.
>
> Are there benchmarks for such scientific functions published somewhere
Gonna need to disassemble and compare them.
atan should work out to only be a few instructions (inline assembly) from what I've looked at in the source.
Also you should post the code you used for each.
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to rikki cattermole
On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> On 05/03/2018 6:35 PM, J-S Caux wrote:
>> I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc).
>>
>> I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions.
>>
>> I can't get the D to run faster than about half the speed of C++.
>>
>> Are there benchmarks for such scientific functions published somewhere
>
> Gonna need to disassemble and compare them.
>
> atan should work out to only be a few instructions (inline assembly) from what I've looked at in the source.
>
> Also you should post the code you used for each.
So the codes are trivial, simply some check of raw speed:
double x = 0.0;
for (int a = 0; a < 1000000000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
for C++ and
double x = 0.0;
for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
for D. C++ exec takes 40 seconds, D exec takes 68 seconds.
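For anyone wanting to reproduce the C++ side, the loop above could be wrapped in a timing harness along these lines. This is only a sketch: the function names `atan_sum` and `time_atan_sum` are made up for illustration, and the actual measurement method used in the post is not stated.

```cpp
#include <chrono>
#include <cmath>

// Sum atan(1/(1 + sqrt(1 + a))) for a in [0, n): the loop body from the post.
double atan_sum(long n) {
    double x = 0.0;
    for (long a = 0; a < n; ++a)
        x += std::atan(1.0 / (1.0 + std::sqrt(1.0 + a)));
    return x;
}

// Wall-clock seconds taken by one call of atan_sum(n). The sum is written to
// *out so that the caller can use it and the compiler cannot drop the loop.
// (Hypothetical helper, not taken from the original test program.)
double time_atan_sum(long n, double* out) {
    auto start = std::chrono::steady_clock::now();
    *out = atan_sum(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_atan_sum(1000000000, &sum)` from `main` and printing both the elapsed time and `sum` gives a number comparable to the ones quoted here, provided the optimization flags are reported alongside it.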
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to J-S Caux
On 05/03/2018 7:01 PM, J-S Caux wrote:
> On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
>> On 05/03/2018 6:35 PM, J-S Caux wrote:
>>> I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc).
>>>
>>> I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions.
>>>
>>> I can't get the D to run faster than about half the speed of C++.
>>>
>>> Are there benchmarks for such scientific functions published somewhere
>>
>> Gonna need to disassemble and compare them.
>>
>> atan should work out to only be a few instructions (inline assembly) from what I've looked at in the source.
>>
>> Also you should post the code you used for each.
>
> So the codes are trivial, simply some check of raw speed:
>
> double x = 0.0;
> for (int a = 0; a < 1000000000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
> double x = 0.0;
> for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.
Yes, but that doesn't show me how you benchmarked.
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to rikki cattermole
On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> atan should work out to only be a few instructions (inline assembly) from what I've looked at in the source.
>
> Also you should post the code you used for each.
Should be 3-4 instructions: load the input into the FPU (optional, depending on whether the value is already loaded), atan, fwait (optional?), retrieve the value.
As far as I remember offhand, FPU instructions run in their own separate space and should take only a few cycles each to run (and also run in parallel with the CPU code).
At that point, if the code is running at half the speed of the C++ version, that probably means bad optimization elsewhere, or even the control settings for the FPU.
I haven't really looked at the FPU stuff in that much depth since about 2000...
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to J-S Caux
On Monday, 5 March 2018 at 06:01:27 UTC, J-S Caux wrote:
> On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
>> On 05/03/2018 6:35 PM, J-S Caux wrote:
>>> I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc).
>>>
>>> I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions.
>>>
>>> I can't get the D to run faster than about half the speed of C++.
>>>
>>> Are there benchmarks for such scientific functions published somewhere
>>
>> Gonna need to disassemble and compare them.
>>
>> atan should work out to only be a few instructions (inline assembly) from what I've looked at in the source.
>>
>> Also you should post the code you used for each.
>
> So the codes are trivial, simply some check of raw speed:
>
> double x = 0.0;
> for (int a = 0; a < 1000000000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
> double x = 0.0;
> for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.
Depending on your platform, the size of `double` could be different between C++ and D. Could you check that the size and precision are indeed the same?
Also, benchmark method is just as important as benchmark code. Did you use DMD or LDC as the D compiler? In this case it shouldn't matter, but try with LDC if you haven't. Also ensure that you've used the right flags:
`-release -inline -O`.
If the D version is still slower, you could try using the C version of the function.
Simply change `import std.math: atan;` to `import core.stdc.math: atan;` [0]
[0]: https://dlang.org/phobos/core_stdc_math.html#.atan
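On the C++ side, the size and precision check suggested above could look something like the following sketch. `FloatLayout` and `layout_of` are names invented for this example; on the D side the corresponding values can be read from the built-in properties `double.sizeof` and `double.mant_dig`.

```cpp
#include <cstddef>
#include <limits>

// Storage size and significand width of a floating-point type.
// (Hypothetical helper type, for illustration only.)
struct FloatLayout {
    std::size_t bytes;  // storage size in bytes
    int mantissa_bits;  // significand bits, counting the implicit leading bit
};

// Report the layout of floating-point type T as seen by this compiler.
template <typename T>
FloatLayout layout_of() {
    return {sizeof(T), std::numeric_limits<T>::digits};
}
```

On a typical x86-64 toolchain `layout_of<double>()` should report 8 bytes and 53 bits. If the D program prints the same two numbers for `double`, the two benchmarks are at least working with the same type.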
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to J-S Caux
On Monday, 5 March 2018 at 05:35:28 UTC, J-S Caux wrote:
> I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc).
>
> I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions.
>
> I can't get the D to run faster than about half the speed of C++.
>
> Are there benchmarks for such scientific functions published somewhere?
Which compiler flags did you use to compile the C++ and D versions?
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to J-S Caux
On Monday, 5 March 2018 at 06:01:27 UTC, J-S Caux wrote:
> On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
>> On 05/03/2018 6:35 PM, J-S Caux wrote:
>>> I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc).
>>>
>>> I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions.
>>>
>>> I can't get the D to run faster than about half the speed of C++.
>
> double x = 0.0;
> for (int a = 0; a < 1000000000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
> double x = 0.0;
> for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.
The performance problem with this code is that LDC does not yet do cross-module inlining by default. GDC does. If you pass `-enable-cross-module-inlining` to LDC, things should be faster. In particular, std.math.sqrt is not inlined although it is profitable to do so (it becomes one machine instruction). Things become worse when using core.stdc.math.sqrt: no implementation source is available, so no inlining is possible.
Another problem is that std.math.atan(double) just calls std.math.atan(real). Calculations are more expensive on platforms where real==80bits (i.e. x86), and that's not solvable with a compile flag. What it takes is someone to write the double and float versions of atan (and other math functions), but it requires someone with the right knowledge to do it.
Your tests (and reporting about them) are much appreciated. Please do file bug reports for these things. Perhaps you can take a stab at implementing double-versions of the functions you need?
cheers,
Johan
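The double-calls-real pattern described above can be imitated in C++ to get a feel for its cost. This is an analogy only, not the Phobos implementation: it assumes a platform where `long double` is the 80-bit x87 format (common on x86 Linux and macOS, but not universal), and the two function names are invented for the example.

```cpp
#include <cmath>

// Arctangent computed directly in double precision.
double atan_double(double x) {
    return std::atan(x);
}

// Imitates the pattern described in the post: widen the argument, compute
// in extended precision, then narrow the result back to double.
// (Hypothetical stand-in for a double overload that forwards to real.)
double atan_via_long_double(double x) {
    return static_cast<double>(std::atan(static_cast<long double>(x)));
}
```

Both functions agree to within rounding, so timing the loop from this thread with each of them shows how much of the gap the extended-precision detour could plausibly account for on a given machine.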
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to Uknown
On Monday, 5 March 2018 at 09:48:49 UTC, Uknown wrote:
> Depending on your platform, the size of `double` could be different between C++ and D. Could you check that the size and precision are indeed the same?
> Also, benchmark method is just as important as benchmark code. Did you use DMD or LDC as the D compiler? In this case it shouldn't matter, but try with LDC if you haven't. Also ensure that you've used the right flags:
> `-release -inline -O`.
>
> If the D version is still slower, you could try using the C version of the function
> Simply change `import std.math: atan;` to `core.stdc.math: atan;` [0]
>
> [0]: https://dlang.org/phobos/core_stdc_math.html#.atan
Thanks all for the info. I've tested these two very basic representative codes:
https://www.dropbox.com/s/b5o4i8h43qh1saf/test.cc?dl=0
https://www.dropbox.com/s/zsaikhdoyun3olk/test.d?dl=0
Results:
C++:
g++ (Apple LLVM version 7.3.0): 9.5 secs
g++ (GCC 7.1.0): 10.7 secs
D:
dmd : 35.5 secs
dmd -release -inline -O : 29.5 secs
ldc2 : 34.4 secs
ldc2 -release -O : 31.5 secs
But now: using the core.stdc.math atan as per Uknown's suggestion:
D:
dmd: 9 secs
dmd -release -inline -O : 6.8 secs
ldc2 : 10 secs
ldc2 -release -O : 6.5 secs <- best
So indeed the difference is between the `std.math atan` versus the `core.stdc.math atan`.
Thanks Uknown! Just knowing this trick could make the difference between me and other scientists switching over to D...
But now comes the question: can the D fundamental maths functions be propped up to be as fast as the C ones?
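To put the numbers above side by side, the ratio of the `std.math` time to the `core.stdc.math` time can be worked out per build. The timings in this sketch are copied from the post; the `Timing` struct and the function names are made up for the example.

```cpp
#include <cstdio>

// One row of the results reported above.
struct Timing {
    const char* build;      // compiler invocation as given in the post
    double std_math_secs;   // run time using std.math atan
    double stdc_math_secs;  // run time using core.stdc.math atan
};

// Timings copied from the post, in seconds.
const Timing kTimings[] = {
    {"dmd", 35.5, 9.0},
    {"dmd -release -inline -O", 29.5, 6.8},
    {"ldc2", 34.4, 10.0},
    {"ldc2 -release -O", 31.5, 6.5},
};

// How many times faster the core.stdc.math run was than the std.math run.
double speedup(const Timing& t) {
    return t.std_math_secs / t.stdc_math_secs;
}

// Print one line per build with its speedup factor.
void print_speedups() {
    for (const Timing& t : kTimings)
        std::printf("%-26s %.2fx\n", t.build, speedup(t));
}
```

The ratios come out between roughly 3.4x and 4.8x, with the optimized ldc2 build gaining the most from the switch.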
March 05, 2018 Re: Speed of math function atan: comparison D and C++
Posted in reply to J-S Caux
On Monday, 5 March 2018 at 18:39:21 UTC, J-S Caux wrote:
> But now comes the question: can the D fundamental maths functions be propped up to be as fast as the C ones?
Probably, if someone takes the time to look at the bottlenecks.
Copyright © 1999-2021 by the D Language Foundation