Consider the following example:
import std.algorithm, std.range, std.stdio;

long binary_search(long n, long val, long i) {
    return iota(1, i + 1).map!(x => n / x == val).assumeSorted.upperBound(false).length;
}

// A solution for https://atcoder.jp/contests/abc230/tasks/abc230_e
long solve(long n) {
    long ans = 0, i = n;
    while (i != 0) {
        long val = n / i;
        auto cnt = binary_search(n, val, i);
        ans += cnt * val;
        i -= cnt;
    }
    return ans;
}

void main() {
    long ans = 0;
    foreach (n; 1 .. 100000)
        ans += solve(n);
    writeln(ans);
}
Benchmarks with GDC 11.2.0, LDC 1.27.1 (LLVM 12.0.0) and DMD 2.091.1:
$ dmd -O -g -release -inline test.d && time ./test
55836809328
real 0m10.654s
user 0m11.001s
sys 0m0.052s
$ gdc-11.2.0 -O3 -g -frelease -flto test.d && time ./a.out
55836809328
real 0m6.520s
user 0m6.519s
sys 0m0.000s
$ ldc2 -O -g -release test.d && time ./test
55836809328
real 0m1.904s
user 0m1.903s
sys 0m0.000s
LDC produces significantly faster code here, and one of the major contributing factors is that LDC is able to avoid heap allocations (as can be confirmed by running a profiler). It is possible to force LDC to use heap allocations as well by adding the '--disable-gc2stack' option, and the performance drops:
$ ldc2 -O -g -release --disable-gc2stack test.d && time ./test
55836809328
real 0m3.621s
user 0m3.620s
sys 0m0.000s
So only LDC is doing a proper job and living up to its reputation as a state-of-the-art optimizing compiler. But not everything is perfect even with LDC. In another thread, https://forum.dlang.org/thread/t6rijv$nlb$1@digitalmars.com, Steven Schveighoffer mentioned @nogc annotations. Now if I add the @nogc attribute to the 'binary_search' function, LDC refuses to compile the source code:
$ ldc2 -O -g -release test.d && time ./test
test.d(3): Error: function `test.binary_search` is `@nogc` yet allocates closures with the GC
test.d(4): test.binary_search.__lambda4 closes over variable n at test.d(3)
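For comparison, here is a minimal sketch of one way to sidestep the closure entirely: replace the range pipeline with a hand-written binary search, so no lambda captures 'n' or 'val'. The names 'binarySearchNogc' and 'solveNogc' are hypothetical, not part of the original program. The sketch assumes 'val == n / i', as in 'solve', so that 'n / x == val' is false for small x and true from some point up to i. I have not benchmarked it, so it only illustrates the @nogc side of the question.

```d
// Hypothetical @nogc replacement for binary_search.
// Counts the x in [1, i] with n / x == val, assuming val == n / i,
// so the predicate is false for small x and true from some x up to i.
long binarySearchNogc(long n, long val, long i) @nogc nothrow pure @safe {
    long lo = 1, hi = i; // the predicate holds at hi by assumption
    while (lo < hi) {
        immutable mid = lo + (hi - lo) / 2;
        if (n / mid == val)
            hi = mid;     // mid matches, so the first match is at mid or earlier
        else
            lo = mid + 1; // n / mid > val, so the first match is after mid
    }
    return i - lo + 1;    // matches occupy the interval [lo, i]
}

// Same loop as solve, but calling the closure-free helper.
long solveNogc(long n) @nogc nothrow pure @safe {
    long ans = 0, i = n;
    while (i != 0) {
        immutable val = n / i;
        immutable cnt = binarySearchNogc(n, val, i);
        ans += cnt * val;
        i -= cnt;
    }
    return ans;
}
```

Calling 'solveNogc' in place of 'solve' from 'main' keeps the rest of the program unchanged. Of course this gives up the range-based one-liner, which is the style the question is really about.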
What do you think about all of this?