Thread overview
Re: Basic benchmark
Dec 14, 2008
bearophile
nbody
Dec 14, 2008
The Anh Tran
Dec 14, 2008
bearophile
Dec 14, 2008
bearophile
December 14, 2008
Tomas Lindquist Olsen:
> ...
> $ dmd bench.d -O -release -inline
> long arith:  55630 ms
> nested loop:  5090 ms
> 
> $ ldc bench.d -O3 -release -inline
> long arith:  13870 ms
> nested loop:   120 ms
> 
> $ gcc bench.c -O3 -s -fomit-frame-pointer
> long arith: 13600 ms
> nested loop:  170 ms
>...

Very nice results.

If you have a little more time, I have another small C and D benchmark to offer you, to be tested with GCC and LDC. It consists of the C version of the "nbody" benchmark from the Shootout, a very close translation to D (file name "nbody_d1.d"), and my faster D version (file name "nbody_d2.d"); "faster" here is relative to the DMD compiler, of course.
I haven't tried LDC yet, so I can't be sure what the timings will show.

Thank you for your work,
bearophile


December 14, 2008
bearophile wrote:
> Tomas Lindquist Olsen:
>> ...
>> $ dmd bench.d -O -release -inline
>> long arith:  55630 ms
>> nested loop:  5090 ms
>>
>> $ ldc bench.d -O3 -release -inline
>> long arith:  13870 ms
>> nested loop:   120 ms
>>
>> $ gcc bench.c -O3 -s -fomit-frame-pointer
>> long arith: 13600 ms
>> nested loop:  170 ms
>> ...
> 
> Very nice results.
> 
> If you have a little more time I have another small C and D benchmark to offer you, to be tested with GCC and LDC. It's the C version of the "nbody" benchmarks of the Shootout, a very close translation to D (file name "nbody_d1.d") and my faster D version (file name "nbody_d2.d") (the faster D version is relative to DMD compiler, of course).
> I haven't tried LDC yet, so I can't be sure of what the timings will tell.
> 
> Thank you for your work,
> bearophile

IMHO, spectralnorm is 'a little bit' better than nbody.
:)
December 14, 2008
Lindquist and another kind person on IRC have given me their timings for the D and C versions of the 'nbody' code I attached to my previous post (they tested only the first D version).

Timings N=20_000_000, on an athlon64 x2 3800+ CPU:
  gcc: 10.8  s
  ldc: 14.2  s
  dmd: 15.5  s
  gdc:

------------

Timings N=10_000_000, on an AMD 2500+ CPU:
  gcc:  8.78 s
  ldc: 12.26 s
  dmd: 13.9  s
  gdc:  9.82 s

Compiler arguments used on the AMD 2500+ CPU:
  GCC: -O3 -s -fomit-frame-pointer
  DMD: -release -O
  GDC: -O3 -s -fomit-frame-pointer
  LDC: -ofmain -O3 -release -inline

This time the results seem good enough to me.
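
To make the gaps easier to compare across the two machines, the timings above can be normalised to GCC. This is only a minimal Python sketch: the dictionary layout and the `relative_to` helper are illustrative names, not part of any benchmark code, and the missing gdc entry for the athlon64 machine is simply left out.

```python
# Timings in seconds, copied from the nbody results above.
timings = {
    "athlon64 x2 3800+ (N=20_000_000)": {"gcc": 10.8, "ldc": 14.2, "dmd": 15.5},
    "AMD 2500+ (N=10_000_000)": {"gcc": 8.78, "ldc": 12.26, "dmd": 13.9, "gdc": 9.82},
}

def relative_to(results, baseline="gcc"):
    """Return each compiler's time divided by the baseline compiler's time."""
    base = results[baseline]
    return {name: secs / base for name, secs in results.items()}

if __name__ == "__main__":
    for machine, results in timings.items():
        print(machine)
        # Sort from fastest to slowest relative to the baseline.
        for name, ratio in sorted(relative_to(results).items(), key=lambda kv: kv[1]):
            print(f"  {name}: {ratio:.2f}x")
```

With these numbers LDC comes out roughly 1.3 to 1.4 times slower than GCC on both machines, and DMD roughly 1.4 to 1.6 times slower.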

This benchmark measures FP computations; the fastest language for this naive physics simulation is Fortran 90, as can be seen in the later pages of the Shootout.
(I'd like to test one last one, the 'recursive' benchmark, but that's for later.)

Bye,
bearophile
December 14, 2008
(The other kind person on IRC was wilsonk.)
The timing results for the nbody benchmark (the code is attached to one of my previous posts), as found by wilsonk on IRC, N=10_000_000, on an AMD 2500+ CPU:
  64-bit GCC C code: 3.31 s
  64-bit LDC D code: 5.74 s

You can see the ratio is roughly comparable to the 32-bit one (about 1.7 versus about 1.4), but the absolute timings are much lower.
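
For anyone who wants to check the 32-bit versus 64-bit comparison, here is a small Python sketch using only the AMD 2500+ numbers posted in this thread. The `runs` table and the two helper functions are hypothetical names chosen for illustration.

```python
# GCC and LDC nbody timings in seconds (N=10_000_000, AMD 2500+),
# copied from the 32-bit and 64-bit results in this thread.
runs = {
    "32-bit": {"gcc": 8.78, "ldc": 12.26},
    "64-bit": {"gcc": 3.31, "ldc": 5.74},
}

def ldc_over_gcc(run):
    """How many times slower the LDC build is than the GCC build."""
    return run["ldc"] / run["gcc"]

def speedup(compiler):
    """How many times faster the 64-bit build is than the 32-bit one."""
    return runs["32-bit"][compiler] / runs["64-bit"][compiler]

if __name__ == "__main__":
    for mode, run in runs.items():
        print(f"{mode}: ldc/gcc = {ldc_over_gcc(run):.2f}")
    for compiler in ("gcc", "ldc"):
        print(f"{compiler}: 64-bit speedup = {speedup(compiler):.2f}x")
```

On these figures the GCC build gains about 2.65x from going to 64 bit and the LDC build about 2.14x, which is why the LDC/GCC gap is somewhat wider in the 64-bit run.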

------------------------

Then the timings for the recursive4 benchmark (the code is attached to this post):
On an AMD 2500+ CPU, by wilsonk, 64-bit timings, recursive4:
  C code GCC, N=13: 22.93 s
  D code LDC, N=13: 28.88 s


Timings by Elrood, recursive4 benchmark, on a 32-bit WinXP, AMD x2 3600 CPU:
  C code GCC, N=13: ~25 s
  D code LDC, N=13: >60 s

For this benchmark LLVM still seems to need some improvement :-)

Bye,
bearophile