Woeful performance of D compared to C++ (page 2)

Bill Lear wrote:
>
> The rand() call is definitely the most expensive. When I remove it
> from both the C++ and the D program, the times plummet (to 0.003
> and 0.013 seconds, respectively --- still, however, leaving the D
> program running in 4.3 times that of the C++ program;-).
>

Yes, but if you make it so that the C++ compiler can't so easily remove the loop, then they are the same :)

int main(int argc, char *argv[])
{
    unsigned char doors = 0;
    //const unsigned int n = 100000000;
    unsigned int n = argc > 1 ? atoi(argv[1]) : 10000000;

<IMHO, that's almost always a worthless optimization for "real-world" code and even "good" benchmarks :)>.

>
> Bill
> --
> Bill Lear
> r * e * @ * o * y * a * c * m
> * a * l * z * p * r * . * o *
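The snippet quoted above is cut off, so the following is only a sketch of the idea it describes: read the iteration count at run time so the optimizer cannot treat the loop bound as a compile-time constant. The names `Tally`, `iterations_from_args` and `simulate` are hypothetical and do not come from the thread; the loop body is modelled on the C++ program posted later in this thread.

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical result type: counts for both strategies.
struct Tally {
    unsigned int wins;            // won by staying with door 0
    unsigned int wins_switching;  // would have won by switching
};

// Hypothetical helper: take the iteration count from argv[1] so that it is
// only known at run time. Falls back to 10000000 when the argument is
// missing, not a number, zero, or too large for unsigned int.
unsigned int iterations_from_args(int argc, char* argv[]) {
    const unsigned int fallback = 10000000;
    if (argc < 2) return fallback;
    char* end = nullptr;
    const unsigned long parsed = std::strtoul(argv[1], &end, 10);
    if (end == argv[1] || parsed == 0 || parsed > UINT_MAX) return fallback;
    return static_cast<unsigned int>(parsed);
}

// Runs n rounds: the car is placed behind a random door and the player
// always picks door 0.
Tally simulate(unsigned int n) {
    Tally t = {0, 0};
    for (unsigned int i = 0; i < n; ++i) {
        const unsigned char doors =
            static_cast<unsigned char>(1u << (std::rand() % 3));
        if (doors & 1) {
            ++t.wins;
        } else {
            ++t.wins_switching;
        }
    }
    return t;
}
```

From `main(int argc, char* argv[])` this would be called as `simulate(iterations_from_args(argc, argv))`, with both counters printed afterwards so the loop has an observable result.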

Kirk McDonald <kirklin.mcdonald@gmail.com> writes:
> Bill Lear wrote:
> [elided stupidity]
>
> Here is the dmd.conf search path (as documented on
> http://www.digitalmars.com/d/dcompiler.html):
>
> 1. current working directory
> 2. $HOME
> 3. the directory the dmd executable is in
> 4. /etc/dmd.conf
>
> If you simply extracted the dmd archive into /opt, then it will find the dmd.conf file alongside the binary before it finds the one at /etc/dmd.conf. Either remove the one next to the binary or edit it.

I think I once knew this but somehow forgot. Score one for stupidity. Works perfectly. Thank you.

Bill
--
Bill Lear
r * e * @ * o * y * a * c * m * a * l * z * p * r * . * o *
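To make the lookup order concrete, here is a small sketch that lists the four candidate locations in the order quoted above and reports the first one that exists. It only illustrates the "first match wins" rule and is not how dmd itself is implemented; `dmd_conf_candidates` and `first_existing` are hypothetical names, and the directories are passed in by the caller.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: candidate dmd.conf paths in the documented order.
std::vector<std::string> dmd_conf_candidates(const std::string& cwd,
                                             const std::string& home,
                                             const std::string& dmd_dir) {
    return {
        cwd + "/dmd.conf",      // 1. current working directory
        home + "/dmd.conf",     // 2. $HOME
        dmd_dir + "/dmd.conf",  // 3. directory containing the dmd executable
        "/etc/dmd.conf",        // 4. system-wide file
    };
}

// Returns the first candidate that can be opened for reading, or an empty
// string if none of them exists.
std::string first_existing(const std::vector<std::string>& candidates) {
    for (const std::string& path : candidates) {
        std::ifstream file(path);
        if (file.good()) return path;
    }
    return "";
}
```

With dmd extracted into /opt, the third candidate sits next to the binary and is found before /etc/dmd.conf, which matches the behaviour described in the quoted post.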

January 18, 2007

Re: Woeful performance of D compared to C++

Posted by Sean Kelly
in reply to rael

I tried running these under Tango with DMD on Win32 (as it's the setup I currently have).  Here are my slightly altered programs to make the two a bit more comparable.  First, the D code:

import tango.stdc.stdlib;
import tango.stdc.stdio;

void main() {
    const uint n = 10_000_000;
    ubyte doors;
    uint wins, wins_switching;

    for (uint i; i < n; ++i) {
        doors |= cast(ubyte)(1 << rand() % 3);

        if (doors & 1) {
            ++wins;
        } else {
            ++wins_switching;
        }

        doors = 0;
    }

    // %u matches the unsigned (uint) counters.
    printf("Wins switching: %u [%f%%]\n", wins_switching,
             (wins_switching / cast(double) n) * 100);
    printf("Wins without switching: %u [%f%%]\n", wins,
             (wins / cast(double) n) * 100);
}

And now the C++ code:

#include <cstdlib>
#include <cstdio>

int main() {
    unsigned char doors = 0;
    const unsigned int n = 10000000;
    unsigned int wins = 0, wins_switching = 0;

    for (unsigned int i = 0; i < n; ++i) {
        unsigned char r = 1 << (rand() % 3);
        doors |= r; // place the car behind a random door

        if (doors & 1) { // choose zero'th door, same as random choice
            ++wins;
        } else {
            ++wins_switching;
        }

        doors ^= r; // zero the door with car
    }

    // %u matches the unsigned counters.
    printf("Wins switching: %u [%f%%]\n", wins_switching,
             (wins_switching / (double) n) * 100);
    printf("Wins without switching: %u [%f%%]\n", wins,
             (wins / (double) n) * 100);
}

C:> dmd -O -inline -release dtest
C:> dmc -o ctest.cpp

Here are the results for three runs of the D app:

Execution time: 1.323 s
Execution time: 1.005 s
Execution time: 1.125 s

And three runs of the C++ app:

Execution time: 1.149 s
Execution time: 1.202 s
Execution time: 1.304 s

The numbers above aren't quite as accurate as those using "time" on Unix, but they're sufficient for a rough comparison.  That said, DMD and DMC perform pretty much the same once the variable of IOStreams vs. writefln is removed.


Sean

Bill Lear wrote:
>
> The rand() call is definitely the most expensive. When I remove it
> from both the C++ and the D program, the times plummet (to 0.003
> and 0.013 seconds, respectively --- still, however, leaving the D
> program running in 4.3 times that of the C++ program;-).

With execution times that short, you're really comparing the startup time of a D application vs. a C++ application. And D application startup time includes the initialization of a garbage collector, in the default case. If you really wanted to compare apples to apples here I'd rip out the default GC and replace it with one that has no initialization cost.

Sean

Walter Bright wrote:
> Dave wrote:
>>
>> D's rand() is slow.
>
> True. C's rand() is fast, but is known to be not very random. As you pointed out, D users can use either as required.

Maybe there should be a randfast() in the standard lib? I imagine this confusion will come up again.

-Joel

Lionello Lunesu wrote:
> This might solve the performance in this case, but Walter, have you checked the thread "Why is this D code slower than C++" in digitalmars.D.learn?

The first thing I'd try is using DMD's built-in profiler:

dmd -profile test.d

Sean Kelly wrote:
> Bill Lear wrote:
>>
>> The rand() call is definitely the most expensive. When I remove it
>> from both the C++ and the D program, the times plummet (to 0.003
>> and 0.013 seconds, respectively --- still, however, leaving the D
>> program running in 4.3 times that of the C++ program;-).
>
> With execution times that short, you're really comparing the startup time of a D application vs. a C++ application. And D application startup time includes the initialization of a garbage collector, in the default case. If you really wanted to compare apples to apples here I'd rip out the default GC and replace it with one that has no initialization cost.

There are easier solutions to get better timings. See:

http://www.digitalmars.com/techtips/timing_code.html
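One common way to keep startup cost out of the measurement is to time only the loop from inside the program. The sketch below does that with std::chrono, which is newer than this thread and is not necessarily the technique the linked page describes. `time_loop_seconds` is a hypothetical name, and the loop body mirrors the C++ program posted earlier.

```cpp
#include <chrono>
#include <cstdlib>

// Runs the door loop n times and returns the elapsed wall-clock seconds for
// the loop alone, so process startup and runtime initialization are not
// included. The win count is written to wins_out so the loop has an
// observable effect.
double time_loop_seconds(unsigned int n, unsigned int& wins_out) {
    typedef std::chrono::steady_clock clock;
    unsigned int wins = 0;

    const clock::time_point start = clock::now();
    for (unsigned int i = 0; i < n; ++i) {
        const unsigned char doors =
            static_cast<unsigned char>(1u << (std::rand() % 3));
        if (doors & 1) ++wins;
    }
    const clock::time_point stop = clock::now();

    wins_out = wins;
    return std::chrono::duration<double>(stop - start).count();
}
```

For runs as short as the 0.003 and 0.013 seconds quoted above, repeating the measurement several times and comparing the fastest runs gives a steadier figure than a single sample.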

Walter Bright wrote:
> Lionello Lunesu wrote:
>> This might solve the performance in this case, but Walter, have you checked the thread "Why is this D code slower than C++" in digitalmars.D.learn?
>
> The first thing I'd try is using DMD's built-in profiler:
>
> dmd -profile test.d

Been done. The main thing it shows is that the Sphere.Intersect routine is a hotspot. The other hotspot is the big recursive Raytrace function itself, but that's not so useful without a line-by-line breakdown since basically everything happens inside there.

The D trace.log is at:
http://www.webpages.uidaho.edu/~shro8822/trace.log

The C++ log was attached to a post:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=5958

Though I'm not sure it's useful to compare them, because I think it was two different machines that ran the two.

--bb

Hi,

I've been investigating the performance of different programming languages and compilers using micro-benchmarks like the one posted in this thread. I observed that in many of them the library implementations matter much more than the language itself. Some of my results are posted here: http://pauloherrera.blogspot.com/

In the case of random number generators, the performance difference among implementations/algorithms in the same language can be orders of magnitude. I don't know why not all libraries implement the Mersenne Twister algorithm, which is considered the fastest and highest quality (most random).

Paulo

Walter Bright wrote:
> Dave wrote:
>>
>> D's rand() is slow.
>
> True. C's rand() is fast, but is known to be not very random. As you pointed out, D users can use either as required.

janderson wrote:
> Walter Bright wrote:
> > Dave wrote:
> >>
> >> D's rand() is slow.
> >
> > True. C's rand() is fast, but is known to be not very random. As you pointed out, D users can use either as required.
>
> Maybe there should be a randfast() in the standard lib? I imagine this confusion will come up again.
>
> -Joel

I recommend a built-in Mersenne Twister function, usually called mt_rand().

-- Jeff
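For comparison on the C++ side, later versions of the standard library (C++11 and up, so after this thread) ship a Mersenne Twister as std::mt19937. A minimal sketch of the door loop using it instead of rand() follows; `wins_without_switching` is a hypothetical name, and no claim is made here about how its speed compares with rand() or with D's generator.

```cpp
#include <random>

// Counts how often the car lands behind door 0 in n rounds, drawing the
// door from the standard library's Mersenne Twister instead of rand().
// A fixed seed makes the run repeatable.
unsigned int wins_without_switching(unsigned int n, unsigned int seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> door(0, 2);  // doors 0, 1 and 2

    unsigned int wins = 0;
    for (unsigned int i = 0; i < n; ++i) {
        if (door(gen) == 0) ++wins;
    }
    return wins;
}
```

Over many rounds the count should come out near one third of n, the same split the rand()-based programs report.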
