DMDScript vs. others
January 20, 2005
Tested the (in)famous 'sieve' on an older P3:


Results (times in secs):

2.55  vcpp
4.12  djgpp
8.36  dmc
8.83  lccW32
10.02  dmd
751.00  dms
2403.00  js



Speed relative to D language:

3.92941  vcpp
2.43204  djgpp
1.19856  dmc
1.13477  lccw32
1.00000  [*] dmd
0.01334  dms
0.00417  js



Remarks:

- Scripting-language results are extrapolated from 10 iterations;
don't expect me to wait 2400+ seconds.
- The old dscript version 1.02 (2002-11-30) was already
much faster than JS (it would have scored 981 secs).
- VC seems to like 'sieve'; its test results with
other apps were good, but not by such a huge margin.



More info:

JS (V5.6):
10 iterations, 1899 primes, elapsed time = 2403

dms (V1.03):
10 iterations, 1899 primes, elapsed time = 751

dmd (V0.111) [-O]:
10000 iterations, 1899 primes, elapsed time = 10.024

LccW32 (V3.3) [-O -p6]:
10000 iterations, 1899 primes, elapsed time = 8.833

dmc (V8.41) [-O -6]:
10000 iterations, 1899 primes, elapsed time = 8.362

DJGpp (gcc V3.43) [-O3 -march=pentium3]:
10000 iterations, 1899 primes, elapsed time = 4.12088

VCpp (V13.10.3077) [/O2 /G6]:
10000 iterations, 1899 primes, elapsed time = 2.553
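
The "speed relative to D" column above appears to be dmd's rounded time divided by each contender's rounded time, and the scripting times appear to be short runs scaled up linearly. A minimal sketch of that arithmetic, written for a current D2 compiler; the helper names `relativeSpeed` and `extrapolate` and the 0.751 s short-run figure are illustrative assumptions, not taken from the post:

```d
import std.stdio;

/// Speed relative to dmd: dmd's time divided by the contender's time.
/// Values above 1 mean the contender ran faster than dmd.
double relativeSpeed(double dmdSecs, double otherSecs)
{
    return dmdSecs / otherSecs;
}

/// Scale a short run to a longer one, assuming run time grows
/// linearly with the iteration count.
double extrapolate(double measuredSecs, int measuredIters, int targetIters)
{
    return measuredSecs * targetIters / measuredIters;
}

void main()
{
    // Rounded times (seconds) from the results table in the post.
    writefln("vcpp  %.5f", relativeSpeed(10.02, 2.55));
    writefln("djgpp %.5f", relativeSpeed(10.02, 4.12));
    writefln("dms   %.5f", relativeSpeed(10.02, 751.0));

    // Hypothetical short run: 10 iterations in 0.751 s, scaled to 10000.
    writefln("scaled: %.0f s", extrapolate(0.751, 10, 10_000));
}
```

Because the ratios are formed from the two-decimal times, they differ slightly from ratios computed from the full-precision times in the "More info" list.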


January 20, 2005
We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too.

L.
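
To make the "extra shifts/ands" concrete, here is a sketch of the same sieve with one byte per flag and with one bit per flag packed into 32-bit words. It is written for a current D2 compiler and packs the bits by hand, so it only approximates what the bit[] type of 2005 did internally; the function names are made up for this example:

```d
import std.stdio;

enum uint LIMIT = 8192;   // same flag range as the sieve.d listing in this thread

/// One byte per flag: each access is a plain indexed load or store.
int sieveBytes()
{
    auto flags = new ubyte[LIMIT + 1];
    flags[2 .. $] = 1;
    int count = 0;
    for (uint i = 2; i <= LIMIT; i++)
    {
        if (flags[i])
        {
            for (uint j = i + i; j <= LIMIT; j += i)
                flags[j] = 0;
            count++;
        }
    }
    return count;
}

/// One bit per flag: each access adds a shift to locate the word
/// and a shift plus mask to isolate the bit inside it.
int sieveBits()
{
    auto words = new uint[LIMIT / 32 + 1];
    words[] = uint.max;                 // mark every candidate as prime
    int count = 0;
    for (uint i = 2; i <= LIMIT; i++)
    {
        if ((words[i >> 5] >> (i & 31)) & 1)
        {
            for (uint j = i + i; j <= LIMIT; j += i)
                words[j >> 5] &= ~(1u << (j & 31));
            count++;
        }
    }
    return count;
}

void main()
{
    writefln("bytes: %d primes", sieveBytes());
    writefln("bits:  %d primes", sieveBits());
}
```

Both functions count the same primes; only the per-access work differs, which is the cost being asked about.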


January 20, 2005
Lionello Lunesu wrote:

> We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too.

I think you mean "wbit[]" ;-)

--anders
January 20, 2005
On Thu, 20 Jan 2005 18:12:50 +0000, Bob wrote:
> 
> Tested the (in)famous 'sieve' on an older P3:
<snip>
> dmd (V0.111) [-O]:
> 10000 iterations, 1899 primes, elapsed time = 10.024
> 
> LccW32 (V3.3) [-O -p6]:
> 10000 iterations, 1899 primes, elapsed time = 8.833
> 
> dmc (V8.41) [-O -6]:
> 10000 iterations, 1899 primes, elapsed time = 8.362
> 
> DJGpp (gcc V3.43) [-O3 -march=pentium3]:
> 10000 iterations, 1899 primes, elapsed time = 4.12088
> 
> VCpp (V13.10.3077) [/O2 /G6]:
> 10000 iterations, 1899 primes, elapsed time = 2.553
>

PIII, 800 MHz, Win2K

sieve (10000):
dmd v0.111 [-O -inline -release]: 1.67 secs.
VCpp v13.10.3077 [/O2 /G6]: 1.65 secs.

ary (50000):
dmd: 2.70 secs.
VCpp: 2.72 secs.

heapsort (1000000):
dmd: 2.37 secs.
VCpp: 2.15 secs.

DMD is very competitive. Used char[] for sieve.d because the C
version did (apples to apples). Also used -release to turn off array bounds
checking for D.

The C code came from: http://dada.perl.it/shootout/

;---
sieve.d:
;---
import std.string;
void main(char[][] args)
{
    int n = args.length > 1 ? atoi(args[1]) : 1;

    char flags[8192 + 1];
    int  count;

    while(n--) {
        count = 0;
        flags[2..length] = 1;
        for(int i = 2; i < flags.length; i++) {
            if(flags[i]) {
                // remove all multiples of prime: i
                for(int j = i + i; j < flags.length; j += i) flags[j] = 0;
                count++;
            }
        }
    }

    printf("Count: %d\n", count);
}
;---

;---
ary.d
;---
import std.string;
void main(char[][] args) {
    int n = args.length > 1 ? atoi(args[1]) : 1;

    int[] x = new int[n];
    int[] y = new int[n];

    for(int i = 0; i < n; i++) {
        x[i] = i + 1;
    }

    for(int k = 0; k < 1000; k++) {
        for(int i = n - 1; i >= 0; i--) {
            y[i] += x[i];
        }
    }

    printf("%d %d\n",y[0],y[y.length - 1]);
}
;---

;---
heapsort.d
;---
import std.string;

void main(char[][] args)
{
    int n = args.length > 1 ? atoi(args[1]) : 1;

    double[] ary;

    ary.length = n + 1;
    for(int i = 1; i <= n; i++) {
       ary[i] = gen_random(1);
    }

    heapsort(n, ary);

    printf("%.10g\n", ary[n]);
}

void heapsort(int n, double[] ra) {
    int i, j;
    int ir = n;
    int l = (n >> 1) + 1;
    double rra;

    for (;;) {
        if (l > 1) {
            rra = ra[--l];
        } else {
            rra = ra[ir];
            ra[ir] = ra[1];
            if (--ir == 1) {
                ra[1] = rra;
                return;
            }
        }
        i = l;
        j = l << 1;
        while (j <= ir) {
            if (j < ir && ra[j] < ra[j+1]) { ++j; }
            if (rra < ra[j]) {
                ra[i] = ra[j];
                j += (i = j);
            } else {
                j = ir + 1;
            }
        }
        ra[i] = rra;
    }
}

const int IM = 139968;
const int IA = 3877;
const int IC = 29573;

double gen_random(double max) {
    static int last = 42;
    return( max * (last = (last * IA + IC) % IM) / IM );
}

;---

- Dave

January 20, 2005
"Bob" <Bob_member@pathlink.com> wrote in message news:csosb2$cq8$1@digitaldaemon.com...
> dmd (V0.111) [-O]:
> 10000 iterations, 1899 primes, elapsed time = 10.024

For max speed on D apps, use [-O -release]. Without the -release, the array overflow checking is turned on!
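
As a rough illustration of why the check costs time in a tight loop, the sketch below spells out by hand what a bounds-checked read amounts to, next to the bare load that remains when the check is gone. It is written for a current D2 compiler; `checkedGet` and `uncheckedGet` are hypothetical helpers, not compiler internals, and the exact effect of -release may differ between compiler versions, so check the documentation for yours:

```d
import std.stdio;

/// Roughly what a bounds-checked read does: a compare and a branch
/// before the actual load.
int checkedGet(const(int)[] a, size_t i)
{
    if (i >= a.length)
        throw new Exception("index out of range");
    return a.ptr[i];
}

/// What is left once the check is removed: just the load.
/// Reading past the end here is undefined behaviour.
int uncheckedGet(const(int)[] a, size_t i)
{
    return a.ptr[i];
}

void main()
{
    int[] data = [10, 20, 30];
    writeln(checkedGet(data, 1));
    writeln(uncheckedGet(data, 2));

    try
    {
        checkedGet(data, 5);
    }
    catch (Exception e)
    {
        writeln("caught: ", e.msg);
    }
}
```

In the sieve's inner loop the array is touched on every step, so a compare and branch per access is a noticeable share of the work.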


January 21, 2005
Hey Bob,

I basically 'ported' this to C/++ by wrapping the sieve.ds script with main() and type'ing the variables.

On a P3, P4 or AMD64 the ratio of VC to DMC is about 1.2:1. I couldn't find a way to make DMC run as slow as posted - are you sure there isn't a typo somewhere?

- Dave

In article <csosb2$cq8$1@digitaldaemon.com>, Bob says...
>
>
>Tested the (in)famous 'sieve' on an older P3:
><snip>


January 21, 2005
No typo - the global variables are the cause.

Speed improves a lot when the variables are made local
to main(); in that case I can confirm your findings.
With global variables, however, dmd, dmc and lcc
are way behind vcpp. Only djgpp comes close to it.

I have posted this as a dmd bug in the bugs forum.
If you are interested, you can check out the
'sieve' code I used there.


Remark:
After checking the assembly listings, it turns out that
vcpp fully optimizes global variables, which
might or might not be desirable. Djgpp offers the best
compromise of code and speed, while still refraining from
keeping some global variables in CPU registers.
However, good code comes at a price: Djgpp
compiles far slower than any of the other compilers
mentioned, which makes development cycles a real test
of patience.
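
The difference between the two variants can be sketched as follows. This is not the code posted to the bugs forum, only an illustration written for a current D2 compiler. `__gshared` is used on the assumption that the 2005 test used plain process-wide globals (module-level variables in current D are thread-local by default), and all names are made up for the example:

```d
import std.stdio;
import std.datetime.stopwatch : StopWatch, AutoStart;

enum int SIZE = 8192;

// Global variant: flags, counter and loop indices all live in memory
// that any function could touch, so the compiler may reload and store
// them on every step instead of keeping them in registers.
__gshared ubyte[SIZE + 1] gFlags;
__gshared int gCount, gI, gJ;

void sieveGlobal()
{
    gCount = 0;
    gFlags[2 .. $] = 1;
    for (gI = 2; gI <= SIZE; gI++)
    {
        if (gFlags[gI])
        {
            for (gJ = gI + gI; gJ <= SIZE; gJ += gI)
                gFlags[gJ] = 0;
            gCount++;
        }
    }
}

// Local variant: the same loop with everything declared inside the function.
int sieveLocal()
{
    ubyte[SIZE + 1] flags;
    int count = 0;
    flags[2 .. $] = 1;
    for (int i = 2; i <= SIZE; i++)
    {
        if (flags[i])
        {
            for (int j = i + i; j <= SIZE; j += i)
                flags[j] = 0;
            count++;
        }
    }
    return count;
}

void main()
{
    enum iterations = 10_000;

    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. iterations)
        sieveGlobal();
    writefln("global: count=%d, %d ms", gCount, sw.peek.total!"msecs");

    sw.reset();
    int count;
    foreach (_; 0 .. iterations)
        count = sieveLocal();
    writefln("local:  count=%d, %d ms", count, sw.peek.total!"msecs");
}
```

How large the gap is depends on the compiler and its flags, so the timings printed here need not reproduce the ratios reported in this thread.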



In article <csps4d$1ji3$1@digitaldaemon.com>, Dave says...
>
>
>Hey Bob,
>
>I basically 'ported' this to C/++ by wrapping the sieve.ds script with main() and type'ing the variables.
>
>On a P3, P4 or AMD64 the ratio of VC to DMC is about 1.2:1. I couldn't find a way to make DMC run as slow as posted - are you sure there isn't a typo somewhere?
>
>- Dave


January 21, 2005
Thanks for the info. With "-release", dmd-compiled
programs get into the dmc league speed-wise.
Quite good for an alpha version - I am impressed!



In article <csp4pk$o45$1@digitaldaemon.com>, Walter says...
>
>
>"Bob" <Bob_member@pathlink.com> wrote in message news:csosb2$cq8$1@digitaldaemon.com...
>> dmd (V0.111) [-O]:
>> 10000 iterations, 1899 primes, elapsed time = 10.024
>
>For max speed on D apps, use [-O -release]. Without the -release, the array overflow checking is turned on!
>
>


January 21, 2005
The whole thing was tested using a char[] flag array, because bit[] arrays are indeed slower - you guessed right.



In article <csosm0$d8d$1@digitaldaemon.com>, Lionello Lunesu says...
>
>We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too.
>
>L.
>
>


January 21, 2005
Bob wrote:

>>We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too.
>
> The whole thing was tested using a char[] flag array,
> because bit[] arrays are indeed slower - you were right
> guessing that.

So bit[] is not only buggy (append), but also slower ? :-)

Of course, it does save memory for large bit arrays...
(while small [0|1|2|3] bit arrays are actually larger)
Just wondering if it's really worth it, especially since
it seems to come at the expense of getting a boolean type.
But since "the fundamental data type is the bit"... <sic>
and bit[] is touted as a main D feature, I guess it stays.

http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/12625:
> Henceforth, byte/char shall be known as a "wbit" when used as a bool
> and int/long shall similarly be known as a "dbit" when used as a bool.

--anders

PS. "char" and "long" are the C types, known as "byte" and "int" in D.
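
The memory remark above (large bit arrays save space, very small ones do not) can be checked with a little arithmetic. The sketch assumes bit arrays are stored in whole 32-bit words, which is an assumption about the implementation rather than something stated in this thread; the helper names are hypothetical:

```d
import std.stdio;

/// Bytes needed for n flags stored one per byte.
size_t byteArrayBytes(size_t n)
{
    return n;
}

/// Bytes needed for n flags packed one per bit, rounded up to
/// whole 32-bit words (assumed storage unit).
size_t bitArrayBytes(size_t n)
{
    return (n + 31) / 32 * 4;
}

void main()
{
    // A few tiny sizes plus the flag count used by the sieve listing.
    foreach (size_t n; [1, 2, 3, 4, 32, 8193])
        writefln("%5d flags: byte[] %5d bytes, packed bits %5d bytes",
                 n, byteArrayBytes(n), bitArrayBytes(n));
}
```

Under that assumption, 1 to 3 flags take a full 4-byte word when packed, while the 8193 flags of the sieve shrink to about an eighth of the byte-array size.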