January 20, 2005 DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Tested the (in)famous 'sieve' on an older P3:

Results (times in secs):

2.55 vcpp
4.12 djgpp
8.36 dmc
8.83 lccW32
10.02 dmd
751.00 dms
2403.00 js

Speed relative to D language:

3.92941 vcpp
2.43204 djgpp
1.19856 dmc
1.13477 lccw32
1.00000 [*] dmd
0.01334 dms
0.00417 js

Remarks:

- Scripting languages results extrapolated, don't expect me to wait 2400+ seconds.
- The old dscript version 1.02 (2002-11-30) was already much faster than JS (it would have scored 981 secs).
- VC seems to like 'sieve', its test results with other apps were good but not by such a huge margin.

More info:

JS (V5.6):
10 iterations, 1899 primes, elapsed time = 2403

dms (V1.03):
10 iterations, 1899 primes, elapsed time = 751

dmd (V0.111) [-O]:
10000 iterations, 1899 primes, elapsed time = 10.024

LccW32 (V3.3) [-O -p6]:
10000 iterations, 1899 primes, elapsed time = 8.833

dmc (V8.41) [-O -6]:
10000 iterations, 1899 primes, elapsed time = 8.362

DJGpp (gcc V3.43) [-O3 -march=pentium3]:
10000 iterations, 1899 primes, elapsed time = 4.12088

VCpp (V13.10.3077) [/O2 /G6]:
10000 iterations, 1899 primes, elapsed time = 2.553
|
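Editorial note: the "speed relative to D" column in the post above appears to be the dmd time divided by each compiler's time, using the rounded figures (10.02 / 2.55 = 3.92941). A minimal C sketch of that arithmetic follows. The function names are made up for illustration, and the linear-scaling assumption behind the extrapolation helper is a guess at how the scripting-language figures were obtained; the post does not give the short-run times that were actually measured.

```c
#include <stdio.h>

/* Speed relative to a baseline: baseline_secs / run_secs.
   Values above 1 mean faster than the baseline, below 1 slower. */
double relative_speed(double baseline_secs, double run_secs)
{
    return baseline_secs / run_secs;
}

/* Scale a short run up to a target iteration count.
   Assumption: run time grows linearly with the iteration count. */
double extrapolate_secs(double measured_secs, long measured_iters,
                        long target_iters)
{
    return measured_secs * (double)target_iters / (double)measured_iters;
}

/* Print one row in the same layout as the posted table. */
void print_row(const char *name, double baseline_secs, double run_secs)
{
    printf("%.5f %s\n", relative_speed(baseline_secs, run_secs), name);
}
```

For example, `print_row("vcpp", 10.02, 2.55)` prints `3.92941 vcpp`, matching the first row of the table. Under the linear assumption, a hypothetical 0.751 s for 10 iterations would scale to about 751 s for 10000.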
January 20, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bob | We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too. L. |
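Editorial note: the "extra shifts/ands" mentioned above can be made concrete with a small C sketch. It is an illustration of the general technique only, not the code any of the posters ran and not how dmd implements bit[]. Both functions count primes up to `limit`; the first stores one flag per byte, the second packs 32 flags into each word and so needs a shift and a mask on every access.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One byte per flag: testing or clearing a flag is a single load or store. */
int sieve_bytes(int limit)
{
    unsigned char *flags = malloc((size_t)limit + 1);
    int count = 0;
    if (!flags)
        return -1;
    memset(flags, 1, (size_t)limit + 1);
    for (int i = 2; i <= limit; i++) {
        if (flags[i]) {
            for (int j = i + i; j <= limit; j += i)
                flags[j] = 0;
            count++;
        }
    }
    free(flags);
    return count;
}

/* One bit per flag: each access computes a word index (i >> 5)
   and a bit position (i & 31) before it can test or clear the bit. */
int sieve_bits(int limit)
{
    size_t words = ((size_t)limit + 32) / 32;  /* covers bits 0..limit */
    uint32_t *flags = malloc(words * sizeof *flags);
    int count = 0;
    if (!flags)
        return -1;
    memset(flags, 0xFF, words * sizeof *flags);
    for (int i = 2; i <= limit; i++) {
        if ((flags[i >> 5] >> (i & 31)) & 1u) {
            for (int j = i + i; j <= limit; j += i)
                flags[j >> 5] &= ~(UINT32_C(1) << (j & 31));
            count++;
        }
    }
    free(flags);
    return count;
}
```

The bit version uses an eighth of the memory but does a read-modify-write for every cleared flag, which is the trade-off being discussed in this thread.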
January 20, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Lionello Lunesu | Lionello Lunesu wrote:
> We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too.
I think you mean "wbit[]" ;-)
--anders
|
January 20, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bob | On Thu, 20 Jan 2005 18:12:50 +0000, Bob wrote:
>
> Tested the (in)famous 'sieve' on an older P3:

<snip>

> dmd (V0.111) [-O]:
> 10000 iterations, 1899 primes, elapsed time = 10.024
>
> LccW32 (V3.3) [-O -p6]:
> 10000 iterations, 1899 primes, elapsed time = 8.833
>
> dmc (V8.41) [-O -6]:
> 10000 iterations, 1899 primes, elapsed time = 8.362
>
> DJGpp (gcc V3.43) [-O3 -march=pentium3]:
> 10000 iterations, 1899 primes, elapsed time = 4.12088
>
> VCpp (V13.10.3077) [/O2 /G6]:
> 10000 iterations, 1899 primes, elapsed time = 2.553
>

PIII, 800 MHz, Win2K

sieve (10000):
dmd v0.111 [-O -inline -release]: 1.67 secs.
VCpp v13.10.3077 [/O2 /G6]: 1.65 secs.

ary (50000):
dmd: 2.70 secs.
VCpp: 2.72 secs.

heapsort (1000000):
dmd: 2.37 secs.
VCpp: 2.15 secs.

DMD is very competitive.

Used char[] for sieve.d because the C version did (apples to apples). Also used -release to turn off array bound checking for D.

The C code came from: http://dada.perl.it/shootout/

;---
sieve.d:
;---
import std.string;

void main(char[][] args)
{
    // number of repetitions, taken from the command line (default 1)
    int n = args.length > 1 ? atoi(args[1]) : 1;
    char flags[8192 + 1];
    int count;

    while(n--)
    {
        count = 0;
        flags[2..length] = 1;
        for(int i = 2; i < flags.length; i++)
        {
            if(flags[i])
            {
                // remove all multiples of prime: i
                for(int j = i + i; j < flags.length; j += i)
                    flags[j] = 0;
                count++;
            }
        }
    }
    printf("Count: %d\n", count);
}
;---

;---
ary.d
;---
import std.string;

void main(char[][] args)
{
    // array length, taken from the command line (default 1)
    int n = args.length > 1 ? atoi(args[1]) : 1;
    int[] x = new int[n];
    int[] y = new int[n];

    for(int i = 0; i < n; i++)
    {
        x[i] = i + 1;
    }
    // add x into y 1000 times, walking the arrays backwards
    for(int k = 0; k < 1000; k++)
    {
        for(int i = n - 1; i >= 0; i--)
        {
            y[i] += x[i];
        }
    }
    printf("%d %d\n",y[0],y[y.length - 1]);
}
;---

;---
heapsort.d
;---
import std.string;

void main(char[][] args)
{
    int n = args.length > 1 ? atoi(args[1]) : 1;
    double[] ary;

    // elements are stored at indices 1..n; index 0 is unused
    ary.length = n + 1;
    for(int i = 1; i <= n; i++)
    {
        ary[i] = gen_random(1);
    }
    heapsort(n, ary);
    printf("%.10g\n", ary[n]);
}

void heapsort(int n, double[] ra)
{
    int i, j;
    int ir = n;
    int l = (n >> 1) + 1;
    double rra;

    for (;;)
    {
        if (l > 1)
        {
            rra = ra[--l];
        }
        else
        {
            rra = ra[ir];
            ra[ir] = ra[1];
            if (--ir == 1)
            {
                ra[1] = rra;
                return;
            }
        }
        i = l;
        j = l << 1;
        while (j <= ir)
        {
            if (j < ir && ra[j] < ra[j+1])
            {
                ++j;
            }
            if (rra < ra[j])
            {
                ra[i] = ra[j];
                j += (i = j);
            }
            else
            {
                j = ir + 1;
            }
        }
        ra[i] = rra;
    }
}

const int IM = 139968;
const int IA = 3877;
const int IC = 29573;

// linear congruential generator used by the shootout benchmarks
double gen_random(double max)
{
    static int last = 42;
    return( max * (last = (last * IA + IC) % IM) / IM );
}
;---

- Dave
|
January 20, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bob | "Bob" <Bob_member@pathlink.com> wrote in message news:csosb2$cq8$1@digitaldaemon.com...
> dmd (V0.111) [-O]:
> 10000 iterations, 1899 primes, elapsed time = 10.024

For max speed on D apps, use [-O -release]. Without the -release, the array overflow checking is turned on!
|
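Editorial note: to show why array overflow checking costs time in a tight loop, here is a rough C sketch of what a checked access amounts to. It is an illustration under stated assumptions, not dmd's actual generated code; the helper names `checked_get`, `sum_checked` and `sum_unchecked` are hypothetical. The checked version performs a compare and branch before every load, which the unchecked version omits.

```c
#include <stdio.h>
#include <stdlib.h>

/* Checked read: one compare-and-branch per access before the load. */
int checked_get(const int *a, size_t len, size_t i)
{
    if (i >= len) {
        fprintf(stderr, "array index %lu out of bounds (length %lu)\n",
                (unsigned long)i, (unsigned long)len);
        abort();
    }
    return a[i];
}

/* Sum with a bounds check on every element access. */
long sum_checked(const int *a, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += checked_get(a, len, i);
    return total;
}

/* Same sum with the check removed. */
long sum_unchecked(const int *a, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += a[i];
    return total;
}
```

Both functions return the same result for valid input; only the per-access work differs. In a loop as short as the sieve's inner loop, that extra work can be a noticeable share of the total.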
January 21, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bob | Hey Bob,

I basically 'ported' this to C/++ by wrapping the sieve.ds script with main() and type'ing the variables.

On a P3, P4 or AMD64 the ratio of VC to DMC is about 1.2:1. I couldn't find a way to make DMC run as slow as posted - are you sure there isn't a typo somewhere?

- Dave

In article <csosb2$cq8$1@digitaldaemon.com>, Bob says...
>
>Tested the (in)famous 'sieve' on an older P3:
>
>Results (times in secs):
>
>2.55 vcpp
>4.12 djgpp
>8.36 dmc
>8.83 lccW32
>10.02 dmd
>751.00 dms
>2403.00 js
>
>Speed relative to D language:
>
>3.92941 vcpp
>2.43204 djgpp
>1.19856 dmc
>1.13477 lccw32
>1.00000 [*] dmd
>0.01334 dms
>0.00417 js
>
>Remarks:
>
>- Scripting languages results extrapolated,
>don't expect me to wait 2400+ seconds.
>- The old dscript version 1.02 (2002-11-30) was already
>much faster than JS (it would have scored 981 secs).
>- VC seems to like 'sieve', its test results with
>other apps were good but not by such a huge margin.
>
>More info:
>
>JS (V5.6):
>10 iterations, 1899 primes, elapsed time = 2403
>
>dms (V1.03):
>10 iterations, 1899 primes, elapsed time = 751
>
>dmd (V0.111) [-O]:
>10000 iterations, 1899 primes, elapsed time = 10.024
>
>LccW32 (V3.3) [-O -p6]:
>10000 iterations, 1899 primes, elapsed time = 8.833
>
>dmc (V8.41) [-O -6]:
>10000 iterations, 1899 primes, elapsed time = 8.362
>
>DJGpp (gcc V3.43) [-O3 -march=pentium3]:
>10000 iterations, 1899 primes, elapsed time = 4.12088
>
>VCpp (V13.10.3077) [/O2 /G6]:
>10000 iterations, 1899 primes, elapsed time = 2.553
|
January 21, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dave | No typo - just global variables are the cause. Speed gets much better if vars are being made local to main(). In this case I can confirm your findings.

In case of 'globalisation', however, dmd, dmc and lcc are way behind vcpp. Just djgpp gets in its vicinity.

I have posted a bug in dmd to the bugs forum. If you are interested, you might check out the 'sieve' coding I have used from there.

Remark: After checking assembly listings it turns out that vcpp does full optimization on global vars, which might or might not be desirable. Djgpp has the best compromise of code and speed, still refraining from keeping some global variables in the CPU registers. However, good code comes at a price: Djgpp is compiling way slower than any of the other compilers mentioned, thus making development cycles a real test for patience.

In article <csps4d$1ji3$1@digitaldaemon.com>, Dave says...
>
>Hey Bob,
>
>I basically 'ported' this to C/++ by wrapping the sieve.ds script with main() and type'ing the variables.
>
>On a P3, P4 or AMD64 the ratio of VC to DMC is about 1.2:1. I couldn't find a way to make DMC run as slow as posted - are you sure there isn't a typo somewhere?
>
>- Dave
>
>In article <csosb2$cq8$1@digitaldaemon.com>, Bob says...
>>
>>Tested the (in)famous 'sieve' on an older P3:
>>
>>Results (times in secs):
>>
>>2.55 vcpp
>>4.12 djgpp
>>8.36 dmc
>>8.83 lccW32
>>10.02 dmd
>>751.00 dms
>>2403.00 js
>>
>>Speed relative to D language:
>>
>>3.92941 vcpp
>>2.43204 djgpp
>>1.19856 dmc
>>1.13477 lccw32
>>1.00000 [*] dmd
>>0.01334 dms
>>0.00417 js
>>
>>Remarks:
>>
>>- Scripting languages results extrapolated,
>>don't expect me to wait 2400+ seconds.
>>- The old dscript version 1.02 (2002-11-30) was already
>>much faster than JS (it would have scored 981 secs).
>>- VC seems to like 'sieve', its test results with
>>other apps were good but not by such a huge margin.
>>
>>More info:
>>
>>JS (V5.6):
>>10 iterations, 1899 primes, elapsed time = 2403
>>
>>dms (V1.03):
>>10 iterations, 1899 primes, elapsed time = 751
>>
>>dmd (V0.111) [-O]:
>>10000 iterations, 1899 primes, elapsed time = 10.024
>>
>>LccW32 (V3.3) [-O -p6]:
>>10000 iterations, 1899 primes, elapsed time = 8.833
>>
>>dmc (V8.41) [-O -6]:
>>10000 iterations, 1899 primes, elapsed time = 8.362
>>
>>DJGpp (gcc V3.43) [-O3 -march=pentium3]:
>>10000 iterations, 1899 primes, elapsed time = 4.12088
>>
>>VCpp (V13.10.3077) [/O2 /G6]:
>>10000 iterations, 1899 primes, elapsed time = 2.553
|
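Editorial note: the global-versus-local difference described above can be tried with a pair of functions like the following C sketch. The code Bob actually used is in the bugs forum and is not shown in this thread, so this assumes the classic BYTE-magazine form of the sieve (array size 8190, odd numbers only), which is the variant known to report 1899 primes. All identifiers here are made up for illustration.

```c
#define SIEVE_SIZE 8190

/* File-scope variables have a fixed memory location. Whether the
   optimizer keeps them in registers inside the loops is up to the
   compiler; per the post above, only vcpp did so fully. */
char g_flags[SIEVE_SIZE + 1];
int g_i, g_k, g_prime, g_count;

/* Sieve with every loop variable and the flag array at file scope. */
int sieve_globals(void)
{
    g_count = 0;
    for (g_i = 0; g_i <= SIEVE_SIZE; g_i++)
        g_flags[g_i] = 1;
    for (g_i = 0; g_i <= SIEVE_SIZE; g_i++) {
        if (g_flags[g_i]) {
            g_prime = g_i + g_i + 3;   /* index i stands for the odd number 2i+3 */
            for (g_k = g_i + g_prime; g_k <= SIEVE_SIZE; g_k += g_prime)
                g_flags[g_k] = 0;
            g_count++;
        }
    }
    return g_count;
}

/* Same algorithm with everything local, so the loop variables are
   easy candidates for register allocation. */
int sieve_locals(void)
{
    char flags[SIEVE_SIZE + 1];
    int i, k, prime, count = 0;

    for (i = 0; i <= SIEVE_SIZE; i++)
        flags[i] = 1;
    for (i = 0; i <= SIEVE_SIZE; i++) {
        if (flags[i]) {
            prime = i + i + 3;
            for (k = i + prime; k <= SIEVE_SIZE; k += prime)
                flags[k] = 0;
            count++;
        }
    }
    return count;
}
```

Both functions return 1899. Timing each in a loop of 10000 calls and comparing the generated assembly is one way to reproduce the comparison described in the post, though results will depend heavily on the compiler and its flags.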
January 21, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | Thanks for your info. With "-release" dmd compiled programs are getting speedwise into the dmc league. Quite good for an alpha version - I am impressed!

In article <csp4pk$o45$1@digitaldaemon.com>, Walter says...
>
>"Bob" <Bob_member@pathlink.com> wrote in message news:csosb2$cq8$1@digitaldaemon.com...
>> dmd (V0.111) [-O]:
>> 10000 iterations, 1899 primes, elapsed time = 10.024
>
>For max speed on D apps, use [-O -release]. Without the -release, the array overflow checking is turned on!
|
January 21, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Lionello Lunesu | The whole thing was tested using a char[] flag array, because bit[] arrays are indeed slower - you were right guessing that.

In article <csosm0$d8d$1@digitaldaemon.com>, Lionello Lunesu says...
>
>We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too.
>
>L.
|
January 21, 2005 Re: DMDScript vs. others | ||||
---|---|---|---|---|
| ||||
Posted in reply to Bob | Bob wrote:
>>We should check if dmd is the only one using a real bit-array for that 'flags' variable. If so, it suffers from extra shifts/ands. A byte[] should be tested too.
>
> The whole thing was tested using a char[] flag array,
> because bit[] arrays are indeed slower - you were right
> guessing that.

So bit[] is not only buggy (append), but also slower ? :-)

Of course, it does save memory for large bit arrays...
(while small [0|1|2|3] bit arrays are actually larger)

Just wondering if it's really worth it, especially since
it seems to come at the expense of getting a boolean type.

But since "the fundamental data type is the bit"... <sic>
and bit[] is touted as a main D feature, I guess it stays.

http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/12625:
> Henceforth, byte/char shall be known as a "wbit" when used as a bool
> and int/long shall similarly be known as a "dbit" when used as a bool.

--anders

PS. "char" and "long" are the C types, known as "byte" and "int" in D.
|
Copyright © 1999-2021 by the D Language Foundation