May 04, 2003
The main reason I published the Venus library at this time is to be able to discuss certain benchmark results.

The positive message is that bit arrays are implemented very efficiently. I compared the sieve of Erathostenes using char [] and bit [] and found little difference:

                           usec
elements          char []          bit []
10000                 382             511
100000               8265            5647
1000000            234187           96923
10000000          3669514         4839531

At a certain size - corresponding to the processor cache size -
the bit version is even faster than the char version!
(I usee a 750 MHz Athlon)

The negative message is that the gc is a desaster. I first
noticed a long delay after printing results and before the
program really ended and added gc timings (10 M elements):

  BenchSieveArray   :   3669514.239 usec
  gc.fullCollect 3000981.411 usec
  BenchSieveBitArray:   4839531.336 usec
  gc.fullCollect 3188625.354 usec

This means that collecting the single large blocks of memory takes almost as long as doing the primes!

============== code mentioned

/*
** t_sieve.d
** Copyright (C) 2003 Helmut Leitner
** given to public domain
*/

import venus.all;

//const int BENCHSIEVESIZE=8192;
const int BENCHSIEVESIZE=10000;

char flags[];

void BenchSieveArray()
{
    int iter;
    int i;
    int k;
    int count=0;
    int prime;
    int j;
    int limit=BENCHSIEVESIZE-1;
    flags=new char[BENCHSIEVESIZE+1];

    i=0;
    for(i=1; i<=limit; i++) flags[i]=1;
    for(i=1; i<=limit; i++) {
        if(flags[i]) {
           prime=i+i+1;
           count=count+1;
           k=i+prime;
           if(k > limit) continue;
           for(j=k; j<=limit; j+=prime) flags[j]=0;
        }
    }
    //printf("count=%d\n",count);
}

bit bflags[];

void BenchSieveBitArray()
{
    int iter;
    int i;
    int k;
    int count=0;
    int prime;
    int j;
    int limit=BENCHSIEVESIZE-1;

    bflags=new bit[BENCHSIEVESIZE+1];

    i=0;
    for(i=1; i<=limit; i++) bflags[i]=1;
    for(i=1; i<=limit; i++) {
        if(bflags[i]) {
           prime=i+i+1;
           count=count+1;
           k=i+prime;
           if(k > limit) continue;
           for(j=k; j<=limit; j+=prime) bflags[j]=0;
        }
    }
    //printf("count=%d\n",count);
}

int main (char [][] args)
{
    FpBenchPrint(BenchSieveArray   ,"BenchSieveArray   :  ");
    DelegateBenchPrint( delegate void () { gc.fullCollect(); },"gc.fullCollect");

    FpBenchPrint(BenchSieveBitArray,"BenchSieveBitArray:  ");
    DelegateBenchPrint( delegate void () { gc.fullCollect(); },"gc.fullCollect");

    return 0;
}

=================

--
Helmut Leitner    leitner@hls.via.at Graz, Austria   www.hls-software.com
May 04, 2003
"Helmut Leitner" <helmut.leitner@chello.at> wrote in message news:3EB5770D.7962738E@chello.at...
> The negative message is that the gc is a desaster. I first noticed a long delay after printing results and before the program really ended and added gc timings (10 M elements):

The gc can certainly be better implemented.