December 02, 2005
Manfred Nowak wrote:
> Sean Kelly wrote:
> 
> [...]
>> the King James Bible from Project Gutenberg is my standard
>> test input
> [...]
> 
The standard wc example spends about 100 ms on the AA operations for the King James Bible. What are your criteria for performance, if this is indeed the largest available test data?

I was cleaning up some directories about a week ago and lost my AA test in the process.  I'll rewrite something when I find some free time.  But the metrics I was comparing were memory usage, cache misses, and test execution time, with execution time obviously being the most important factor.  I didn't make any attempt to reduce or eliminate spurious memory allocations, though when I re-do the tests I'm going to use a Slice class I wrote in C++ instead of std::string.


Sean
December 02, 2005
Manfred Nowak wrote:
> Sean Kelly wrote:
> 
> [...]
>> (as it's the largest non-random text file I could
>> find). 
> 
> I do not understand, why randomness is not suitable for general usage.

I wanted to see results given a real-world example.  Using a massive dataset of randomly generated gibberish, i.e. "jkhfdsa jksvldijsd sajasd d asdlk wpejvccx..." seems like it would skew results slightly by reducing the chance of collisions.  I suppose a fancier method might use a Markov generator to create arbitrarily large pseudo-English text to process, but I was only testing casually and didn't want to spend the time writing the code.
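A word-level Markov generator of the kind mentioned here can be sketched in a few lines. This Python version is illustrative only; the function names and the order-1 default are my own choices, not anything from the thread:

```python
import random

def markov_table(words, order=1):
    """Build a next-word table from a sample text (list of words)."""
    table = {}
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        table.setdefault(key, []).append(words[i + order])
    return table

def generate(table, n, seed=None):
    """Emit n words of pseudo-text by walking the table."""
    rng = random.Random(seed)
    key = rng.choice(list(table))
    out = list(key)
    for _ in range(n - len(key)):
        choices = table.get(key)
        if not choices:                      # dead end: restart the chain
            key = rng.choice(list(table))
            choices = table[key]
        nxt = rng.choice(choices)
        out.append(nxt)
        key = key[1:] + (nxt,)
    return " ".join(out)
```

Fed a real corpus, the output preserves word frequencies and local word order, so the key distribution an AA sees stays close to natural language.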


Sean
December 02, 2005
Manfred Nowak wrote:
> 
> 3. According to the answers in the thread my implementation currently is not suitable for general use, because the abstract problem I used was restricted to fixed length keys. I am experimenting with an appropriate extension.

My unordered_set implementation doesn't support rehashing yet--I wrote it for use in a compiler and didn't have the time to bother with it.  If you have a succinct test example I could run it on I'll give it a shot, but it'll have a bit of an unfair advantage until I get around to finishing it.


Sean
December 02, 2005
In article <Xns9720E101439BFsvv1999hotmailcom@63.105.9.61>, Manfred Nowak says...
>
>Dave wrote:
>
>[...]
>> Please post your results too.
>
>Preliminary results after some tuning of the parameters of my scalable implementation:
>
>size  ins      acc      t.W     t.me    gain
>4     16       9999984  1338    1137    -15.02%
>7     130      9999870  1345    1136    -15.54%
>10    1000     9999000  1395    1147    -17.78%
>14    16000    9984000  1554    1140    -26.64%
>17    130000   9870000  2092    1256    -39.96%
>20    970000   9030000  3354    1763    -47.44%
>22    2400000  7600000  7367    2183    -70.37%
>24    4900000  5100000  21227   3282    -84.54%
>26    8000000  2000000  58396   15691   -73.13%
>27    8900000  1100000  108875  51929   -52.30%
>28    9400000  600000   112160  193036  +72.11%*
>
>legend:
>size:= binary logarithm of the size of the abstract problem
>ins:= number of insertions into the AA
>acc:= number of accesses to elements of the AA
>t.W:= total time in ms needed by Walter's implementation
>t.me:= total time in ms needed by my implementation
>gain:= relative time gain after subtracting 770 ms, which is the
>time needed with no accesses to the AA
>*:= data insufficient
>
>comments:
>
>1. My implementation uses more space than Walter's. Therefore the time gain starts to stall at about 5M elements inserted, and it collapses when about 9M elements are inserted, given the 2GB RAM of my machine.
>
>2. One try with an extreme setting of the implementation's parameters hinted that a reduction to about 15% of the time of Walter's implementation might be possible for up to 5M elements, but dmd needed nearly an hour to produce the executable for that one try :-(
>
>3. According to the answers in the thread my implementation currently is not suitable for general use, because the abstract problem I used was restricted to fixed length keys. I am experimenting with an appropriate extension.
>
>-manfred

What D implementation (DMD or GDC) and version?

Thanks,

- Dave


December 02, 2005
Sean Kelly wrote:

[...]
> If you have a succinct test example I could run
> it on I'll give it a shot, but it'll have a bit of an unfair
> advantage until I get around to finishing it.

I don't understand you fully. If you mean to give your implementation a try on my framework, here it is:

-manfred

/+
   The abstract problem asks for the time needed for n value
   changes for keys somehow randomly chosen out of
   the range [ 0 .. s-1].

   The abstract problem has two parameters n and s.

   n is the total number of operations on the AA.
   Initially I set it so that the resulting total time
   was within a reasonable limit, i.e. 60 s. But it turned out
   to be also 10% more than the number of elements that fit into
   the RAM of the target machine. Therefore the latter approximation
   should be chosen.

   s is the size of the space from which the keys are drawn. It is
   reasonable to choose s to be a power of 2.

   The "somehow" in the description above
   reduces the randomness of the draw according to the
   well known 80:20 rule, i.e. 80% of the operations are executed
   on 20% of the key space.

   Compilation parameters:
   -version=a
     run will give the time needed without any accesses to AA
   -version=aaa
     run will give the time needed for
     the current implementation of the AA
   -version=aa
     run will give the time needed for
     the candidate
   -version=aa -debug=asrt
     will test the candidate's correctness
     against the current implementation
+/
import std.perf;    // HighPerformanceCounter
import std.random;  // rand()
import std.stdio;   // writef, writefln

void main(){
  alias HighPerformanceCounter Timer;
  Timer time;

  time= new Timer;
  const uint ttotal= 64*1024*1024, // Problem size s
             tsmall= ttotal/ 5;    // i.e. the target space
  const uint ntotal= 10_000_000,   // total number of operations n
             nsmall= ntotal/4,
             nlarge= ntotal-nsmall;
  time.start();
    version( aaa){
      uint[ uint] aaa;
      uint r= void;
      for( int i= 1; i<= nlarge; i++){
        r= rand();
        aaa[ (r)%tsmall]= i;
      }
      for( int i= 1; i<= nsmall; i++){
        r= rand();
        aaa[ (r)%ttotal]= i;
      }
      writef( aaa.length, " ");
    } else {
      version( aa){
        AA aa= new AA;
        debug(asrt) uint[ uint] a;
        uint r= void;
        for( int i= 1; i<= nlarge; i++){
          r= rand();
          aa.insert( (r)%tsmall, i);
          debug(asrt) a[ (r)%tsmall]= i;
          debug assert( aa.retrieveRval( (r)%tsmall) == i);
          debug {
            writef( a.length, " ", aa.length, " ", (r)%tsmall);
            for( int d= aa.depth; d>0; d--)
              writef( " ", aa.position( (r)%tsmall, d));
            writefln( "");
          }
          debug assert( a.length == aa.length);
          debug foreach( uint k, uint v; a){
            assert( aa.retrieveRval( k) == v);
          }
        }
        for( int i= 1; i<= nsmall; i++){
          r= rand();
          aa.insert( (r)%ttotal, i);
          debug(asrt) a[ (r)%ttotal]= i;
          debug assert( aa.retrieveRval( (r)%ttotal) == i);
          debug assert( a.length == aa.length());
          debug foreach( uint k, uint v; a){
            assert( aa.retrieveRval( k) == v);
          }
        }
        debug(asrt) assert( a.length == aa.length());
        debug(asrt) foreach( uint k, uint v; a){
          assert( aa.retrieveRval( k) == v);
        }
        writef( aa.length(), " ");
      } else {
        // version=a baseline: draw the keys but perform no AA
        // operations, so this run measures everything except the AA
        uint r= void;
        for( int i= 1; i<= nlarge; i++){
          r= rand();
          ( (r)%tsmall, i);
        }
        for( int i= 1; i<= nsmall; i++){
          r= rand();
          ( (r)%ttotal, i);
        }
      }
    }
  time.stop();
  writefln( time.milliseconds());
}
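For reference, the 80:20 draw described in the header comment (80% of operations on 20% of the key space) can be sketched outside D as well. Note the listing above actually routes 3/4 of the operations to tsmall and 1/4 to the full space; this Python sketch uses the comment's 80:20 figures and is illustrative only:

```python
import random

def draw_keys(n, s, hot_fraction=0.2, hot_share=0.8, seed=0):
    """Yield n keys from [0, s): hot_share of the draws land in the
    first hot_fraction of the key space, the rest anywhere in [0, s)."""
    rng = random.Random(seed)
    small = int(s * hot_fraction)          # the "tsmall" target space
    for _ in range(n):
        if rng.random() < hot_share:
            yield rng.randrange(small)     # hot 20% of the keys
        else:
            yield rng.randrange(s)         # full key space

keys = list(draw_keys(10_000, 1 << 16))
hot = sum(1 for k in keys if k < (1 << 16) // 5)
# in expectation about 84% of the keys fall in the hot range
# (80% direct hits plus 20% of the uniform draws)
```

The skew matters for the benchmark: repeated hits on a small hot set exercise the AA's lookup path far more than its growth path.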
December 02, 2005
Dave wrote:
[...]
> What D implementation (DMD or GDC) and version?

Using DMD 0.140 under WinXP32.
Times reported are the lowest of three consecutive runs.
Times reported are the real times reported by cygwin's time command.

Remark:
Although WinXP32 is limited to less than 4GB of RAM, the memory
management seems to have difficulties delivering memory chunks when a
single thread needs more than 1 GB, which is far below that limit.

-manfred
December 03, 2005
In article <Xns972166DCE52Asvv1999hotmailcom@63.105.9.61>, Manfred Nowak says...
>
>Dave wrote:
>[...]
>> What D implementation (DMD or GDC) and version?
>
>Using DMD 0.140 under WinXP32.

Cool - is this something you've actually plugged into internal/aaA.d already to test?

If so post the code if you want, and I'll give it a shot with the benchmarks I sent earlier and post the results (I already have the input files handy, etc.).

- Dave



December 03, 2005
Sean Kelly wrote:

>[...]
> a fancier method might use a Markov generator to create arbitrarily large pseudo-English text to process

Word counting does not capture anything about a language except the word frequencies and the corresponding word lengths. You can get these from the known language institutes and build a test of the size you wish by randomizing over those frequencies and ...
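The frequency-table approach suggested here can be sketched as follows; Python, with a made-up toy table, purely for illustration:

```python
import random

# toy frequency table; real figures would come from a published corpus study
FREQ = {"the": 7.0, "of": 3.5, "and": 3.0, "to": 2.6, "in": 2.2, "aardvark": 0.01}

def sample_text(n_words, freq, seed=0):
    """Draw n_words independently, weighted by relative word frequency."""
    rng = random.Random(seed)
    words = list(freq)
    weights = list(freq.values())
    return " ".join(rng.choices(words, weights=weights, k=n_words))
```

Unlike a Markov chain, this reproduces only the frequency distribution, not word order, which is all a word-count benchmark actually observes.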

But I still do not understand why word counting any example from a natural language could be a measure for the performance of an AA in the general case.

-manfred
December 03, 2005
Manfred Nowak wrote:
> Sean Kelly wrote:
> 
>> [...]
>> a fancier method might use a Markov generator to create
>> arbitrarily large pseudo-English text to process
> 
> Word counting does not capture anything about a language except the word frequencies and the corresponding word lengths. You can get these from the known language institutes and build a test of the size you wish by randomizing over those frequencies and ...
> 
> But I still do not understand why word counting any example from a natural language could be a measure for the performance of an AA in the general case.

Only because it's an example of typical use, so it seems a reasonable metric to use.
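Word counting is the canonical AA workload: one lookup-or-insert per token. A minimal sketch of that access pattern, with a Python dict standing in for the AA:

```python
def word_count(text):
    """Classic wc-style histogram: one AA lookup/insert per word."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

# because the keys are real words, key lengths and repeat rates follow
# natural-language skew rather than a uniform random distribution
```

That skew, a few very hot keys and a long tail of rare ones, is exactly what distinguishes this benchmark from random-gibberish input.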


Sean
December 03, 2005
Sean Kelly wrote:

[...]
>> why word counting any example
>> from a natural language could be a measure for the performance
>> of an AA in the general case.
> 
> Only because it's an example of typical use, so it seems a reasonable metric to use.

What other examples of typical usage do you know, and how should the measures taken for those examples be weighted?

-manfred