Array append performance (Was: Re: Lisp like lists and a problem with TANGO fromStringz) (page 2)

bearophile wrote: > To: Newsgroups: digitalmars.D > Subject: Re: Lisp like lists and a problem with TANGO fromStringz > > Oskar Linde>Here is a way to reserve space for arrays:< > > I did try a function equal to that one, but have you timed it? My timing results don't show any advantage of using such reserve. There are more advantages to do it than speed. Memory fragmentation and wastage for example. <snip> > OUTPUT TIMINGS: > > DMD 1.026, N = 12_000_000: > append: 3.3 s > reserve + append: 3.3 s > allocate + assign: 1.0 s It is hard to draw too many conclusions from such a simple test. For one thing, the DMD GC will on array appends try to grow the allocated GC chunk without reallocating. In an application such as this (with virtually no memory fragmentation), there is a good chance that it is able to do that almost all the time. DMD/GDC also seems unable to inline the _d_arrayappendcTp call which would be one reason for the rather poor performance overall. There are cases where the reallocation helps though, even though the performance overall is embarrassing compared to not appending. GDC 0.24, N = 120_000_000: append: 9.3 s reserve + append: 5.3 s allocate + assign: 1.3 s (Also note that append in this case wastes 168 MB memory compared to reserve+append (*)) It is also slightly faster for small arrays, where the GC bucket limitation prevents in place growing from happening: GDC 0.24, N = 512: append: 26 µs reserve + append: 23 µs allocate + assign: 1.7 µs ---- *) There seems to be a bug (or it might be by design) that the GC on appending to an array sometimes allocates up to 3x the required memory (to me, it seems anything above 2x is seriously overkill). The line in question is in internal/gc/gc.d in Phobos, where the correct line is commented out: //long mult = 100 + (1000L * size) / (6 * log2plus1(newcap)); long mult = 100 + (1000L * size) / (log2plus1(newcap)); So, for example, appending one int to a 1MB int[] array, could result in an array with 3MB reserved space. Or appending a single int to a 74MB array results in a 184 MB array. Something to look out for. -- Oskar

Oskar Linde>DMD/GDC also seems unable to inline the _d_arrayappendcTp call which would be one reason for the rather poor performance overall.< I see. -------------- kov_serg: > 1st do not use clock function -- it measure processor time for current thread. Simply use funtion that measure the time passed. Okay. I usually use wall clock timings (I haven't used it here to avoid initialization times). Note that I have used a Pentium3 and the system was well unloaded. So I think the actual wall clock time was close. > http://en.wikipedia.org/wiki/CPU_cache That's a very rich article! I'll read it. > http://leitl.org/docs/comp/AMD_block_prefetch_paper.pdf Very interesting, I have read it, but it's a bit too much maniac for me still :-) Compiler writers must help us for such things ;-) > 2nd reason is CPU cache behaviour. Turn it off and compare again. I don't understand how switching off a part of the CPU can help my tests show more real results. (And I don't know how to do it.) In this forum there are lot of people that know much more than me, bye and thank you very much to both, bear hugs, bearophile

March 03, 2008

Re: Array append performance (Was: Re: Lisp like lists and a problem with TANGO fromStringz)

Posted by Bjoern
in reply to Oskar Linde

Permalink

Bjoern

Posted in reply to Oskar Linde

Permalink

Oskar Linde schrieb:
> bearophile wrote:
>> To: Newsgroups: digitalmars.D
>> Subject: Re: Lisp like lists and a problem with TANGO fromStringz
>>
>> Oskar Linde>Here is a way to reserve space for arrays:<
>>
>> I did try a function equal to that one, but have you timed it? My timing results don't show any advantage of using such reserve.
> 
> There are more advantages to do it than speed. Memory fragmentation and wastage for example.
> 
> <snip>
> 
>> OUTPUT TIMINGS:
>>
>> DMD 1.026, N = 12_000_000:
>>   append:            3.3 s
>>   reserve + append:  3.3 s
>>   allocate + assign: 1.0 s
> 
> 
> It is hard to draw too many conclusions from such a simple test. For one thing, the DMD GC will on array appends try to grow the allocated GC chunk without reallocating. In an application such as this (with virtually no memory fragmentation), there is a good chance that it is able to do that almost all the time.
> 
> DMD/GDC also seems unable to inline the _d_arrayappendcTp call which would be one reason for the rather poor performance overall.
> 
> There are cases where the reallocation helps though, even though the performance overall is embarrassing compared to not appending.
> 
> GDC 0.24, N = 120_000_000:
>    append:            9.3 s
>    reserve + append:  5.3 s
>    allocate + assign: 1.3 s
> 
> (Also note that append in this case wastes 168 MB memory compared to reserve+append (*))
> 
> It is also slightly faster for small arrays, where the GC bucket limitation prevents in place growing from happening:
> 
> GDC 0.24, N = 512:
>    append:            26  µs
>    reserve + append:  23  µs
>    allocate + assign: 1.7 µs
> 
> ----
> 
> *) There seems to be a bug (or it might be by design) that the GC on appending to an array sometimes allocates up to 3x the required memory (to me, it seems anything above 2x is seriously overkill). The line in question is in internal/gc/gc.d in Phobos, where the correct line is commented out:
> 
> //long mult = 100 + (1000L * size) / (6 * log2plus1(newcap));
> long mult = 100 + (1000L * size) / (log2plus1(newcap));
> 
> So, for example, appending one int to a 1MB int[] array, could result in an array with 3MB reserved space. Or appending a single int to a 74MB array results in a 184 MB array. Something to look out for.
> 

Gentleman,
somebody out there remembering the topic ?   ... :
Bjoern -< OOPS wrong planet.

Forums