Thread overview
[D-runtime] Fw: [phobos] newCapacity function in array appending
  Fawzi Mohamed (Jul 14, 2010)
  Fawzi Mohamed (Jul 14, 2010)
  Sean Kelly (Jul 14, 2010)
  Fawzi Mohamed (Jul 16, 2010)
July 13, 2010
Didn't know this list existed until just now.

Moving the related discussion below to the runtime list

-Steve



----- Forwarded Message ----
> From: Steve Schveighoffer <schveiguy at yahoo.com>
> To: Phobos <phobos at puremagic.com>
> Sent: Tue, July 13, 2010 3:54:09 PM
> Subject: [phobos] newCapacity function in array appending
>
> There are a couple of issues with the growth rate of arrays in the array
> append function.
>
> First, the newCapacity function is only called when the array must be
> reallocated.  I believe this to be an error; I think the newCapacity
> function should always be used, even if pages can be added.  This would
> use fewer cycles searching for pages to append.
>
> Second, the newCapacity function seems to use the size of the element to
> determine a percentage growth.  I feel this is an error.  I think the
> growth factor should strictly be related to the amount of memory being
> requested, which already takes into account the size of the element.
>
> At present, this means that arrays of ints grow 4x as fast as arrays of
> bytes.  And that's 4x in percentage terms, not 4x the memory used.  The
> memory used actually grows 16x faster.
>
> Does everyone agree with this?
>
> The relevant bug for this is
> http://d.puremagic.com/issues/show_bug.cgi?id=4412
>
> -Steve
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos



July 14, 2010
On 13-Jul-10, at 22:59, Steve Schveighoffer wrote:

> Didn't know this list existed until just now.
>
> Moving the related discussion below to the runtime list
>
> -Steve
>
> ----- Forwarded Message ----
>> From: Steve Schveighoffer <schveiguy at yahoo.com>
>> To: Phobos <phobos at puremagic.com>
>> Sent: Tue, July 13, 2010 3:54:09 PM
>> Subject: [phobos] newCapacity function in array appending
>>
>> There are a couple of issues with the growth rate of arrays in the
>> array append function.
>>
>> First, the newCapacity function is only called when the array must be
>> reallocated.  I believe this to be an error; I think the newCapacity
>> function should always be used, even if pages can be added.  This
>> would use fewer cycles searching for pages to append.

I am not sure I follow: do you mean using the newCapacity function to
determine how many pages to really commit?
Well, indeed you would spare some cycles, but at the cost of using
memory; I am not sure it is really a win (the use of pages already
>>
>> Second, the newCapacity function seems to use the size of the element
>> to determine a percentage growth.  I feel this is an error.  I think
>> the growth factor should strictly be related to the amount of memory
>> being requested, which already takes into account the size of the
>> element.
>>
>> At present, this means that arrays of ints grow 4x as fast as arrays
>> of bytes.  And that's 4x in percentage terms, not 4x the memory used.
>> The memory used actually grows 16x faster.
>>
>> Does everyone agree with this?

Again I am not too sure; I flip-flopped a couple of times implementing
http://github.com/fawzi/blip/blob/master/blip/util/Grow.d (a version of
which I had also used in Tango).

Basically, if you append to an array, do you want the growth to be
determined by the number of elements of the array, or by its size in
bytes?
Code will probably behave depending on the number of elements of the
array, not on its memory usage.
In the end it is just a factor, but if you have many small elements I
feel that it is "reasonable" to allocate a smaller amount of extra
space.
On the other hand, as you pointed out, if you think strictly from the
memory point of view, one can argue that the percentage of extra space
should depend only on the space used, not on how it is used internally.

Finally I kept the "per element" view, but that was done without any
real testing.
>>
>> The relevant bug for this  is http://d.puremagic.com/issues/show_bug.cgi?id=4412
>>
>> -Steve

July 14, 2010



----- Original Message ----
> From: Fawzi Mohamed <fawzi at gmx.ch>
>
>> ----- Forwarded Message ----
>>> From: Steve Schveighoffer <schveiguy at yahoo.com>
>>>
>>> There are a couple of issues with the growth rate of arrays in the
>>> array append function.
>>>
>>> First, the newCapacity function is only called when the array must be
>>> reallocated.  I believe this to be an error; I think the newCapacity
>>> function should always be used, even if pages can be added.  This
>>> would use fewer cycles searching for pages to append.
>
> I am not sure I follow: do you mean using the newCapacity function to
> determine how many pages to really commit?
> Well, indeed you would spare some cycles, but at the cost of using
> memory; I am not sure it is really a win (the use of pages already
I assume the newCapacity function is used to avoid constantly looking for new memory to allocate.  The theory being that if you are appending, you don't want to just allocate exactly what you need; you want to preallocate a little more.

At least when the current pool is exhausted, newCapacity is used to allocate a fresh block; I just don't know why it's not used to extend the current block (which is why I'm trying to add that feature).

One of the issues with the current GC is that getting the size of a memory block that consists of hundreds or thousands of pages requires a linear search for the end of the block.  This is mitigated by the LRU cache, but once you exceed the cache size, performance should probably fall off a cliff (however, it doesn't, for reasons I don't understand).  But when I extend a block, it must do this same linear search to find the page size before doing an extend.  I'm trying to do that linear search as little as possible.  All this could probably be mitigated if the allocated length was somehow cached per allocated page.

I didn't write newCapacity, so I'm not sure about the theory behind it, or what desirable properties for appending are.  But it does make sense to me that if you are continuously appending, you want to preallocate more than "just enough to append the new data".  How much more, I have no clue :)  I assumed the formula in newCapacity was written by someone who knows what they are doing.  Well, except for using the element size in the percentage formula.
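
As a small illustration of that preallocation (this uses only the array capacity property; the exact numbers printed will vary with the runtime version), you can watch capacity jump ahead of length as elements are appended:

    import std.stdio : writefln;

    void main()
    {
        int[] a;
        size_t lastCap = 0;
        foreach (i; 0 .. 100_000)
        {
            a ~= i;                     // append one element at a time
            if (a.capacity != lastCap)  // report only when the reserved space changes
            {
                lastCap = a.capacity;
                writefln("length=%s capacity=%s", a.length, a.capacity);
            }
        }
    }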

>>> Second, the newCapacity function seems to use the size of the element
>>> to determine a percentage growth.  I feel this is an error.  I think
>>> the growth factor should strictly be related to the amount of memory
>>> being requested, which already takes into account the size of the
>>> element.
>>>
>>> At present, this means that arrays of ints grow 4x as fast as arrays
>>> of bytes.  And that's 4x in percentage terms, not 4x the memory used.
>>> The memory used actually grows 16x faster.
>>>
>>> Does everyone agree with this?
>
> Again I am not too sure; I flip-flopped a couple of times implementing
> http://github.com/fawzi/blip/blob/master/blip/util/Grow.d (a version
> of which I had also used in Tango).
>
> Basically, if you append to an array, do you want the growth to be
> determined by the number of elements of the array, or by its size in
> bytes?
> Code will probably behave depending on the number of elements of the
> array, not on its memory usage.
> In the end it is just a factor, but if you have many small elements I
> feel that it is "reasonable" to allocate a smaller amount of extra
> space.
> On the other hand, as you pointed out, if you think strictly from the
> memory point of view, one can argue that the percentage of extra space
> should depend only on the space used, not on how it is used internally.
>
> Finally I kept the "per element" view, but that was done without any
> real testing.

Well, per length or per size, it's both a percentage.  The size is a linear scale of the length, so it's not really important whether you use size or length.  But does it make sense to alter the percentage based on the element size?  That is, if arrays of bytes should grow by 50% each time a realloc is needed, should arrays of ints grow by 200%?  It doesn't make sense to me, yet that is what was in the calculation.

The other factors are seemingly arbitrary, so I'm not sure how to tune those.

-Steve




July 14, 2010
On 14-Jul-10, at 15:19, Steve Schveighoffer wrote:

> ----- Original Message ----
>> From: Fawzi Mohamed <fawzi at gmx.ch>
>>
>>> ----- Forwarded Message ----
>>>> From: Steve Schveighoffer <schveiguy at yahoo.com>
>> [...]
>> I am not sure I follow: do you mean using the newCapacity function to
>> determine how many pages to really commit?
>> Well, indeed you would spare some cycles, but at the cost of using
>> memory; I am not sure it is really a win (the use of pages already
>
> I assume the newCapacity function is used to avoid constantly looking
> for new memory to allocate.  The theory being that if you are
> appending, you don't want to just allocate exactly what you need; you
> want to preallocate a little more.
>
> At least when the current pool is exhausted, newCapacity is used to
> allocate a fresh block; I just don't know why it's not used to extend
> the current block (which is why I'm trying to add that feature).

ok I understand, it makes sense.

> One of the issues with the current GC is that getting the size of a
> memory block that consists of hundreds or thousands of pages requires
> a linear search for the end of the block.  This is mitigated by the
> LRU cache, but once you exceed the cache size, performance should
> probably fall off a cliff (however, it doesn't, for reasons I don't
> understand).  But when I extend a block, it must do this same linear
> search to find the page size before doing an extend.  I'm trying to do
> that linear search as little as possible.  All this could probably be
> mitigated if the allocated length was somehow cached per allocated
> page.

I discussed this with winterwar (wm4); I had proposed some enhancements,
but it seems that most of them did not really improve things much.
(I was arguing for a global array with pool sizes, as I thought that
it would be better from the caching perspective.)
About the linear search, a simple way would be to add some extra flags,
like
PAGE_PLUS_0, PAGE_PLUS_1, PAGE_PLUS_2, PAGE_PLUS_3, PAGE_PLUS_4, ...
and then use them so that if you are at PAGE_PLUS_x you know you have to
jump back at least 2^x pages.
With very few bits you can drastically reduce the number of jumps.
The idea is *not* to encode the exact jump, but just the position of
the highest bit of the jump.
Linear search is fast (and well cached), so you really win only if you
encode large jumps, hence the exponential growth.
The other option is to encode the exact address (wm4 was rooting for
that), but I feel that the extra memory used defeats the advantages.
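
A rough sketch of that flag scheme (hypothetical names and storage, just to
show the logarithmic backward walk; how it would be packed into the GC's
existing metadata is left out):

    import core.bitop : bsr;  // index of the highest set bit

    enum ubyte PAGE_START = 0xFF;  // hypothetical marker for the first page of a block

    // For each interior page, store only floor(log2(offset to the block start)).
    void markBlock(ubyte[] pageFlags, size_t firstPage, size_t nPages)
    {
        pageFlags[firstPage] = PAGE_START;
        foreach (offset; 1 .. nPages)
            pageFlags[firstPage + offset] = cast(ubyte) bsr(offset);
    }

    // Jumping back 2^flag pages at least halves the remaining distance,
    // so the start is found in O(log n) jumps instead of a linear scan.
    size_t findBlockStart(const(ubyte)[] pageFlags, size_t page)
    {
        while (pageFlags[page] != PAGE_START)
            page -= cast(size_t) 1 << pageFlags[page];
        return page;
    }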

> I didn't write newCapacity, so I'm not sure about the theory behind
> it, or what desirable properties for appending are.  But it does make
> sense to me that if you are continuously appending, you want to
> preallocate more than "just enough to append the new data".  How much
> more, I have no clue :)  I assumed the formula in newCapacity was
> written by someone who knows what they are doing.  Well, except for
> using the element size in the percentage formula.

Well, the initial formula in Tango clearly had a flaw (it exploded for
some arguments).  I have reduced it to an interpolation between using
	a/100 percent more space (geometric growth)
and adding something that grows proportionally to
	x/log2(x)
which is the average storage used by each bit of x (log2(x) is the
number of bits of x).
x/log2(x) goes to infinity as x approaches 1 (where log2(x) is 0), and
grows more slowly than x for large x.
In the end I chose a functional form with 2 parameters (a and b; well,
minBits too if one really wants, but I think it is not so useful).
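
In code, the resulting growth function has roughly this shape (parameter
names follow the description above; the default values are placeholders,
not the ones from Grow.d):

    import std.math : log2;

    // Extra capacity to reserve when growing to `requested` units.
    size_t growCapacity(size_t requested,
                        double a = 1.0, double b = 2.0, double minBits = 5.0)
    {
        double bits = log2(cast(double) requested + 1);  // ~ number of bits of x
        // The factor equals `a` when log2(x) == minBits and shrinks like
        // 1/log2(x) for large x, so the relative overallocation tapers off
        // slowly while still avoiding quadratic append cost.
        double factor = a * (minBits + b) / (bits + b);
        return requested + cast(size_t)(requested * factor);
    }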

> [...]
> The other factors are seemingly arbitrary, so I'm not sure how to
> tune those.

Indeed I did not know a meaningful way to tune those parameters, so I just chose values that looked good and reasonable to me, and called it a day :).

Fawzi
July 14, 2010
On Jul 14, 2010, at 6:19 AM, Steve Schveighoffer wrote:
> 
> I assume the newCapacity function is used to avoid constantly looking for new memory to allocate.  The theory being that if you are appending, you don't want to just allocate exactly what you need; you want to preallocate a little more.

What gets me is that the GC already does this, since it uses blocks that are powers of two in size, and then page-sized chunks (which may again have extra space allocated at the end based on some qualities of the alloc request).  I can see newCapacity allocating 4x as much extra on an int append as on a byte append, but the other logic has always seemed a bit overthought to me.  But then I never sat down and tested it, so I figured maybe it's actually good in practice.  I didn't notice the screwed-up growth factor based on element size, though.
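
Roughly, the sizing being referred to works like this sketch: small requests round up to a power-of-two bin, larger ones to whole pages (the exact bin sizes and cutoffs here are assumptions for illustration, not the GC's actual table):

    enum size_t PAGESIZE = 4096;

    // Round an allocation request up to the size the allocator would hand out.
    size_t binSize(size_t request)
    {
        if (request > PAGESIZE / 2)
            return (request + PAGESIZE - 1) / PAGESIZE * PAGESIZE;  // whole pages
        size_t sz = 16;                 // assumed smallest bin
        while (sz < request)
            sz *= 2;                    // power-of-two bins: 16, 32, ..., 2048
        return sz;
    }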

> One of the issues with the current GC is that getting the size of a memory block that consists of hundreds or thousands of pages requires a linear search for the end of the block.  This is mitigated by the LRU cache, but once you exceed the cache size, performance should probably fall off a cliff (however, it doesn't, for reasons I don't understand).  But when I extend a block, it must do this same linear search to find the page size before doing an extend.  I'm trying to do that linear search as little as possible.  All this could probably be mitigated if the allocated length was somehow cached per allocated page.

Yeah, dealing with big blocks kind of stinks.  Maybe something like this would indeed be worth trying.

> I didn't write newCapacity, so I'm not sure about the theory behind it, or what desirable properties for appending are.  But it does make sense to me that if you are continuously appending, you want to preallocate more than "just enough to append the new data".  How much more, I have no clue :)  I assumed the formula in newCapacity was written by someone who knows what they are doing. Well, except for using the element size in the percentage formula.

Yeah, me too :-)
July 15, 2010



----- Original Message ----
> From: Sean Kelly <sean at invisibleduck.org>
> To: D's runtime library developers list <d-runtime at puremagic.com>
> Sent: Wed, July 14, 2010 4:54:35 PM
> Subject: Re: [D-runtime] Fw: [phobos] newCapacity function in array appending
>
> On Jul 14, 2010, at 6:19 AM, Steve Schveighoffer wrote:
>>
>> I assume the newCapacity function is used to avoid constantly looking
>> for new memory to allocate.  The theory being that if you are
>> appending, you don't want to just allocate exactly what you need; you
>> want to preallocate a little more.
>
> What gets me is that the GC already does this, since it uses blocks
> that are powers of two in size, and then page-sized chunks (which may
> again have extra space allocated at the end based on some qualities of
> the alloc request).  I can see newCapacity allocating 4x as much extra
> on an int append as on a byte append, but the other logic has always
> seemed a bit overthought to me.  But then I never sat down and tested
> it, so I figured maybe it's actually good in practice.  I didn't
> notice the screwed-up growth factor based on element size, though.

Any appends of data smaller than a page do not use newCapacity; it's only for page size and above.  And even then, it was only used when allocating a brand-new block.  From the bug report you can see it grew linearly by a page each time until it needed to reallocate, and then the growth was ridiculous (200% growth!).

My newly checked-in code sort of hovers around 40% growth for larger blocks, even when extending in place.  I think the formula may need to be tweaked some more, but it's definitely much better than the linear growth pattern; in my test program, it shaves off about 1/10th of a second.  The down side, of course, is that when you are appending to a large array, it's going to add 40% back on itself, which will be significant even if you are just adding one element.  I think for very large arrays we probably want to make the growth approach some low percentage (like 1 or 2%).  I am also wondering whether we should consider the number of elements being added, i.e. if you are only adding 1 or 2 elements, don't grow as much.
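
A sketch of such a tapered growth curve (all of the numbers and cutoff points here are purely illustrative, not the checked-in formula):

    enum size_t PAGESIZE = 4096;

    // How much extra to reserve on top of `curSize`: ~40% for smallish blocks,
    // tapering down to ~2% for very large ones (all constants illustrative).
    size_t extraGrowth(size_t curSize)
    {
        enum size_t loCut = 16 * PAGESIZE;
        enum size_t hiCut = 1024 * PAGESIZE;
        double pct;
        if (curSize <= loCut)
            pct = 0.40;
        else if (curSize >= hiCut)
            pct = 0.02;
        else
        {
            // linear taper between the two cutoffs
            double t = cast(double)(curSize - loCut) / (hiCut - loCut);
            pct = 0.40 + t * (0.02 - 0.40);
        }
        return cast(size_t)(curSize * pct);
    }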

-Steve




July 16, 2010
On 15-Jul-10, at 15:45, Steve Schveighoffer wrote:
>
> ----- Original Message ----
>> From: Sean Kelly <sean at invisibleduck.org>
>> To: D's runtime library developers list <d-runtime at puremagic.com>
>> Sent: Wed, July 14, 2010 4:54:35 PM
>> Subject: Re: [D-runtime] Fw: [phobos] newCapacity function in array appending
>>
>> On Jul 14, 2010, at 6:19 AM, Steve Schveighoffer wrote:
>>>
>>> I assume the newCapacity function is used to avoid constantly
>>> looking for new memory to allocate.  The theory being that if you
>>> are appending, you don't want to just allocate exactly what you
>>> need; you want to preallocate a little more.
>>
>> What gets me is that the GC already does this, since it uses blocks
>> that are powers of two in size, and then page-sized chunks (which may
>> again have extra space allocated at the end based on some qualities
>> of the alloc request).  I can see newCapacity allocating 4x as much
>> extra on an int append as on a byte append, but the other logic has
>> always seemed a bit overthought to me.  But then I never sat down and
>> tested it, so I figured maybe it's actually good in practice.  I
>> didn't notice the screwed-up growth factor based on element size,
>> though.
>
> Any appends of data smaller than a page do not use newCapacity; it's
> only for page size and above.  And even then, it was only used when
> allocating a brand-new block.  From the bug report you can see it grew
> linearly by a page each time until it needed to reallocate, and then
> the growth was ridiculous (200% growth!).
>
> My newly checked-in code sort of hovers around 40% growth for larger
> blocks, even when extending in place.  I think the formula may need to
> be tweaked some more, but it's definitely much better than the linear
> growth pattern; in my test program, it shaves off about 1/10th of a
> second.  The down side, of course, is that when you are appending to a
> large array, it's going to add 40% back on itself, which will be
> significant even if you are just adding one element.  I think for very
> large arrays we probably want to make the growth approach some low
> percentage (like 1 or 2%).  I am also wondering whether we should
> consider the number of elements being added, i.e. if you are only
> adding 1 or 2 elements, don't grow as much.

No, the thing is that you want even a loop adding single elements not
to have a quadratic cost.
Any growth strategy that reallocates after every x elements has a
quadratic cost:
growing by n chunks implies n reallocations and copies of arrays of
length 1, 2, 3, ..., n chunks, thus a cost of sum(1..n) = n*(n+1)/2 =
O(n^2).
A simple strategy to avoid the quadratic cost is geometric growth:
grow by multiplying the size by a factor.
If the array is large, the absolute amount of wasted memory can then be
large, so one might want to reduce that a bit.
That is where O(x/log2(x)) is useful: it reduces the extra memory, but
as the logarithm grows more slowly than any polynomial, it still avoids
the quadratic cost.
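
A quick way to see the difference is to count how many element copies the
two policies cause (a standalone toy program, not runtime code):

    import std.stdio : writefln;

    void main()
    {
        enum size_t n = 1_000_000;

        // policy 1: grow by a fixed chunk; every reallocation copies everything
        size_t copiedFixed = 0, cap = 0;
        foreach (len; 1 .. n + 1)
            if (len > cap) { copiedFixed += len - 1; cap += 4096; }

        // policy 2: geometric growth (factor 1.5); reallocations get rarer and rarer
        size_t copiedGeom = 0;
        cap = 0;
        foreach (len; 1 .. n + 1)
            if (len > cap) { copiedGeom += len - 1; cap = cap ? cast(size_t)(cap * 1.5) : 16; }

        writefln("fixed-chunk copies: %s, geometric copies: %s", copiedFixed, copiedGeom);
    }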

One can then tune these two behaviours, for example by adding
	x*(a*1/(log2(x)+1))
extra space: for small x (note that if x is in bytes, then x >= 1) you
add roughly an "a" factor more, while for large x this becomes less.
One can then change the factor for the extra space to
	a*(m+b)/(log2(x)+b)
where the larger b is, the less the reduction for large x kicks in.
m is something that can be adjusted to choose for which value of
log2(x) the factor is exactly a.
That can be useful, for example, if you know that x is at least 32:
then log2(x) is at least 5, and it makes sense to choose m=5 (I called
m minBits, the minimum number of bits in x).

With this knowledge it should be possible to tweak the function a bit better, if one is so inclined.

Note that it might be good to keep a small and b largish if you know that you have many small arrays, as each of those will "waste" an extra fraction a of space; that is the wasted percentage for small arrays, while for large ones it can be less.

Adding only a little when the array grows by a little is not a solution,
because it brings back the quadratic cost.
If you know that your array will not grow much, then you should resize
it manually... an automatic append policy cannot both avoid the
quadratic cost and always grow by only a little.

Fawzi

>
> -Steve