June 06, 2009
On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean@invisibleduck.org>
wrote:

> Jarrett Billingsley wrote:
>> On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
>> Panteleev<thecybershadow@gmail.com> wrote:
>>> // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
>>> version(Tango) import tango.io.Console;
>>> else           import std.stdio;
>>>
>>> struct S
>>> {
>>>        ubyte[40_000] data;
>>> }
>>>
>>> void main()
>>> {
>>>        S[] a;
>>>        a ~= S();
>>>
>>>        // QUESTION: How much memory will this program consume upon reaching
>>> this point?
>>>        version(Tango) Cin.copyln();
>>>        else           readln();
>>> }
>>>
>>  There seems to be something wrong with the newCapacity function that
>> _d_arrayappendcT calls.  From an element size of 20000 (I halved it
>> just to make the allocation faster) and an array length of 1, it
>> somehow calculates the new size to be 266686600.  Hm.  That seems a
>> bit off.
>>  It seems this line:
>>  long mult = 100 + (1000L * size) / log2plus1(newcap);
>>  is to blame.  I don't think large value types were taken into account
>> here.  The resulting multiplier is 1,333,433, which is hilariously
>> large.
>
> I've been debating for a while whether newCapacity should exist at all.  The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it?

Do you mean at the end of pages, or pools?

Pages are only 4K in size; causing a reallocation every time an array crosses a
4K boundary doesn't make sense performance-wise.

If you mean that DMD will allocate memory pools larger than the immediate
memory requirement, that's true; however, D's GC is "greedy" in that it
always allocates memory at the lowest address possible. This means that
when all previous pools fill up, the next object will "cap" the new array,
and the array will no longer be able to extend in place.

Allow me to demonstrate:

-----------------------
import std.gc;
import std.stdio;

struct S1
{
	ubyte[4095] data;	// just under one 4K page
}

struct S2
{
	ubyte[4095*4] data;	// roughly four 4K pages
}

void main()
{
	S1[] a;
	S2*[] b;
	for (int i=0;i<1024;i++)
	{
		a ~= S1();
		b ~= new S2;
		writefln("%d", i);
	}
}
-----------------------

This program allocates 4-page blocks and appends 1-page blocks to an array
in a loop.

Here's a Diamond[1] analysis screenshot from before and after the first
two garbage collects:
http://dump.thecybershadow.net/b4af5badf32c954b7a18b848b7d9da64/1.png

The P+++ segments are S2 instances. The segments with the longer tails of
+es are copies of S1[].

As you can see, as soon as the previous pools fill up, the pool containing
the S1[] gets rapidly filled, because the loop keeps reallocating the S1[]
every time an S2 is allocated at its end.

So, I don't think that your idea is feasible with the current GC
implementation. I wonder how much things would improve if objects
were divided into "growing" and "non-growing" ones, with the GC
preferring to allocate new objects in free space that doesn't directly
follow "growing" objects.

[1] http://www.dsource.org/projects/diamond

-- 
Best regards,
   Vladimir                          mailto:thecybershadow@gmail.com
June 06, 2009
bearophile wrote:
> Sean Kelly:
>> Particularly in D2 where append operations on arrays are probably less common as a result of string being invariant.
> 
> Maybe they aren't very common because they are currently dead slow. Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

Yes but appending to an immutable string is never performed in place, which is the only time the extra space reserved by newCapacity matters.  I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.
June 06, 2009
Vladimir Panteleev wrote:
> On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean@invisibleduck.org>
> wrote:
> 
>> Jarrett Billingsley wrote:
>>> On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
>>> Panteleev<thecybershadow@gmail.com> wrote:
>>>> // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
>>>> version(Tango) import tango.io.Console;
>>>> else           import std.stdio;
>>>>
>>>> struct S
>>>> {
>>>>        ubyte[40_000] data;
>>>> }
>>>>
>>>> void main()
>>>> {
>>>>        S[] a;
>>>>        a ~= S();
>>>>
>>>>        // QUESTION: How much memory will this program consume upon reaching
>>>> this point?
>>>>        version(Tango) Cin.copyln();
>>>>        else           readln();
>>>> }
>>>>
>>>  There seems to be something wrong with the newCapacity function that
>>> _d_arrayappendcT calls.  From an element size of 20000 (I halved it
>>> just to make the allocation faster) and an array length of 1, it
>>> somehow calculates the new size to be 266686600.  Hm.  That seems a
>>> bit off.
>>>  It seems this line:
>>>  long mult = 100 + (1000L * size) / log2plus1(newcap);
>>>  is to blame.  I don't think large value types were taken into account
>>> here.  The resulting multiplier is 1,333,433, which is hilariously
>>> large.
>>
>> I've been debating for a while whether newCapacity should exist at all.  The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it?
> 
> Do you mean at the end of pages, or pools?

Blocks.  Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get Java's "double in size" behavior even without newCapacity.  newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.
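
Roughly something like this, as a sketch (the bin sizes are from memory, not the actual gcx code):

-----------------------
// Round a requested size up to the allocator's bin size.  Bins below
// one page are powers of two, so an array that outgrows its bin lands
// in a bin twice as large -- capacity effectively doubles without any
// help from newCapacity.  Larger requests are rounded to whole pages.
size_t binSize(size_t bytes)
{
	size_t b = 16;			// assumed smallest bin
	while (b < bytes && b < 4096)
		b *= 2;
	if (b >= bytes)
		return b;
	return (bytes + 4095) / 4096 * 4096;	// round up to whole 4K pages
}
-----------------------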

> If you mean that DMD will allocate memory pools larger than the immediate
> memory requirement, that's true, however D's GC is "greedy" in that it
> always allocates memory at the lowest address possible. This means that
> when all previous pools fill up, the next object will "cap" the new array,
> and the array will no longer be able to extend.

This is a quirk of the current GC.  A new GC may not behave the same way (in fact I can guarantee that the one I'm working on is implemented differently).

> So, I don't think that your idea is feasible with the current GC
> implementation.

I'm not sure I understand.  The only "idea" I proposed was to get rid of newCapacity.

> I wonder how much things would improve if objects
> were divided into "growing" and "non-growing" ones, with the GC
> preferring to allocate new objects in free space that doesn't directly
> follow "growing" objects.

The user already knows best which arrays are going to grow though.  Is this really something the runtime should try to figure out?
June 06, 2009
On Sat, Jun 6, 2009 at 1:42 PM, bearophile<bearophileHUGS@lycos.com> wrote:
> Sean Kelly:
>> Particularly in D2 where append
>> operations on arrays are probably less common as a result of string
>> being invariant.
>
> Maybe they aren't very common because they are currently dead slow. Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

Especially since D2 has one in the stdlib.  I always find myself writing my own anyway, since I don't like depending on unspecified heap behavior.
June 06, 2009
On Sat, 06 Jun 2009 22:07:40 +0300, Sean Kelly <sean@invisibleduck.org> wrote:

> Blocks.  Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get Java's "double in size" behavior even without newCapacity.  newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.

Yes, but do you agree that the penalty of reallocating every time the array size goes over a 4kB boundary (in the worst case of heap fragmentation, like the one I demonstrated) is not acceptable?

> This is a quirk of the current GC.  A new GC may not behave the same way (in fact I can guarantee that the one I'm working on is implemented differently).

Could you tell us more about that? I was toying with a new GC idea myself (since last year) but haven't gotten around to finishing it yet.

> I'm not sure I understand.  The only "idea" I proposed was to get rid of newCapacity.

I understood it as you wanting to change the code so that it would behave as if newCapacity always returned newlength * size.
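
I.e., something like this (the signature is from memory and may not match the runtime exactly):

	size_t newCapacity(size_t newlength, size_t size)
	{
		return newlength * size;	// no extra reserve; let the allocator round up
	}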

> The user already knows best which arrays are going to grow though.  Is this really something the runtime should try to figure out?

No, but then you'll need to change the language specification to allow the user to specify growable arrays. I guess the proper solution here is to force the user to use specialized classes with their own "newCapacity" implementations.
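
For example, something along these lines (just a sketch; the type name and growth policy are made up):

-----------------------
// Hypothetical growable-array wrapper that owns its growth policy,
// so the runtime doesn't have to guess which arrays will grow.
struct GrowableArray(T)
{
	private T[] buf;
	private size_t len;

	void opCatAssign(T item)	// ~= on the wrapper
	{
		if (len == buf.length)
		{
			// user-chosen policy: double the capacity, starting at 16
			buf.length = buf.length ? buf.length * 2 : 16;
		}
		buf[len++] = item;
	}

	T[] data() { return buf[0 .. len]; }
}
-----------------------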

-- 
Best regards,
 Vladimir                          mailto:thecybershadow@gmail.com
June 06, 2009
On 2009-06-06 21:07:40 +0200, Sean Kelly <sean@invisibleduck.org> said:

> Vladimir Panteleev wrote:
>> On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean@invisibleduck.org>
>> wrote:
[...]
>> 
>>> I've been debating for a while whether newCapacity should exist at all.  The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it?
>> 
>> Do you mean at the end of pages, or pools?
> 
> Blocks.  Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get Java's "double in size" behavior even without newCapacity.  newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.

At the moment the behavior is exactly the opposite (leaving small arrays to the GC handling you just sketched); only arrays larger than a page get the special treatment. I think the idea is that appending to large arrays can be potentially very expensive, so those get the special treatment.
But as I said previously, I don't fully understand the rationale, especially the idea of reserving more space when the elements are larger. I could understand having the reserved space depend just on the number of elements, like this:

           long mult = 100 + 200L / log2plus2(newlength)
instead of
           long mult = 100 + 1000L / log2plus2(newcap)

or something in between.

These give at most 1.5 or 1.71, so a waste of at most 100% or 71%, and it decreases as 1/log toward 1.02.
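
For instance, a quick check like this (assuming log2plus2(x) means floor(log2(x)) + 2) prints the multipliers the two variants give at various sizes:

-----------------------
import std.stdio;

// assumed meaning of log2plus2: floor(log2(x)) + 2
size_t log2plus2(size_t x)
{
	size_t r = 2;
	while (x >>= 1)
		++r;
	return r;
}

void main()
{
	// element counts, for the variant based on newlength
	foreach (newlength; [4, 16, 256, 4096, 1024*1024])
		writefln("newlength=%s  mult=%s%%", newlength, 100 + 200L / log2plus2(newlength));

	// byte sizes (>= one page, where the special treatment applies), for the newcap variant
	foreach (newcap; [4096, 65536, 1024*1024, 16*1024*1024])
		writefln("newcap=%s  mult=%s%%", newcap, 100 + 1000L / log2plus2(newcap));
}
-----------------------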

To really test this, one should benchmark some "typical" applications.

The current choice is neither of these, and I don't know exactly why such a high weight is given to large element sizes. I think it is an error, but maybe there was an idea behind it (at least for smallish sizes) that I don't see.

Fawzi

> 
>> If you mean that DMD will allocate memory pools larger than the immediate
>> memory requirement, that's true, however D's GC is "greedy" in that it
>> always allocates memory at the lowest address possible. This means that
>> when all previous pools fill up, the next object will "cap" the new array,
>> and the array will no longer be able to extend.
> 
> This is a quirk of the current GC.  A new GC may not behave the same way (in fact I can guarantee that the one I'm working on is implemented differently).
> 
>> So, I don't think that your idea is feasible with the current GC
>> implementation.
> 
> I'm not sure I understand.  The only "idea" I proposed was to get rid of newCapacity.
> 
>> I wonder how much things would improve if objects
>> were divided into "growing" and "non-growing" ones, with the GC
>> preferring to allocate new objects in free space that doesn't directly
>> follow "growing" objects.
> 
> The user already knows best which arrays are going to grow though.  Is this really something the runtime should try to figure out?


June 07, 2009
Vladimir Panteleev wrote:
> On Sat, 06 Jun 2009 18:12:58 +0300, Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:
> 
>> On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
>> Panteleev<thecybershadow@gmail.com> wrote:
>>> // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
>>> version(Tango) import tango.io.Console;
>>> else           import std.stdio;
>>>
>>> struct S
>>> {
>>>        ubyte[40_000] data;
>>> }
>>>
>>> void main()
>>> {
>>>        S[] a;
>>>        a ~= S();
>>>
>>>        // QUESTION: How much memory will this program consume upon reaching
>>> this point?
>>>        version(Tango) Cin.copyln();
>>>        else           readln();
>>> }
>>>
>>
>> There seems to be something wrong with the newCapacity function that
>> _d_arrayappendcT calls.  From an element size of 20000 (I halved it
>> just to make the allocation faster) and an array length of 1, it
>> somehow calculates the new size to be 266686600.  Hm.  That seems a
>> bit off.
>>
>> It seems this line:
>>
>> long mult = 100 + (1000L * size) / log2plus1(newcap);
>>
>> is to blame.  I don't think large value types were taken into account
>> here.  The resulting multiplier is 1,333,433, which is hilariously
>> large.
> 
> Bull's-eye. I think Mr. Dave Fladebo deserves a math lesson or something, and Mr. Walter Bright should review patches that go into the language's runtime more carefully. :P
> 
> Can we discuss a suitable replacement? I suggested in #d that we look at how other platforms do it. For example, Java's Vector just doubles its capacity by default ( http://java.sun.com/javase/6/docs/api/java/util/Vector.html ).
> 

In my own array classes I'm using Python's formula: size += max(size>>3, 8);
Using this, you won't end up wasting a lot of memory. Preallocating is always preferred anyway.
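
In D that policy looks something like this (just a sketch; the function name is made up):

	// Python-style growth: add about 1/8th of the current size, but at least 8.
	size_t grownSize(size_t size)
	{
		size_t extra = size >> 3;
		if (extra < 8)
			extra = 8;
		return size + extra;
	}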

L.
June 07, 2009
Fawzi Mohamed wrote:
> On 2009-06-06 21:07:40 +0200, Sean Kelly <sean@invisibleduck.org> said:
> 
>> Vladimir Panteleev wrote:
>>> On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean@invisibleduck.org>
>>> wrote:
> [...]
>>>
>>>> I've been debating for a while whether newCapacity should exist at all.  The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it?
>>>
>>> Do you mean at the end of pages, or pools?
>>
>> Blocks.  Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get Java's "double in size" behavior even without newCapacity.  newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.
> 
> At the moment the behavior is exactly the opposite (leaving small arrays to the GC handling you just sketched); only arrays larger than a page get the special treatment. I think the idea is that appending to large arrays can be potentially very expensive, so those get the special treatment.

Hm.  I still think we'd be better off letting the user handle this.  If they're going to be appending and performance is important then they'll use an ArrayAppender anyway.  I'd rather not have extra space tacked onto the end of every array I create "just in case" I decide to append to it.
June 07, 2009
On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:

> bearophile wrote:
>> Sean Kelly:
>>> Particularly in D2 where append
>>> operations on arrays are probably less common as a result of string
>>> being invariant.
>> 
>> Maybe they aren't very common because they are currently dead slow. Appending to an immutable string is a common operation. But I guess Array appenders will get more common...
> 
> Yes but appending to an immutable string is never performed in place,
> which is the only time the extra space reserved by newCapacity matters.
>   I suspect the memory wasted by newCapacity is more of an issue than
> any time savings it provides.

What gave you that idea?

void main()
{
  auto str1 = "hello".idup;
  auto str2 = str1;
  str1 ~= "world";
  assert(str1.ptr == str2.ptr);  // passes: the append happened in place
}

-Steve
June 07, 2009
Steve Schveighoffer wrote:
> On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:
> 
>> bearophile wrote:
>>> Sean Kelly:
>>>> Particularly in D2 where append
>>>> operations on arrays are probably less common as a result of string
>>>> being invariant.
>>> Maybe they aren't very common because they are currently dead slow.
>>> Appending to an immutable string is a common operation. But I guess
>>> Array appenders will get more common...
>> Yes but appending to an immutable string is never performed in place,
>> which is the only time the extra space reserved by newCapacity matters.
>>   I suspect the memory wasted by newCapacity is more of an issue than
>> any time savings it provides.
> 
> What gave you that idea?
> 
> void main()
> {
>   auto str1 = "hello".idup;
>   auto str2 = str1;
>   str1 ~= "world";
>   assert(str1.ptr == str2.ptr);
> }

auto str1 = "hello".idup;
auto str2 = str1;
auto str3 = str1;
str2 ~= " world";    // may extend in place, into the block's spare space
str3 ~= " garbage";  // may extend in place too, overwriting what str2 just wrote

Doesn't seem terribly safe to me.