June 12, 2006
Bruno Medeiros wrote:
> Derek Parnell wrote:
>> On Mon, 12 Jun 2006 09:11:04 +1000, Bruno Medeiros <brunodomedeirosATgmail@SPAM.com> wrote:
>>
>>> Hm, and what happens when one shortens the length of the array? Does the Memory Manager "back" buffer size remain the same?
>>
>> Yes. However there is a bug (oops - an issue) in which if the length is set to zero the RAM is released back to the system.
>>
>> --Derek Parnell
>> Melbourne, Australia
> 
> That makes perfect sense, why would it be a bug?
> 

I don't know if this is what Derek refers to, but it used to be recommended practice to reserve space for an array by doing:

arr.length = 1024;
arr.length = 0;
(start filling arr with data)

I'm quite sure this used to be mentioned in the documentation, but I can no longer find any reference to it (except this old post: http://www.digitalmars.com/drn-bin/wwwnews?D/17691)

Today, I guess you should do the following instead:

arr.length = 1024;
arr = arr[0..0];
(start filling arr with data)

/Oskar
June 12, 2006
On Tue, 13 Jun 2006 05:27:44 +1000, Bruno Medeiros <brunodomedeirosATgmail@SPAM.com> wrote:

> Derek Parnell wrote:
>> On Mon, 12 Jun 2006 09:11:04 +1000, Bruno Medeiros <brunodomedeirosATgmail@SPAM.com> wrote:
>>
>>> Hm, and what happens when one shortens the length of the array? Does the Memory Manager "back" buffer size remain the same?
>>  Yes. However there is a bug (oops - an issue) in which if the length is set to zero the RAM is released back to the system.
>>  --Derek Parnell
>> Melbourne, Australia
>
> That makes perfect sense, why would it be a bug?

Agreed, it is not a bug in the sense of being contrary to the specification, because this behaviour isn't specified. However, it does prevent a coder from distinguishing an empty array from a null array. An empty array is one that (no longer) has any elements, and a null array is one that doesn't have any RAM to reference.

I suggest that Walter either document this functionality or fix it.

"When an array's length is reduced, the RAM it owns is not released and can be reused when the array is subsequently expanded (unless the length is set to zero, in which case the RAM is released)."

Setting the length to zero is a convenient way to reserve RAM for an array.

Also consider this ...

    foo("");

Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array?

    char[] x;
    foo(x);


-- 
Derek Parnell
Melbourne, Australia
June 12, 2006
Derek Parnell wrote:
> On Tue, 13 Jun 2006 05:27:44 +1000, Bruno Medeiros <brunodomedeirosATgmail@SPAM.com> wrote:
> 
>> Derek Parnell wrote:
>>> On Mon, 12 Jun 2006 09:11:04 +1000, Bruno Medeiros <brunodomedeirosATgmail@SPAM.com> wrote:
>>>
>>>> Hm, and what happens when one shortens the length of the array? Does the Memory Manager "back" buffer size remain the same?
>>>  Yes. However there is a bug (oops - an issue) in which if the length is set to zero the RAM is released back to the system.
>>>  --Derek Parnell
>>> Melbourne, Australia
>>
>> That makes perfect sense, why would it be a bug?
> 
> Agreed, it is not a bug in the sense of being contrary to the specification, because this behaviour isn't specified. However, it does prevent a coder from distinguishing an empty array from a null array. An empty array is one that (no longer) has any elements, and a null array is one that doesn't have any RAM to reference.
> 
> I suggest that Walter either document this functionality or fix it.
> 
> "When an array's length is reduced, the RAM it owns is not released and can be reused when the array is subsequently expanded (unless the length is set to zero, in which case the RAM is released)."
> 
> Setting the length to zero is a convenient way to reserve RAM for an array.
> 
> Also consider this ...
> 
>     foo("");
> 
> Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array?
> 
>     char[] x;
>     foo(x);

Perhaps D arrays simply need a reserve property?


Sean
June 12, 2006
Derek Parnell wrote:

> I suggest that Walter either document this functionality or fix it.

I agree that it should be better documented.

> 
> "When an array's length is reduced, the RAM it owns is not released and can be reused when the array is subsequently expanded (unless the length is set to zero, in which case the RAM is released)."
> 
> Setting the length to zero is a convenient way to reserve RAM for an array.


arr.length = 100_000_000; arr = arr[0..0]; is almost as convenient.

> Also consider this ...
> 
>     foo("");
> 
> Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array?
> 
>     char[] x;
>     foo(x);
> 

Like this:

void foo(char[] arr) {
	if (!arr)
		writefln("Uninitialized array passed");
	else if (arr.length == 0)
		writefln("Zero length array received");
}

/Oskar
June 12, 2006
Sean Kelly wrote:

> Perhaps D arrays simply need a reserve property?

Something like this ought to work:

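// Make room for at least `size` elements without changing a.length.
template reserve(ArrTy,IntTy) {
        void reserve(inout ArrTy a, IntTy size) {
                if (size > a.length) {
                        size_t old_length = a.length;
                        a.length = size;        // grow, so the larger block gets allocated
                        a = a[0..old_length];   // slice back to the original length
                }
        }
}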


usage:

arr.reserve(1000);

/Oskar
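
For comparison, later D2 runtimes ship a built-in `reserve` function and a `capacity` property for dynamic arrays, which cover the same ground as the template above. The following is only a minimal sketch, assuming a current D2 compiler; none of it was available in the 2006 toolchain discussed in this thread.

```d
import std.stdio;

void main()
{
    int[] arr;

    // Ask the runtime for room for at least 1000 elements.
    // The length stays 0; only the capacity grows.
    arr.reserve(1000);
    assert(arr.length == 0);
    assert(arr.capacity >= 1000);

    auto before = arr.ptr;
    foreach (i; 0 .. 1000)
        arr ~= i;

    // Every append fit into the reserved block, so the data never moved.
    assert(arr.ptr is before);
    writefln("%s elements at %s", arr.length, arr.ptr);
}
```

Unlike the length-then-slice trick, this does not zero-initialize elements that are never used.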
June 13, 2006
On Tue, 13 Jun 2006 01:05:04 +0200, Oskar Linde wrote:

>> Setting the length to zero is a convenient way to reserve RAM for an array.
> 
> arr.length = 100_000_000; arr = arr[0..0]; is almost as convenient.

Unfortunately this only appears to reserve the RAM, because the next change
in length will cause a new allocation to be made. See the example program
below ...

>> Also consider this ...
>> 
>>     foo("");
>> 
>> Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array?
>> 
>>     char[] x;
>>     foo(x);
>> 
> 
> Like this:
> 
> void foo(char[] arr) {
> 	if (!arr)
> 		writefln("Uninitialized array passed");
> 	else if (arr.length == 0)
> 		writefln("Zero length array received");
> }

Yes, I can see that D can now distinguish between the two. This didn't use to be the case, IIRC. However, there is still a 'bug' with this, as the program here demonstrates...


 import std.stdio;
 void main()
 {

    char[] arr;

    foo(arr);
    foo("");
    foo("".dup);

    writefln("%s %s", arr.length, arr.ptr);
    arr.length = 100;
    writefln("%s %s", arr.length, arr.ptr);
    arr = arr[0..0];
    writefln("%s %s", arr.length, arr.ptr);
    arr.length = 50;
    writefln("%s %s", arr.length, arr.ptr);
    arr.length = 500;
    writefln("%s %s", arr.length, arr.ptr);

 }

 void foo(char[] t)
 {
    writefln("foo: %s %s", t.length, t.ptr);
 }

The results are ...
foo: 0 0000
foo: 0 413080
foo: 0 0000  *** A 'dup'ed empty string is now a null string.
0 0000
100 8A2F00
0 8A2F00   *** RAM appears to be reserved.
50 8A1F80  *** But it is not as a new allocation just occurred.
500 8A3E00 *** This allocation is expected.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
13/06/2006 11:08:24 AM
June 13, 2006
Derek Parnell wrote:
> On Tue, 13 Jun 2006 01:05:04 +0200, Oskar Linde wrote:
> 
>>> Setting the length to zero is a convenient way to reserve RAM for an array.
>> arr.length = 100_000_000; arr = arr[0..0]; is almost as convenient.
> 
> Unfortunately this only appears to reserve the RAM, because the next change
> in length will cause a new allocation to be made. See the example program
> below ...
>  
>>> Also consider this ...
>>>
>>>     foo("");
>>>
>>> Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array?
>>>
>>>     char[] x;
>>>     foo(x);
>>>
>> Like this:
>>
>> void foo(char[] arr) {
>> 	if (!arr)
>> 		writefln("Uninitialized array passed");
>> 	else if (arr.length == 0)
>> 		writefln("Zero length array received");
>> }
> 
> Yes, I can see that D can now distinguish between the two. This didn't use
> to be the case, IIRC. However there is still a 'bug' with this as the
> program here demonstrates...
> 
> 
>  import std.stdio;
>  void main()
>  { 
> 
>     char[] arr;
> 
>     foo(arr);
>     foo("");
>     foo("".dup);
> 
>     writefln("%s %s", arr.length, arr.ptr);
>     arr.length = 100;
>     writefln("%s %s", arr.length, arr.ptr);
>     arr = arr[0..0];
>     writefln("%s %s", arr.length, arr.ptr);
>     arr.length = 50;
>     writefln("%s %s", arr.length, arr.ptr);
>     arr.length = 500;
>     writefln("%s %s", arr.length, arr.ptr);
>   }
> 
>  void foo(char[] t)
>  {
>     writefln("foo: %s %s", t.length, t.ptr);
>  }
> 
> The results are ...
> foo: 0 0000
> foo: 0 413080
> foo: 0 0000  *** A 'dup'ed empty string is now a null string.
> 0 0000
> 100 8A2F00
> 0 8A2F00   *** RAM appears to be reserved.
> 50 8A1F80  *** But it is not as a new allocation just occurred.
> 500 8A3E00 *** This allocation is expected.

You are right, changing length forces a reallocation. Interestingly, the following works:

arr.length = 100;
arr = arr[0..0];
writefln("%s %s",arr.length,arr.ptr);
for (int i = 0; i < 50; i++)
	arr ~= i;
writefln("%s %s",arr.length,arr.ptr);

prints (for me):

0 b7ee9e00
50 b7ee9e00
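
As a point of comparison, current D2 runtimes behave differently here: they track how much of a block is in use, so appending to an empty slice of a live block copies to a new block unless the caller calls `assumeSafeAppend` first. The sketch below assumes such a D2 runtime (its `capacity` and `assumeSafeAppend` did not exist in 2006) and is only meant to illustrate the same reserve-then-append idea.

```d
void main()
{
    char[] arr;
    arr.length = 100;
    auto block = arr.ptr;

    arr = arr[0 .. 0];
    // The runtime still records 100 bytes in use, so an append here
    // would go to a new block rather than overwrite them.
    assert(arr.capacity == 0);

    // Promise that nothing else refers to the old contents.
    arr.assumeSafeAppend();
    assert(arr.capacity >= 100);

    foreach (i; 0 .. 50)
        arr ~= cast(char)('a' + i % 26);

    // The 50 appended characters reused the original block.
    assert(arr.length == 50);
    assert(arr.ptr is block);
}
```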

What is even more interesting is that the above "buggy" behavior seems intentional. The following patch removes the forced reallocation when changing length of a 0-length array:

--- gc.d.orig   2006-06-04 11:50:08.979945284 +0200
+++ gc.d        2006-06-13 09:19:02.135348959 +0200
@@ -382,8 +382,6 @@
        }
        //printf("newsize = %x, newlength = %x\n", newsize, newlength);

-       if (p.length)
-       {
            newdata = p.data;
            if (newlength > p.length)
            {
@@ -397,11 +395,6 @@
                }
                newdata[size .. newsize] = 0;
            }
-       }
-       else
-       {
-           newdata = cast(byte *)_gc.calloc(newsize + 1, 1);
-       }
     }
     else
     {


With this change, your above code prints:

$build -run ./arrtest ~/dmd/src/phobos/internal/gc/gc.d
Path and Version : build v2.9(1197)
  built on Thu Aug 11 16:07:55 2005
foo: 0 0
foo: 0 805765c
foo: 0 0
0 0
100 b7ee8e80
0 b7ee8e80    *** RAM is reserved
50 b7ee8e80   *** and is used
500 b7ee9e00  *** This causes reallocation as expected

I wonder why the code looks like it does...

/Oskar
June 13, 2006
On Tue, 13 Jun 2006 09:24:34 +0200, Oskar Linde wrote:

> What is even more interesting is that the above "buggy" behavior seems intentional. The following patch removes the forced reallocation when changing length of a 0-length array:

Hmmm... I just rewrote that function as below and it seems to test out quite well too. I incorporated your change plus I removed the check for a zero new length. Seems to work without any problems.

-----------------
extern (C)
byte[] _d_arraysetlength(size_t newlength, size_t sizeelem, Array *p)
in
{
    assert(sizeelem);
    assert(!p.length || p.data);
}
body
{
    byte* newdata;
    newdata = p.data;
    if (newlength > p.length)
    {
        version (D_InlineAsm_X86)
        {
            size_t newsize = void;
            asm
            {
            mov EAX,newlength   ;
            mul EAX,sizeelem    ;
            mov newsize,EAX ;
            jc  Loverflow   ;
            }
        }
        else
        {
            size_t newsize = sizeelem * newlength;
            if (newsize / newlength != sizeelem)
            goto Loverflow;
        }
        size_t size = p.length * sizeelem;
        size_t cap = _gc.capacity(p.data);
        if (cap <= newsize)
        {
            newdata = cast(byte *)_gc.malloc(newsize + 1);
            newdata[0 .. size] = p.data[0 .. size];
        }
        newdata[size .. newsize] = 0;
    }
    p.data = newdata;
    p.length = newlength;
    return newdata[0 .. newlength];
Loverflow:
    _d_OutOfMemory();
}
---------------
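
The non-assembler branch above detects overflow of `sizeelem * newlength` by dividing the product back. A small standalone sketch of that check follows; the helper name `mulOverflows` is hypothetical and is not part of Phobos or of the function above.

```d
/// Returns true when a * b does not fit in size_t.
/// (Hypothetical helper, for illustration only.)
bool mulOverflows(size_t a, size_t b)
{
    if (a == 0 || b == 0)
        return false;           // a zero product can never overflow
    size_t product = a * b;     // unsigned multiply wraps on overflow
    return product / a != b;    // a wrapped product fails the round trip
}

void main()
{
    assert(!mulOverflows(1024, 4));
    assert(!mulOverflows(0, size_t.max));
    assert(mulOverflows(size_t.max, 2));
    assert(mulOverflows(size_t.max / 2 + 1, 2));
}
```

In the rewritten function the division is safe because it only runs when `newlength > p.length`, so `newlength` is never zero there.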

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
13/06/2006 5:54:57 PM
June 13, 2006
Oskar Linde wrote:
> 
> Like this:
> 
> void foo(char[] arr) {
>     if (!arr)
>         writefln("Uninitialized array passed");
>     else if (arr.length == 0)
>         writefln("Zero length array received");
> }
> 
> /Oskar

This is not safe to do. Currently in D, null arrays and zero-length arrays are conceptually the same. It just so happens that arr.ptr is sometimes null and sometimes not, depending on the previous operations.
Derek's "A 'dup'ed empty string is now a null string" result is an example of why it is not safe. I thought you knew this already? This is nothing new.

BTW, I do find it (at first sight at least) unnatural that a null array is the same as a zero-length array. It doesn't seem conceptually right/consistent.
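
To make the distinction concrete, here is a minimal sketch, assuming the usual D array semantics in which `==` compares contents while `is` compares the pointer and length pair. It shows two arrays that are equal as values even though only one of them is null.

```d
void main()
{
    char[] a;                 // never assigned: ptr is null, length is 0
    char[] b = new char[4];
    b = b[0 .. 0];            // empty slice of a live block: ptr is non-null

    assert(a.length == 0 && b.length == 0);
    assert(a == b);           // equal by content: both have no elements

    assert(a is null);        // identity check sees the null pointer
    assert(b !is null);       // same length, but a different pointer
    assert(a.ptr is null);
    assert(b.ptr !is null);
}
```

Code that relies on the `is null` result is therefore depending on how the array was produced, not on what it contains.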



-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
June 13, 2006
Oskar Linde wrote:
> Derek Parnell wrote:
>> On Tue, 13 Jun 2006 01:05:04 +0200, Oskar Linde wrote:
>>
>>>> Setting the length to zero is a convenient way to reserve RAM for an array.
>>> arr.length = 100_000_000; arr = arr[0..0]; is almost as convenient.
>>
>> Unfortunately this only appears to reserve the RAM, because the next change
>> in length will cause a new allocation to be made. See the example program
>> below ...
>>  
>>>> Also consider this ...
>>>>
>>>>     foo("");
>>>>
>>>> Now how can 'foo' be written to detect a coder's error of passing it an uninitialized array?
>>>>
>>>>     char[] x;
>>>>     foo(x);
>>>>
>>> Like this:
>>>
>>> void foo(char[] arr) {
>>>     if (!arr)
>>>         writefln("Uninitialized array passed");
>>>     else if (arr.length == 0)
>>>         writefln("Zero length array received");
>>> }
>>
>> Yes, I can see that D can now distinguish between the two. This didn't use
>> to be the case, IIRC. However there is still a 'bug' with this as the
>> program here demonstrates...
>>
>>
>>  import std.stdio;
>>  void main()
>>  {
>>     char[] arr;
>>
>>     foo(arr);
>>     foo("");
>>     foo("".dup);
>>
>>     writefln("%s %s", arr.length, arr.ptr);
>>     arr.length = 100;
>>     writefln("%s %s", arr.length, arr.ptr);
>>     arr = arr[0..0];
>>     writefln("%s %s", arr.length, arr.ptr);
>>     arr.length = 50;
>>     writefln("%s %s", arr.length, arr.ptr);
>>     arr.length = 500;
>>     writefln("%s %s", arr.length, arr.ptr);
>>   }
>>
>>  void foo(char[] t)
>>  {
>>     writefln("foo: %s %s", t.length, t.ptr);
>>  }
>>
>> The results are ...
>> foo: 0 0000
>> foo: 0 413080
>> foo: 0 0000  *** A 'dup'ed empty string is now a null string.
>> 0 0000
>> 100 8A2F00
>> 0 8A2F00   *** RAM appears to be reserved.
>> 50 8A1F80  *** But it is not as a new allocation just occurred.
>> 500 8A3E00 *** This allocation is expected.
> 
> You are right, changing length forces a reallocation. Interestingly, the following works:
> 
> arr.length = 100;
> arr = arr[0..0];
> writefln("%s %s",arr.length,arr.ptr);
> for (int i = 0; i < 50; i++)
>     arr ~= i;
> writefln("%s %s",arr.length,arr.ptr);
> 
> prints (for me):
> 
> 0 b7ee9e00
> 50 b7ee9e00
> 
> What is even more interesting is that the above "buggy" behavior seems intentional.

Hrm, there were some changes to gc.d a while back, but it was more than 10 versions ago as that's as far back as I have installed at the moment.  Perhaps Walter could comment on the change?  I suspect it was probably a bug fix.


Sean