Thread overview
copy and array length vs capacity. (Doc suggestion?)
Nov 21, 2015
Jon D
Nov 22, 2015
Ali Çehreli
Nov 22, 2015
Jon D
Nov 22, 2015
Jonathan M Davis
Nov 22, 2015
Jon D
Nov 22, 2015
Jonathan M Davis
Nov 23, 2015
Jon D
Nov 24, 2015
Ali Çehreli
Nov 24, 2015
Jon D
November 21, 2015
Something I found confusing was the relationship between array capacity and copy(). A short example:

void main()
{
    import std.algorithm: copy;

    auto a = new int[](3);
    assert(a.length == 3);
    [1, 2, 3].copy(a);     // Okay

    int[] b;
    b.reserve(3);
    assert(b.capacity >= 3);
    assert(b.length == 0);
    [1, 2, 3].copy(b);     // Error at run time: b.length is 0
}

I had expected that copy() would work if the target had sufficient capacity, but that's not the case. Target has to have sufficient length.

If I've understood this correctly, a small change to the documentation for copy() might make this clearer. In particular, the "precondition" section:

    Preconditions:
    target shall have enough room to accomodate the entirety of source.

Clarifying that "enough room" means 'length' rather than 'capacity' might be beneficial.
November 22, 2015
Hi Jon! :)

On 11/21/2015 03:34 PM, Jon D wrote:

>      Preconditions:
>      target shall have enough room to accomodate the entirety of source.
>
> Clarifying that "enough room" means 'length' rather than 'capacity'
> might be beneficial.

May I suggest that you improve that page? ;) If you don't already have a clone of the repo, you can do it easily by clicking the "Improve this page" button on that page.

Regarding why copy() cannot use the capacity of the slice, it is because slices don't know about each other, so, copy could not let other slices know that the capacity has just been used by this particular slice.
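
To make the sharing concrete, here is a small sketch. The function name siblingLosesCapacity is made up for illustration. Two slices view the same reserved block; once one of them appends, the runtime reports zero capacity for the other, so a later append through that other slice would reallocate rather than overwrite.

```d
// Hypothetical helper: returns true if appending through one slice
// leaves a second slice of the same block with no capacity.
bool siblingLosesCapacity()
{
    int[] a;
    a.reserve(8);          // room for at least 8 ints, length still 0
    a ~= 1;                // uses the first reserved slot
    int[] b = a;           // b views the same memory as a

    // Both slices end where the used part of the block ends,
    // so both may still grow in place.
    assert(a.capacity >= 8 && b.capacity >= 8);

    a ~= 2;                // extends in place; b is not updated

    // b now ends before the used part of the block ends.
    return a.ptr is b.ptr && b.capacity == 0 && a.capacity >= 8;
}

void main()
{
    assert(siblingLosesCapacity());
}
```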

However, copy() could first append an element, in which case the capacity would be owned by this slice. copy() could then safely use the capacity, knowing very well that the act of appending that one element has dropped the capacities of all other slices to zero.

In pseudo code:

if
  there is enough capacity
  and if copying will spill into capacity
then
   append an element
   copy by spilling into capacity
   set .length appropriately
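
A rough sketch of how the pseudo code might look in practice. copyUsingCapacity is a hypothetical helper, not part of Phobos, and it takes a shortcut: instead of appending one element and then writing past .length, it grows .length up front. That should stay inside the reserved block when capacity suffices, at the cost of default-initializing the new elements before they are overwritten.

```d
import std.algorithm : copy;

// Hypothetical helper, not part of Phobos.
void copyUsingCapacity(ref int[] target, int[] source)
{
    immutable n = source.length;

    // "there is enough capacity" and "copying will spill into capacity"
    if (target.length < n && target.capacity >= n)
    {
        // Claim the reserved space for this slice. Growing within
        // capacity extends in place; the new elements are set to
        // int.init before copy overwrites them.
        target.length = n;
    }

    source.copy(target);   // still requires target.length >= n
}

void main()
{
    int[] b;
    b.reserve(3);
    auto before = b.ptr;

    copyUsingCapacity(b, [1, 2, 3]);

    assert(b == [1, 2, 3]);
    assert(b.ptr is before);   // same block: the reserved space was used
}
```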

Others, please review, implement, prove that it is efficient, and post a pull request. :)

Ali

November 22, 2015
On Saturday, November 21, 2015 23:34:25 Jon D via Digitalmars-d-learn wrote:
> Something I found confusing was the relationship between array
> capacity and copy(). A short example:
>
> void main()
> {
>      import std.algorithm: copy;
>
>      auto a = new int[](3);
>      assert(a.length == 3);
>      [1, 2, 3].copy(a);     // Okay
>
>      int[] b;
>      b.reserve(3);
>      assert(b.capacity >= 3);
>      assert(b.length == 0);
>      [1, 2, 3].copy(b);     // Error
> }
>
> I had expected that copy() would work if the target had sufficient capacity, but that's not the case. Target has to have sufficient length.
>
> If I've understood this correctly, a small change to the documentation for copy() might make this clearer. In particular, the "precondition" section:
>
>      Preconditions:
>      target shall have enough room to accomodate the entirety of
> source.
>
> Clarifying that "enough room" means 'length' rather than 'capacity' might be beneficial.

Honestly, arrays suck as output ranges. They don't get appended to; they get filled, and for better or worse, the documentation for copy is probably assuming that you know that. If you want your array to be appended to when using it as an output range, then you need to use std.array.Appender.
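
For reference, a minimal sketch of the Appender route. copyViaAppender is a name invented for this example. Calling reserve before handing the appender to copy also makes sure its internal state exists before copy receives it by value.

```d
import std.algorithm : copy;
import std.array : appender;

// Hypothetical helper: copies source into a fresh array by appending.
int[] copyViaAppender(int[] source)
{
    auto app = appender!(int[])();
    app.reserve(source.length);   // preallocate; nothing is appended yet
    source.copy(app);             // each element goes through app.put()
    return app.data;              // the array built so far
}

void main()
{
    assert(copyViaAppender([1, 2, 3]) == [1, 2, 3]);
}
```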

- Jonathan M Davis

November 22, 2015
On Sunday, 22 November 2015 at 00:10:07 UTC, Ali Çehreli wrote:
> May I suggest that you improve that page? ;) If you don't already have a clone of the repo, you can do it easily by clicking the "Improve this page" button on that page.
>
Hi Ali, thanks for the quick response. And point taken :) I hadn't noticed those buttons on the doc pages; they look very convenient. There are a couple of formalities I need to look into before making contributions, even small ones, but I'll check into these.
>
> Regarding why copy() cannot use the capacity of the slice, it is because slices don't know about each other, so, copy could not let other slices know that the capacity has just been used by this particular slice.
>
Thanks for the explanation, very helpful for understanding what's going on.

--Jon



November 22, 2015
On Sunday, 22 November 2015 at 00:31:53 UTC, Jonathan M Davis wrote:
>
> Honestly, arrays suck as output ranges. They don't get appended to; they get filled, and for better or worse, the documentation for copy is probably assuming that you know that. If you want your array to be appended to when using it as an output range, then you need to use std.array.Appender.
>
Hi Jonathan, thanks for the reply and the info about std.array.Appender. I was actually using copy to fill an array, not append. However, I also wanted to preallocate the space. And, since I'm mainly trying to understand the language, I was also trying to figure out the difference between these two forms of creating a dynamic array with an initial size:

   auto x = new int[](n);
   int[] y;  y.reserve(n);

The obvious difference is that the first initializes n values, the second form does not. I'm still unclear if there are other material differences, or when one might be preferred over the other :) It was in this context that the behavior of copy surprised me: it wouldn't operate on the second form without first filling in the elements. If this seems unclear, I can provide a slightly longer sample showing what I was doing.
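
A small sketch of the difference between the two forms as observed from the outside. viaNew and viaReserve are names invented for this example.

```d
// Hypothetical helpers contrasting the two forms.
int[] viaNew(size_t n)
{
    return new int[](n);
}

int[] viaReserve(size_t n)
{
    int[] y;
    y.reserve(n);
    return y;
}

void main()
{
    auto x = viaNew(3);
    auto y = viaReserve(3);

    assert(x.length == 3 && x == [0, 0, 0]);   // elements exist, set to int.init
    assert(y.length == 0 && y.capacity >= 3);  // no elements yet, only room

    x[0] = 42;       // fine: index 0 exists
    // y[0] = 42;    // would fail: index 0 is outside y's length
    y ~= 42;         // appending uses the reserved space
    assert(y == [42]);
}
```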

--Jon


November 22, 2015
On Sunday, November 22, 2015 03:19:54 Jon D via Digitalmars-d-learn wrote:
> On Sunday, 22 November 2015 at 00:31:53 UTC, Jonathan M Davis wrote:
> >
> > Honestly, arrays suck as output ranges. They don't get appended to; they get filled, and for better or worse, the documentation for copy is probably assuming that you know that. If you want your array to be appended to when using it as an output range, then you need to use std.array.Appender.
> >
> Hi Jonathan, thanks for the reply and the info about std.array.Appender. I was actually using copy to fill an array, not append. However, I also wanted to preallocate the space. And, since I'm mainly trying to understand the language, I was also trying to figure out the difference between these two forms of creating a dynamic array with an initial size:
>
>     auto x = new int[](n);
>     int[] y;  y.reserve(n);
>
> The obvious difference is that the first initializes n values, the second form does not. I'm still unclear if there are other material differences, or when one might be preferred over the other :) It was in this context that the behavior of copy surprised me: it wouldn't operate on the second form without first filling in the elements. If this seems unclear, I can provide a slightly longer sample showing what I was doing.

If you haven't read this article yet, then you should read it:

http://dlang.org/d-array-article.html

It doesn't use the official terminology (in particular, it talks about T[] as being a slice and the underlying GC buffer as being the dynamic array, whereas per the language spec T[] is the dynamic array (which is also a slice of some sort of memory), and the underlying GC buffer that typically backs a dynamic array is just a GC buffer and is essentially an implementation detail), but it should give you good insight into how arrays work in D.

- Jonathan M Davis

November 23, 2015
On 11/21/15 10:19 PM, Jon D wrote:
> On Sunday, 22 November 2015 at 00:31:53 UTC, Jonathan M Davis wrote:
>>
>> Honestly, arrays suck as output ranges. They don't get appended to;
>> they get filled, and for better or worse, the documentation for copy
>> is probably assuming that you know that. If you want your array to be
>> appended to when using it as an output range, then you need to use
>> std.array.Appender.
>>
> Hi Jonathan, thanks for the reply and the info about std.array.Appender.
> I was actually using copy to fill an array, not append. However, I also
> wanted to preallocate the space. And, since I'm mainly trying to
> understand the language, I was also trying to figure out the difference
> between these two forms of creating a dynamic array with an initial size:
>
>     auto x = new int[](n);
>     int[] y;  y.reserve(n);

If you want to change the size of the array, use length:

y.length = n;

This will extend y to the correct length, automatically reserving a block of data that can hold it, and allow you to write to the array.

All reserve does is to make sure there is enough space so you can append that much data to it. It is not relevant to your use case.
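
Applied to the original example, that suggestion might look like the sketch below. filledCopy is a hypothetical helper name.

```d
import std.algorithm : copy;

// Hypothetical helper: sizes the target with .length, then fills it.
int[] filledCopy(int[] source)
{
    int[] y;
    y.length = source.length;   // allocates and default-initializes
    source.copy(y);             // now there is enough length to fill
    return y;
}

void main()
{
    assert(filledCopy([1, 2, 3]) == [1, 2, 3]);
}
```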

> The obvious difference is that the first initializes n values, the second
> form does not. I'm still unclear if there are other material
> differences, or when one might be preferred over the other :) It was
> in this context the behavior of copy surprised me, that it wouldn't
> operate on the second form without first filling in the elements. If
> this seems unclear, I can provide a slightly longer sample showing what
> I was doing.

Extending length affects the given array, growing it if necessary. reserve is ONLY relevant if you are using appending (arr ~= x). It doesn't actually affect the "slice" or the variable you are using, at all (except to possibly point it at newly allocated space).

copy uses an "output range" as its destination. The output range supports taking elements and putting them somewhere. In the case of a simple array, putting them somewhere means assigning to the first element, and then moving to the next one.
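
A short sketch of what "assigning to the first element, and then moving to the next one" means for a plain array. putTwo is a made-up helper; put comes from std.range. Note that put shrinks the slice it is handed, so the example works on a second slice of the same memory.

```d
import std.algorithm : copy;
import std.range : put;

// Hypothetical helper: writes two values through the output-range
// interface and returns the part of buf that has not been written yet.
int[] putTwo(int[] buf, int first, int second)
{
    auto rest = buf;     // put() shrinks the slice it is handed
    put(rest, first);    // assigns rest[0], then drops it from rest
    put(rest, second);
    return rest;
}

void main()
{
    auto buf = new int[](4);
    auto rest = putTwo(buf, 10, 20);
    assert(buf == [10, 20, 0, 0]);
    assert(rest.length == 2);

    // copy does the same for a whole range and returns the unfilled part.
    auto unfilled = [1, 2, 3].copy(buf);
    assert(buf == [1, 2, 3, 0]);
    assert(unfilled.length == 1);
}
```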

-Steve
November 23, 2015
On Monday, 23 November 2015 at 15:19:08 UTC, Steven Schveighoffer wrote:
> On 11/21/15 10:19 PM, Jon D wrote:
>> On Sunday, 22 November 2015 at 00:31:53 UTC, Jonathan M Davis wrote:
>>>
>>> Honestly, arrays suck as output ranges. They don't get appended to;
>>> they get filled, and for better or worse, the documentation for copy
>>> is probably assuming that you know that. If you want your array to be
>>> appended to when using it as an output range, then you need to use
>>> std.array.Appender.
>>>
>> Hi Jonathan, thanks for the reply and the info about std.array.Appender.
>> I was actually using copy to fill an array, not append. However, I also
>> wanted to preallocate the space. And, since I'm mainly trying to
>> understand the language, I was also trying to figure out the difference
>> between these two forms of creating a dynamic array with an initial size:
>>
>>     auto x = new int[](n);
>>     int[] y;  y.reserve(n);
>
> If you want to change the size of the array, use length:
>
> y.length = n;
>
> This will extend y to the correct length, automatically reserving a block of data that can hold it, and allow you to write to the array.
>
> All reserve does is to make sure there is enough space so you can append that much data to it. It is not relevant to your use case.
>
>> The obvious difference is that the first initializes n values, the second
>> form does not. I'm still unclear if there are other material
>> differences, or when one might be preferred over the other :) It was
>> in this context the behavior of copy surprised me, that it wouldn't
>> operate on the second form without first filling in the elements. If
>> this seems unclear, I can provide a slightly longer sample showing what
>> I was doing.
>
> Extending length affects the given array, growing it if necessary. reserve is ONLY relevant if you are using appending (arr ~= x). It doesn't actually affect the "slice" or the variable you are using, at all (except to possibly point it at newly allocated space).
>
> copy uses an "output range" as its destination. The output range supports taking elements and putting them somewhere. In the case of a simple array, putting them somewhere means assigning to the first element, and then moving to the next one.
>
> -Steve

Thanks for the reply. And for your article (which Jonathan recommended). It clarified a number of things.

In the example I gave, what I was really wondering was if there is a difference between allocating with 'new' or with 'reserve', or with 'length', for that matter. That is, is there a material difference between:

    auto x = new int[](n);
    int[] y; y.length = n;

I can imagine that the first might be faster, but otherwise there appears to be no difference. As the article stresses, the question is the ownership model. If I'm understanding, both cause an allocation into the runtime managed heap.

--Jon

November 24, 2015
On 11/23/15 4:29 PM, Jon D wrote:
> On Monday, 23 November 2015 at 15:19:08 UTC, Steven Schveighoffer wrote:
>> On 11/21/15 10:19 PM, Jon D wrote:
>>> On Sunday, 22 November 2015 at 00:31:53 UTC, Jonathan M Davis wrote:
>>>>
>>>> Honestly, arrays suck as output ranges. They don't get appended to;
>>>> they get filled, and for better or worse, the documentation for copy
>>>> is probably assuming that you know that. If you want your array to be
>>>> appended to when using it as an output range, then you need to use
>>>> std.array.Appender.
>>>>
>>> Hi Jonathan, thanks for the reply and the info about std.array.Appender.
>>> I was actually using copy to fill an array, not append. However, I also
>>> wanted to preallocate the space. And, since I'm mainly trying to
>>> understand the language, I was also trying to figure out the difference
>>> between these two forms of creating a dynamic array with an initial
>>> size:
>>>
>>>     auto x = new int[](n);
>>>     int[] y;  y.reserve(n);
>>
>> If you want to change the size of the array, use length:
>>
>> y.length = n;
>>
>> This will extend y to the correct length, automatically reserving a
>> block of data that can hold it, and allow you to write to the array.
>>
>> All reserve does is to make sure there is enough space so you can
>> append that much data to it. It is not relevant to your use case.
>>
>>> The obvious difference is that the first initializes n values, the second
>>> form does not. I'm still unclear if there are other material
>>> differences, or when one might be preferred over the other :) It was
>>> in this context the behavior of copy surprised me, that it wouldn't
>>> operate on the second form without first filling in the elements. If
>>> this seems unclear, I can provide a slightly longer sample showing what
>>> I was doing.
>>
>> Extending length affects the given array, growing it if necessary.
>> reserve is ONLY relevant if you are using appending (arr ~= x). It
>> doesn't actually affect the "slice" or the variable you are using, at
>> all (except to possibly point it at newly allocated space).
>>
>> copy uses an "output range" as its destination. The output range
>> supports taking elements and putting them somewhere. In the case of a
>> simple array, putting them somewhere means assigning to the first
>> element, and then moving to the next one.
>
> Thanks for the reply. And for your article (which Jonathan recommended).
> It clarified a number of things.
>
> In the example I gave, what I was really wondering was if there is a
> difference between allocating with 'new' or with 'reserve', or with
> 'length', for that matter. That is, is there a material difference between:
>
>      auto x = new int[](n);
>      int[] y; y.length = n;

There is no difference at all, other than the function that is called (the former will call an allocation function; the latter will call a length-setting function, which will then determine whether more data is needed and, finding that it is, call the allocation function).

> I can imagine that the first might be faster, but otherwise there
> appears no difference. As the article stresses, the question is the
> ownership model. If I'm understanding, both cause an allocation into the
> runtime managed heap.

You are correct.
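
One way to check this from user code, as a sketch: isGcAllocated is a hypothetical helper built on GC.addrOf from core.memory, which returns null for memory the GC does not manage.

```d
import core.memory : GC;

// Hypothetical helper: true if arr points into a GC-managed block.
bool isGcAllocated(const(int)[] arr)
{
    return arr.ptr !is null && GC.addrOf(arr.ptr) !is null;
}

void main()
{
    enum n = 5;

    auto x = new int[](n);
    int[] y;
    y.length = n;

    assert(x == y);                                // same length, all int.init
    assert(isGcAllocated(x) && isGcAllocated(y));  // both live on the GC heap

    int[n] onStack;                                // static array, not GC memory
    assert(!isGcAllocated(onStack[]));
}
```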

-Steve
November 24, 2015
On 11/23/2015 04:03 PM, Steven Schveighoffer wrote:
> On 11/23/15 4:29 PM, Jon D wrote:

>> In the example I gave, what I was really wondering was if there is a
>> difference between allocating with 'new' or with 'reserve', or with
>> 'length', for that matter. That is, is there a material difference
>> between:
>>
>>      auto x = new int[](n);
>>      int[] y; y.length = n;
>
> There is no difference at all, other than the function that is called
> (the former will call an allocation function, the latter will call a
> length setting function, which then will determine if more data is
> needed, and finding it is, call the allocation function).

Although Jon's example above does not compare reserve, I have to ask: How about non-trivial types? Both cases above would set all elements to .init, right? So, I think reserve would be faster if copy() knew how to take advantage of capacity. It could emplace elements instead of copying, no?
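
A sketch of the initialization point with a non-trivial element type. Point and makeInitialized are invented for this example. std.array.uninitializedArray is shown only as an existing way to skip the .init pass; it is not the reserve-plus-emplace scheme suggested above.

```d
import std.array : uninitializedArray;

// Hypothetical element type with non-zero defaults.
struct Point
{
    int x = 7;
    int y = -1;
}

// Hypothetical helper: allocates by setting .length.
Point[] makeInitialized(size_t n)
{
    Point[] b;
    b.length = n;
    return b;
}

void main()
{
    // Both forms set every element to Point.init.
    auto a = new Point[](2);
    auto b = makeInitialized(2);
    assert(a[0] == Point.init && b[1] == Point(7, -1));

    // Skips initialization; contents are unspecified until assigned.
    auto c = uninitializedArray!(Point[])(2);
    c[0] = Point(1, 2);
    c[1] = Point(3, 4);
    assert(c == [Point(1, 2), Point(3, 4)]);
}
```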

Ali
