new T[size] vs .reserve - D Programming Language Discussion Forum

February 02, 2013

new T[size] vs .reserve

Posted by Namespace

Currently, something like new ubyte[4]; reserves space for _at least_ 4 items.
But if I now want to store something, it ends up at index 4, not index 0.
Let me explain with a brief example:

[code]
import std.stdio;

void main() {
	ubyte[] arr = new ubyte[4];
	arr ~= 4; // desirable: [4, 0, 0, 0];
	writeln(arr); // prints: [0, 0, 0, 0, 4]
	
	ubyte[] arr2;
	arr2.reserve(4);
	arr2 ~= 4; // expect: [4, 0, 0, 0];
	writeln(arr2); // prints: [4] just as well
}
[/code]

So is there any reason why this behaviour is like it is?

When I looked at arr.length and arr.capacity for the first time I was shocked: I only wanted space for 4 items, but I got space for 15.
I read in the docs that .reserve extends the space to at least SIZE items, but maybe more. But neither then nor since have I found anything about new ubyte[SIZE]. So I ask:
wouldn't it be better if new ubyte[4] reserved space for only 4 items and reallocated only if I append more than 4?

February 02, 2013

Re: new T[size] vs .reserve

Posted by Maxim Fomin
in reply to Namespace

On Saturday, 2 February 2013 at 16:36:29 UTC, Namespace wrote:
> Currently, something like new ubyte[4]; reserves space for _at least_ 4 items.
> But if I now want to store something, it ends up at index 4, not index 0.
> Let me explain with a brief example:
>
> [code]
> import std.stdio;
>
> void main() {
> 	ubyte[] arr = new ubyte[4];

You asked for an array of four zeros (T.init for ubyte is 0).

> 	arr ~= 4; // desirable: [4, 0, 0, 0];
> 	writeln(arr); // prints: [0, 0, 0, 0, 4]

You got it. Then you appended 4 to it, so you got the proper result.

> 	ubyte[] arr2;

You asked for an empty array.

> 	arr2.reserve(4);
> 	arr2 ~= 4; // expect: [4, 0, 0, 0];
> 	writeln(arr2); // prints: [4] just as well
> }

You appended 4 to an empty array and got [4]. By the way, note that the output is not [4, 0, 0, 0], it is [4], since no one has pushed three zeros there.

> [/code]
>
> So is there any reason why this behaviour is like it is?

I find it consistent and compliant with the spec. new ubyte[4] and reserve(4) do different things.

> When I looked at arr.length and arr.capacity for the first time I was shocked: I only wanted space for 4 items, but I got space for 15.
> I read in the docs that .reserve extends the space to at least SIZE items, but maybe more. But neither then nor since have I found anything about new ubyte[SIZE]. So I ask:
> wouldn't it be better if new ubyte[4] reserved space for only 4 items and reallocated only if I append more than 4?

Since you have two tools which do different things, you can pick the one that does what you want. What's the problem here?
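
To make the difference concrete, here is a minimal sketch. The helper names viaNew and viaReserve are made up for illustration and are not part of any library. It shows that new ubyte[4] yields four initialized elements you index into, while .reserve only sets capacity aside and leaves the length at 0, so appending starts at index 0.

```d
import std.stdio;

// Hypothetical helper: allocate four elements, then overwrite the first one.
ubyte[] viaNew() {
    ubyte[] arr = new ubyte[4]; // length 4, every element is ubyte.init (0)
    arr[0] = 4;                 // index into the elements that already exist
    return arr;
}

// Hypothetical helper: reserve room first, then append.
ubyte[] viaReserve() {
    ubyte[] arr;     // length 0
    arr.reserve(4);  // capacity for at least 4 elements, length is still 0
    arr ~= 4;        // the appended element lands at index 0
    return arr;
}

void main() {
    writeln(viaNew());     // prints: [4, 0, 0, 0]
    writeln(viaReserve()); // prints: [4]
}
```

So if the goal is [4, 0, 0, 0], indexing into the array from new does it; if the goal is to grow from empty without repeated reallocation, .reserve plus ~= does it.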

February 02, 2013

Re: new T[size] vs .reserve

Posted by Namespace
in reply to Maxim Fomin

No one said this is a problem, did they? (;
But why should I create an array of size 4 if my append resizes it to 5?

I'm asking because we currently have nothing (nice) for _fixed_ size arrays at runtime. That should change. Therefore new T[size] would be a good fit.

February 02, 2013

Re: new T[size] vs .reserve

Posted by FG
in reply to Namespace

On 2013-02-02 17:36, Namespace wrote:
>      ubyte[] arr = new ubyte[4];
>      arr ~= 4; // desirable: [4, 0, 0, 0];

Why not arr[0] = 4? What is so special about having 4 elements?

> When I looked at arr.length and arr.capacity for the first time I was shocked: I
> only wanted space for 4 items, but I got space for 15.

I think you are out of luck with built-in dynamic arrays.
It seems that even for an array of length 1 it gives a minimum of 15 bytes of
storage, so you can fit 15 bytes, 7 shorts, 3 ints, 1 long.
I'm not sure why it is 15, given alignment and all...

February 02, 2013

Re: new T[size] vs .reserve

Posted by Namespace
in reply to FG

On Saturday, 2 February 2013 at 17:48:14 UTC, FG wrote:
> On 2013-02-02 17:36, Namespace wrote:
>>     ubyte[] arr = new ubyte[4];
>>     arr ~= 4; // desirable: [4, 0, 0, 0];
>
> Why not arr[0] = 4? What is so special about having 4 elements?

Example:

struct Color {
public:
    ubyte[4] colors;
}

ubyte[] data = new ubyte[color_data.length * 4]; // enough storage
foreach (ref const Color col; color_data) {
    data ~= col.colors;
}

Currently impossible, and even with manual indexing it is absolutely annoying and unnecessarily complicated.
Sure, you could do the same with .reserve, but the point is that you do not want more storage than color_data.length * 4.
Currently I use my own struct, which reserves only color_data.length * 4 bytes of memory and starts the index at 0.
But I'm using D because it is nice and simple, so why should I create my own data structure, and use ugly malloc, realloc and free, for something so commonly needed if the language could and should do that for you?

February 02, 2013

Re: new T[size] vs .reserve

Posted by FG
in reply to Namespace

On 2013-02-02 19:01, Namespace wrote:
> Example:
>
> struct Color {
> public:
>      ubyte[4] colors;
> }
>
> ubyte[] data = new ubyte[color_data.length * 4]; // enough storage
> foreach (ref const Color col; color_data) {
>      data ~= col.colors;
> }

Sorry, but what is the point of data having only 4 bytes reserved
at the beginning? What is wrong with this:

    ubyte[] data;
    data.reserve(color_data.length * 4);
    foreach (ref const Color col; color_data)
        data ~= col.colors;

Are you uncomfortable, because it may allocate twice as much space
as you need (for bigger color_data)?
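
If an exact length matters more than being able to append, another option is to allocate with new and copy each colour into its slot with a slice assignment. This is only a sketch: flatten is a hypothetical helper name, and the GC may still round the underlying block up to a larger size even though the array length is exactly colorData.length * 4.

```d
import std.stdio;

struct Color {
    ubyte[4] colors;
}

// Hypothetical helper: copy every Color into one flat, pre-sized array.
ubyte[] flatten(const Color[] colorData) {
    // Exactly colorData.length * 4 zero-initialized elements.
    auto data = new ubyte[colorData.length * 4];
    foreach (i, ref const col; colorData) {
        // Slice assignment copies the four bytes into slot i.
        data[i * 4 .. i * 4 + 4] = col.colors[];
    }
    return data;
}

void main() {
    Color[] colorData = [Color([255, 0, 0, 255]), Color([0, 255, 0, 255])];
    writeln(flatten(colorData));
}
```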

February 02, 2013

Re: new T[size] vs .reserve

Posted by Namespace
in reply to FG

> Are you uncomfortable, because it may allocate twice as much space
> as you need (for bigger color_data)?

Yes, that's the point.
Sorry, I cannot express myself very well in English.

February 02, 2013

Re: new T[size] vs .reserve

Posted by FG
in reply to Namespace

On 2013-02-02 19:53, Namespace wrote:
>> Are you uncomfortable, because it may allocate twice as much space
>> as you need (for bigger color_data)?
>
> Yes, that's the point.
> Sorry, I cannot express myself very well in English.

You're right. It's surprising for anyone used to dealing with std::vector,
that it actually reserves more than you specify.

Another odd thing - when pushing back to a vector its capacity grows:
1, 2, 4, 8, 16, 32 ..., but with D arrays it's like 7, 15, 31, 63...
Why 2**n - 1? What secret data is hidden after the block? ;)

February 02, 2013

Re: new T[size] vs .reserve

Posted by Steven Schveighoffer
in reply to Namespace

On Sat, 02 Feb 2013 11:36:28 -0500, Namespace <rswhite4@googlemail.com> wrote:

> Currently, something like new ubyte[4]; reserves space for _at least_ 4 items.
> But if I now want to store something, it ends up at index 4, not index 0.
> Let me explain with a brief example:
>
> [code]
> import std.stdio;
>
> void main() {
> 	ubyte[] arr = new ubyte[4];
> 	arr ~= 4; // desirable: [4, 0, 0, 0];
> 	writeln(arr); // prints: [0, 0, 0, 0, 4]
> 	
> 	ubyte[] arr2;
> 	arr2.reserve(4);
> 	arr2 ~= 4; // expect: [4, 0, 0, 0];
> 	writeln(arr2); // prints: [4] just as well
> }
> [/code]
>
> So is there any reason why this behaviour is like it is?
>
> When I looked at arr.length and arr.capacity for the first time I was shocked: I only wanted space for 4 items, but I got space for 15.
> I read in the docs that .reserve extends the space to at least SIZE items, but maybe more. But neither then nor since have I found anything about new ubyte[SIZE]. So I ask:
> wouldn't it be better if new ubyte[4] reserved space for only 4 items and reallocated only if I append more than 4?

Heap block sizes start at 16.  One byte overhead is used to store the array length, so the minimum size of ANY array allocation is 15 bytes.

Then it doubles to 32, then to 64, then to 128, 256, 512, etc. until you get to a page size (4096).  Then it scales linearly by pages.
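
A quick way to watch this on your own machine is to append in a loop and record each time .capacity changes. This is a sketch only: capacitySteps is a made-up helper name, and the exact numbers depend on the druntime version and platform, so no output is shown here.

```d
import std.stdio;

// Hypothetical helper: append `appends` bytes one at a time and
// collect every distinct capacity value seen along the way.
size_t[] capacitySteps(size_t appends) {
    ubyte[] arr;
    size_t[] steps;
    size_t last = 0;
    foreach (i; 0 .. appends) {
        arr ~= cast(ubyte) i;       // value is irrelevant, only growth matters
        if (arr.capacity != last) { // the runtime moved to a bigger block
            last = arr.capacity;
            steps ~= last;
        }
    }
    return steps;
}

void main() {
    writeln(capacitySteps(1000));
}
```

With the block sizes described above, one would expect each step to be a block size minus the bookkeeping overhead, which is where values like 15 and 31 come from.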

I think you misunderstand what reserve and append do.  You should read this article to understand arrays and appending better: http://dlang.org/d-array-article.html

-Steve

February 02, 2013

Re: new T[size] vs .reserve

Posted by monarch_dodra
in reply to FG

On Saturday, 2 February 2013 at 19:49:39 UTC, FG wrote:
> On 2013-02-02 19:53, Namespace wrote:
>>> Are you uncomfortable, because it may allocate twice as much space
>>> as you need (for bigger color_data)?
>>
>> Yes, that's the point.
>> Sorry, I cannot express myself very well in English.
>
> You're right. It's surprising for anyone used to dealing with std::vector,
> that it actually reserves more than you specify.

FYI: std::vector will do exactly the same thing. The actual growth scheme may be different, but that is purely an implementation detail.

Another thing to take into account: In C++, if you request 10 bytes of allocation space, but the underlying implementation actually allocates a 32 byte block, you'll know nothing of it. If later, you want to grow those 10 bytes to 20 bytes, you'll have to request a new allocation.

std::vector has actually no idea how much space actually gets allocated. It requests a certain amount but has no idea how much it really gets. At best, it can tell you how much it requested.

In D, you can exploit the last drop of allocation space actually allocated. It is the underlying array itself that tells you how much space there is left.

> Another odd thing - when pushing back to a vector its capacity grows:
> 1, 2, 4, 8, 16, 32 ..., but with D arrays it's like 7, 15, 31, 63...
> Why 2**n - 1? What secret data is hidden after the block? ;)

The currently used size.

In C++, std::vector is a struct that has a "size" member and a pointer to a payload. In D, the "size" data is embedded straight into the payload. Kind of like how std::string is implemented, actually.

This has a two-fold advantage: The actual slice object is light (just a fat pointer). Also, when you have two (or more) slices that reference the same data, they *know* whether or not they can safely append without clobbering the data of another slice.
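
Here is a small sketch of that second point. The helper name appendLeavesOriginalIntact is made up for illustration. A slice that stops before the end of the used data reports a capacity of 0, so appending to it reallocates instead of overwriting what the other slice still sees.

```d
import std.stdio;

// Hypothetical helper: returns true if appending to a prefix slice
// left the elements of the original slice untouched.
bool appendLeavesOriginalIntact() {
    int[] whole = [1, 2, 3, 4];
    int[] front = whole[0 .. 2];     // shares memory with whole

    // front does not end where the used data ends, so it has no room to grow.
    bool noRoom = front.capacity == 0;

    front ~= 99;                     // copies to a new block, then appends

    return noRoom
        && whole == [1, 2, 3, 4]     // whole[2] was not overwritten
        && front == [1, 2, 99]
        && front.ptr !is whole.ptr;  // front now points at different memory
}

void main() {
    writeln(appendLeavesOriginalIntact()); // prints: true
}
```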

I suggest you read this:
http://dlang.org/d-array-article.html
It is a very interesting read, especially for those of us with C++ background.

One very (very) important thing to realize is that std::vector is a container. D slices are not containers: they are ranges. They iterate on an underlying array object, but they are not containers themselves.

The underlying object, the so-called "dynamic array", is actually an obscure and hidden object you cannot access directly.

There is some ambiguity between both terms, and they are often used interchangeably, but to really understand what is going on, you need to imagine them as two different objects interacting.
