Thread overview
why does clearing an array set its capacity to 0?

  Vlad Levenfeld   (Jul 01, 2014)
  Vlad Levenfeld   (Jul 01, 2014)
  Ivan Kazmenko    (Jul 01, 2014)
  Vlad Levenfeld   (Jul 01, 2014)
  Ali Çehreli      (Jul 01, 2014)
  Vlad Levenfeld   (Jul 02, 2014)
  bearophile       (Jul 02, 2014)
  Chris Cain       (Jul 01, 2014)
July 01, 2014
I'm trying to implement some wrappers atop preallocated arrays, so I call reserve in the constructor, keep an invariant to ensure that the array.ptr never moves, and use in-contracts to make sure insertions and appends don't put me over the array's capacity.

I find that I'm able to safely add and remove elements (I increment xor decrement the length when I do this), but when I call clear on the backing array or set its length to 0, the capacity also goes to 0.

Why is this, and what do I do about it?
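
For reference, here's a minimal sketch of the kind of thing I'm seeing, with the wrapper details stripped out (the numbers here are just for illustration):

void main()
{
    import std.stdio : writeln;

    int[] arr;
    arr.reserve(100);       // preallocate room for at least 100 elements
    writeln(arr.capacity);  // some value >= 100

    arr ~= 1;               // appends stay inside the reserved block
    arr ~= 2;
    arr.length = 0;         // "clearing" the array
    writeln(arr.capacity);  // prints 0 -- the reserved space seems to be gone
}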


---
As an aside, I remember seeing a DConf 2013 talk by Don Clugston where he mentioned that at Sociomantic Labs they operate within slices of larger, pinned arrays (or something along those lines) to minimize GC activity. (I'm very interested in seeing his DConf 2014 talk but can't seem to find a video.)

This is basically what I'm trying to do, specifically in the context of a custom allocator: allocating custom sliced range types that can do fast appends (as well as containers that don't call new). If anyone has any advice on this in general I would be grateful for it.
July 01, 2014
I was mistaken earlier, decrementing the length counter also sets the capacity to 0.
July 01, 2014
In short:
> Why is this, and what do I do about it?

http://dlang.org/phobos/object.html#.assumeSafeAppend

On Tuesday, 1 July 2014 at 12:33:15 UTC, Vlad Levenfeld wrote:
> I'm trying to implement some wrappers atop preallocated arrays, so I call reserve in the constructor, keep an invariant to ensure that the array.ptr never moves, and use in-contracts to make sure insertions and appends don't put me over the array's capacity.
>
> I find that I'm able to safely add and remove elements (I increment xor decrement the length when I do this), but when I call clear on the backing array or set its length to 0, the capacity also goes to 0.

As you noted in your followup post, decrementing also sets capacity to 0.

The reason is simple. Observe:

auto arr = [1, 2, 3];

It is safe to append 4 to that.

auto arr = [1, 2, 3, 4];
auto other = arr[];
arr.length = 2;

It is *not* safe to append 5 to arr here, because doing so would change other to [1, 2, 5, 4], which is (probably) not what you want (or, at least, you haven't made it explicit that you're okay with this behavior). Since slices are unaware of whether other views of the memory exist, they must always reallocate once they've been shrunk, to avoid the possibility of overwriting elements still referenced by other views of the array.

When you use "assumeSafeAppend", you are guaranteeing that either no other views exist, or if they do exist that you don't care if they are overwritten. Take note that the second case is *really bad* if your element type is immutable, for obvious reasons.
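
A quick sketch of how assumeSafeAppend changes this (illustrative only; the exact capacity values depend on the runtime):

void main()
{
    import std.stdio : writeln;

    int[] arr;
    arr.reserve(100);
    arr ~= [1, 2, 3, 4];
    writeln(arr.capacity);  // >= 100: plenty of room left in the block

    arr.length = 2;         // shrink: the runtime can no longer prove that
    writeln(arr.capacity);  // appending in place is safe, so this prints 0

    arr.assumeSafeAppend(); // promise: no other slice cares about [2 .. 4]
    writeln(arr.capacity);  // capacity is back; ~= reuses the block again
}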
July 01, 2014
On Tuesday, 1 July 2014 at 13:03:54 UTC, Vlad Levenfeld wrote:
> I was mistaken earlier, decrementing the length counter also sets the capacity to 0.

Besides just learning to use assumeSafeAppend (as mentioned already), I'd also recommend reading the article on D slices to get a deeper understanding of the ground you are building on top of:

http://dlang.org/d-array-article.html
July 01, 2014
Thanks for your replies. The article was quite helpful in clarifying some questions I had.

I decided to benchmark the different append methods (built with ["releaseMode", "inline", "noBoundsCheck", "optimize"]) by appending 10,000 elements to arrays using
  ~=,
  Appender,
  with and without first reserving,
  with and without assumeSafeAppend,
  and a custom array type that preallocates an array and keeps its own length.

The custom array was fastest, then the Appender was 2-3x slower, and the dynamic array 12-13x slower. What surprised me, though, was that calling reserve didn't actually give much performance improvement. In fact, of all the ~= benchmarks, plain dynamic arrays had the best performance, followed by reserved + assumeSafeAppend, followed by reserved only.

For the Appender, reserving was negligibly faster.

I'm not sure if I'm doing something wrong, but these results seem pretty extreme to me, as the description of reserve gave me the impression that it was roughly the same thing as the custom array (i.e. slicing preallocated data).
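
Here is roughly the shape of what I ran (a simplified sketch, not my exact code; the function names are made up, and the timing harness shown uses std.datetime.stopwatch.benchmark, so the numbers won't match mine exactly):

import std.array : appender;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

enum N = 10_000;

void plainAppend()      // ~= with no preparation
{
    int[] a;
    foreach (i; 0 .. N) a ~= i;
}

void reservedAppend()   // ~= after reserve
{
    int[] a;
    a.reserve(N);
    foreach (i; 0 .. N) a ~= i;
}

void appenderAppend()   // std.array.Appender
{
    auto a = appender!(int[]);
    a.reserve(N);
    foreach (i; 0 .. N) a.put(i);
}

void customAppend()     // preallocate once, track the length by hand
{
    auto buf = new int[](N);
    size_t len = 0;
    foreach (i; 0 .. N) buf[len++] = i;
}

void main()
{
    auto results = benchmark!(plainAppend, reservedAppend,
                              appenderAppend, customAppend)(100);
    writeln(results);
}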

July 01, 2014
On 07/01/2014 03:51 PM, Vlad Levenfeld wrote:

> Thanks for your replies. The article was quite helpful in clarifying
> some questions I had.
>
> I decided to benchmark the different append methods (with
> ["releaseMode", "inline", "noBoundsCheck", "optimize"]) by appending
> 10,000 elements to arrays with
>    ~=,
>    Appender,
>    with and without first reserving,
>    with and without assumeSafeAppend
>    and with a custom array type that preallocated an array and kept its
> own length.
>
> The custom array was fastest,

How did you allocate your array? The GC has a special APPENDABLE attribute for memory blocks that are going to be used as slices:

  http://dlang.org/phobos/core_memory.html#.GC.BlkAttr.APPENDABLE

However, if you end up using your block through a slice, you might get the same performance as an ordinary slice anyway.
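
For example, a block with those attributes can be requested like this (just a sketch; as the follow-up below notes, the attribute by itself may not be enough for ~= and capacity to work on the raw block):

import core.memory : GC;

void main()
{
    enum n = 1000;

    // Request a block that is flagged as appendable and that the GC
    // will not scan for pointers (int has none).
    auto p = cast(int*) GC.malloc(n * int.sizeof,
                                  GC.BlkAttr.APPENDABLE | GC.BlkAttr.NO_SCAN);

    auto slice = p[0 .. 0];   // an empty slice over the raw block

    // Caveat: the array runtime also keeps a "used length" inside the
    // block, which new T[n] and ~= normally maintain for you; a raw
    // GC.malloc call does not set that up.
}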

> then the appender was over 2-3x slower,
> and the dynamic array 12-13x slower.

Are you doing bounds checking for your array? Try the -boundscheck=off compiler flag. (That may be new in 2.066. Otherwise, try -noboundscheck.)

> What surprised me though, was that
> calling reserve  didn't actually give much performance improvement. In
> fact, of all the ~= benchmarks, plain dynamic arrays had the best
> performance,

The runtime is very smart about that. As you read in that article, if there is room at the end of the block, appending to the slice costs almost nothing.

> followed by reserved and assumedSafe, followed by reserved.
>
> For the Appender, reserving was negligibly faster.
>
> I'm not sure if I'm doing something wrong but these seem pretty extreme
> to me, as the description of reserve gave me the impression that it was
> roughly the same thing as the custom array (i.e. slicing preallocated
> data).

With a smart growth policy like 50%, the cost of memory allocation is amortized and becomes negligible. reserve() would bring some benefit, but it is not necessary in most cases. However, when you know the size beforehand, there is no reason not to take advantage of it either.
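
For example, here is a quick way to watch that amortization in action (illustrative only; the exact capacities depend on the runtime):

void main()
{
    import std.stdio : writefln;

    int[] a;
    size_t lastCap = 0;

    foreach (i; 0 .. 10_000)
    {
        a ~= i;
        if (a.capacity != lastCap)   // report only when the block actually grows
        {
            writefln("length %5s  capacity %5s", a.length, a.capacity);
            lastCap = a.capacity;
        }
    }
}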

Ali

July 02, 2014
> How did you allocate your array? The GC has a special APPENDABLE attribute for memory blocks that are going to be used as slices:

I allocated with new T[n]. I've also tried allocating with both the APPENDABLE and NO_SCAN block attributes, but beyond a certain allocation size this doesn't work: I get a reported capacity of 0 and any append reallocates. (8000 does it for me; I haven't looked for the lower bound.)

> Are you doing bounds checking for your array? Try the -boundscheck=off compiler flag. (That may be new in 2.066. Otherwise, try -noboundscheck.)

I had it set through dub's "buildOptions". Will there be a @nogc or @noheap flag in 2.066? I read some forum posts about it but can't seem to find whether it will be part of the next release.
July 02, 2014
Vlad Levenfeld:

> Will there be a @nogc or @noheap flag in 2.066?

A @nogc, yes. (The idea of @noheap was rejected and I've closed the enhancement request.)

Bye,
bearophile