Avoid deallocate empty arrays?
December 17, 2020
It's common to use arrays for buffering, which means constantly adding elements and then emptying them. I have seen that when the number of elements reaches zero, the array implementation deallocates the array, which shows up as a capacity of zero. This of course leads to constant allocation and deallocation of the array.

One workaround is to allocate the array once and then keep track of the position yourself, rather than shrinking the array so that the length field always tracks the number of elements. This is possible, but if you want dynamically increasing capacity you have to track that yourself too.

However, is there a way to tell the array not to deallocate its storage, and just to grow the array when necessary?
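A minimal sketch of the workaround described above (the `ManualBuffer` name and its layout are illustrative, not from any library): the backing array is allocated once, and a separate `used` index tracks the logical number of elements, so emptying the buffer never releases the allocation.

```d
import std.stdio;

// Hypothetical buffer: `storage` is allocated once, and `used` tracks
// the logical number of elements; clearing never deallocates.
struct ManualBuffer
{
    ubyte[] storage;
    size_t used;

    this(size_t cap) { storage = new ubyte[](cap); }

    void add(const(ubyte)[] data)
    {
        // Grow the backing array manually when needed.
        if (used + data.length > storage.length)
            storage.length = (used + data.length) * 2;
        storage[used .. used + data.length] = data[];
        used += data.length;
    }

    void clear() { used = 0; } // storage stays allocated
}

void main()
{
    auto b = ManualBuffer(16);
    b.add(cast(const(ubyte)[])"aa");
    writeln(b.used);           // 2
    b.clear();
    writeln(b.storage.length); // still 16; no deallocation happened
}
```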
December 17, 2020
On Thursday, 17 December 2020 at 16:11:37 UTC, IGotD- wrote:
> It's common using arrays for buffering

Outside of CTFE, use an Appender.¹ Unless you have a const/immutable element type, Appender can shrink and reuse space.² If you do, reallocation is necessary anyway, so as not to break const/immutable's guarantees.

¹ https://dlang.org/phobos/std_array.html#appender
² https://dlang.org/phobos/std_array.html#.Appender.clear
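A minimal sketch of what that looks like in practice (the buffer name and sizes are illustrative): `clear()` resets the length but keeps the allocation for reuse, which is exactly the behaviour the original question asks for.

```d
import std.array : appender;
import std.stdio : writeln;

void main()
{
    auto buf = appender!(ubyte[])();
    buf.reserve(16);                 // allocate once up front

    buf ~= cast(const(ubyte)[])"aa"; // append some data
    writeln(buf.data.length);        // 2

    buf.clear();                     // length drops to 0, but the
                                     // allocation is kept for reuse
    buf ~= cast(const(ubyte)[])"bb";
    writeln(cast(string) buf.data);  // bb
}
```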
December 17, 2020
On Thursday, 17 December 2020 at 16:46:47 UTC, Q. Schroll wrote:
> On Thursday, 17 December 2020 at 16:11:37 UTC, IGotD- wrote:
>> It's common using arrays for buffering
>
> Outside of CTFE, use an Appender.¹ Unless you're having a const/immutable element type, Appender can shrink and reuse space.² If you have, reallocation is necessary anyway not to break const/immutable' guarantees.
>
> ¹ https://dlang.org/phobos/std_array.html#appender
> ² https://dlang.org/phobos/std_array.html#.Appender.clear

Thank you, I will try this one out.
December 17, 2020
On 12/17/20 8:11 AM, IGotD- wrote:

> It's common using arrays for buffering, that means constantly adding
> elements and empty the elements.

I show an example of this at the following point in a DConf presentation:

  https://youtu.be/dRORNQIB2wA?t=791

The following code:

  int[] outer;

  while (a) {
    int[] inner;

    while (b) {
      inner ~= e;
    }

    outer ~= bar(inner);
  }

Can be turned into this:

  // Remember: These are thread-local
  static Appender!(int[]) outer;
  static Appender!(int[]) inner;

  outer.clear();    // Clear state from last call

  while (a) {
    inner.clear();  // Clear state from last iteration

    while (b) {
      inner ~= e;
    }

    outer ~= bar(inner.data);
  }

There is also assumeUnique(), which does not appear in the presentation but may be useful in some cases.

Ali

December 17, 2020
On Thursday, 17 December 2020 at 16:11:37 UTC, IGotD- wrote:
> It's common using arrays for buffering, that means constantly adding elements and empty the elements. I have seen that when the number of elements is zero, the array implementation deallocates the array which is shown with capacity is zero. This of course leads to constant allocation and deallocation of the array.
>
> One workaround is just to allocate the array once, then keep track of the position yourself rather than shrinking the array so that the length field always track the number of elements. This is possible but if you want dynamic increasing capacity you have track that yourself too.
>
> However, is there a way to tell the array not deallocate the array and just increasing the array when necessary.

This isn’t correct. Can you post the code that led you to believe this?

-Steve
December 17, 2020
On Thursday, 17 December 2020 at 17:46:59 UTC, Steven Schveighoffer wrote:
>
> This isn’t correct. Can you post the code that led you to believe this?
>
> -Steve

Sure.


import std.algorithm;
import std.typecons;
import std.stdio;


struct Buffer
{
	this(size_t size)
	{
		m_buffer.reserve(size);
	}


	void add(const void[] arr)
	{
		m_buffer ~= cast(ubyte[])arr;
	}


	string getSome()
	{
		if(m_buffer.length > 0)
		{
			return cast(string)m_buffer[0..$];
		}
		else
		{
			return "";
		}
	}

	void remove(size_t size)
	{
		m_buffer = m_buffer.remove(tuple(0, size));
	}

	ubyte[] m_buffer;
}

void main()
{
	Buffer b = Buffer(16);
	
	b.add("aa");

	writeln("b.m_buffer.length ", b.m_buffer.length, ", b.m_buffer.capacity ", b.m_buffer.capacity);

	string s = b.getSome();

	assert(s == "aa");

	b.remove(s.length);

	writeln("b.m_buffer.length ", b.m_buffer.length, ", b.m_buffer.capacity ", b.m_buffer.capacity);
}

This will print

b.m_buffer.length 2, b.m_buffer.capacity 31
b.m_buffer.length 0, b.m_buffer.capacity 0

A capacity of 0 suggests that the array has been deallocated.
December 17, 2020
On Thu, Dec 17, 2020 at 06:10:25PM +0000, IGotD- via Digitalmars-d-learn wrote: [...]
> b.m_buffer.length 2, b.m_buffer.capacity 31
> b.m_buffer.length 0, b.m_buffer.capacity 0
> 
> capacity 0, suggests that the array has been deallocated.

Are you sure?

My understanding is that capacity is always set to 0 when you shrink an array, in order to force reallocation when you append a new element. The reason is this:

	int[] data = [ 1, 2, 3, 4, 5 ];
	int[] slice = data[0 .. 4];

	writeln(slice.capacity); // 0
	writeln(data.capacity);	 // 7  <--- N.B.
	slice ~= 10;

	writeln(slice); // [1, 2, 3, 4, 10]
	writeln(data);	// [1, 2, 3, 4, 5]

Notice that slice.capacity is 0, but data.capacity is *not* 0, even after taking the slice.  Meaning the array was *not* deallocated.

Why is slice.capacity set to 0?  So that when you append 10 to it, it does not overwrite the original array (cf. last line of code above), but instead allocates a new array and appends to that.  This is the default behaviour because it's the least surprising -- you don't want people to be able to modify elements of your array outside the slice you've handed to them. If they want to append, they get a copy of the data instead.

In order to suppress this behaviour, use .assumeSafeAppend.
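Continuing the slice example above, a sketch of what .assumeSafeAppend changes (note that it lets the append overwrite the element past the slice, so only use it when nothing else references that data):

```d
import std.stdio;

void main()
{
    int[] data = [1, 2, 3, 4, 5];
    int[] slice = data[0 .. 4];

    slice.assumeSafeAppend(); // promise: nobody needs data[4] anymore
    writeln(slice.capacity);  // now non-zero

    slice ~= 10;              // appends in place...
    writeln(data);            // [1, 2, 3, 4, 10] -- data[4] overwritten
}
```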


T

-- 
If you're not part of the solution, you're part of the precipitate.
December 17, 2020
On Thursday, 17 December 2020 at 18:42:54 UTC, H. S. Teoh wrote:
>
> Are you sure?
>
> My understanding is that capacity is always set to 0 when you shrink an array, in order to force reallocation when you append a new element. The reason is this:
>
> 	int[] data = [ 1, 2, 3, 4, 5 ];
> 	int[] slice = data[0 .. 4];
>
> 	writeln(slice.capacity); // 0
> 	writeln(data.capacity);	 // 7  <--- N.B.
> 	slice ~= 10;
>
> 	writeln(slice); // [1, 2, 3, 4, 10]
> 	writeln(data);	// [1, 2, 3, 4, 5]
>
> Notice that slice.capacity is 0, but data.capacity is *not* 0, even after taking the slice.  Meaning the array was *not* deallocated.
>
> Why is slice.capacity set to 0?  So that when you append 10 to it, it does not overwrite the original array (cf. last line of code above), but instead allocates a new array and appends to that.  This is the default behaviour because it's the least surprising -- you don't want people to be able to modify elements of your array outside the slice you've handed to them. If they want to append, they get a copy of the data instead.
>
> In order to suppress this behaviour, use .assumeSafeAppend.
>
>
> T

How does this connect to the array with zero elements? According to your explanation, if I have understood it correctly, capacity also indicates whether the pointer has been "borrowed" from another array, forcing an allocation whenever the array is modified. However, if capacity is zero when the array length is zero, then you would get an allocation as well, regardless of whether there was a previously allocated array.

December 17, 2020
On 12/17/20 8:48 AM, Ali Çehreli wrote:

> There is also assumeUnique()

I meant assumeSafeAppend().

Ali


December 17, 2020
On Thu, Dec 17, 2020 at 06:57:08PM +0000, IGotD- via Digitalmars-d-learn wrote: [...]
> How does this connect to the array with zero elements? According to your explanation if I have understood it correctly, capacity is also indicating if the pointer has been "borrowed" from another array forcing an allocation whenever the array is modified. However, if capacity is zero when the array length is zero, then you would get a allocation as well, regardless if there was a previous allocated array.

Technically if the array/slice has zero length and zero capacity, it doesn't have any allocated memory associated with it, so it makes sense to allocate when you next append to it.  If .ptr is set, though, you could use .assumeSafeAppend and it should reuse whatever .ptr is pointing to.

This isn't any different from the slice case of non-zero length. If I hand you a zero-length slice of my array, I'd be unhappy if you could use that zero-length slice to modify the other elements in my array. So it seems logical to set capacity to 0 when array length is reduced to zero.  When this is not desired, .assumeSafeAppend is the ticket.
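A sketch of that zero-length case, which is essentially the answer to the original question: setting the length to zero makes capacity read as 0, but .assumeSafeAppend reclaims the old allocation so the next append reuses it instead of reallocating.

```d
import std.stdio;

void main()
{
    ubyte[] buf = new ubyte[](16);
    auto oldPtr = buf.ptr;

    buf.length = 0;         // "empty" the buffer; capacity now reads 0
    buf.assumeSafeAppend(); // reclaim the allocation for appending

    buf ~= 42;              // reuses the old block -- no reallocation
    assert(buf.ptr == oldPtr);
    writeln(buf);           // [42]
}
```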


On a side note, though, I find this idiosyncratic behaviour annoying when all I really want is to use an array as, e.g., the backing for a stack. For those cases, I ignore array capacity and keep a slice over the entire allocated storage, including elements that have been erased, along with a separate index that represents the logical end-of-array. While .assumeSafeAppend does work well, it is a druntime function call, so it comes with a slight performance hit.
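The stack idiom described above might look like this (illustrative sketch, not code from the thread): the slice always covers the whole allocation, `len` marks the logical top, and popping never shrinks the slice, so no capacity checks or assumeSafeAppend calls are involved.

```d
import std.stdio;

// Hypothetical stack backed by a plain array: `len` is the logical
// end-of-stack; popping only decrements the index, so the storage
// is never released or reallocated on removal.
struct Stack(T)
{
    T[] storage;
    size_t len;

    void push(T value)
    {
        if (len == storage.length)
            storage.length = storage.length ? storage.length * 2 : 4;
        storage[len++] = value;
    }

    T pop() { return storage[--len]; }

    bool empty() const { return len == 0; }
}

void main()
{
    Stack!int s;
    s.push(1);
    s.push(2);
    writeln(s.pop()); // 2
    writeln(s.pop()); // 1
}
```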


T

-- 
I am a consultant. My job is to make your job redundant. -- Mr Tom