Thread overview
Two "references" to dynamic array, why not change both when reallocating?
Nov 11, 2020
zack
Nov 11, 2020
Mike Parker
Nov 11, 2020
Simen Kjærås
Nov 11, 2020
zack
November 11, 2020
I am new to D. Appending to an array can lead to reallocation, that's clear. But why is the "reference" b not updated to the new location, so that it still points to the "old" memory? Why isn't b changed as well when array a is reallocated and the old data becomes invalid/freed?

auto a = [55,10,20];
auto b = a;
a ~= [99,99,99,99,99,99];
a[0] = 1;
assert(b[0] == 1); // could fail

(similar to p.103-104 in "The D Programming language")
November 11, 2020
On Wednesday, 11 November 2020 at 10:17:09 UTC, zack wrote:
> I am new to D. Appending to an array can lead to reallocation, that's clear. But why is the "reference" b not updated to the new location, so that it still points to the "old" memory? Why isn't b changed as well when array a is reallocated and the old data becomes invalid/freed?
>
> auto a = [55,10,20];
> auto b = a;
> a ~= [99,99,99,99,99,99];
> a[0] = 1;
> assert(b[0] == 1); // could fail
>
> (similar to p.103-104 in "The D Programming language")

`b` is not a "reference" to `a`. Consider that under the hood, an array is a pointer/length pair. `b = a` means that `b` is initialized such that `b.length == a.length` and `b.ptr == a.ptr`. Now when you append to `a`, one of two things can happen:

1. there's no reallocation, in which case `a.length != b.length` and `a.ptr == b.ptr`.
2. there's a reallocation, in which case `a.length != b.length` and `a.ptr != b.ptr`.

And when you append to `b`, it can also reallocate and `a` will not be affected.
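
To make the two cases concrete, here is a minimal sketch. The function name `appendAndCheckSharing` is made up for illustration, and which case you actually hit depends on how much spare room the runtime gave the original block, so the code checks both outcomes rather than assuming one.

```d
import std.stdio : writeln;

// Hypothetical helper: appends to `a` and reports whether `a` and `b`
// still point at the same memory afterwards.
bool appendAndCheckSharing()
{
    auto a = [55, 10, 20];
    auto b = a;   // copies the pointer/length pair, not the elements
    assert(b.ptr is a.ptr && b.length == a.length);

    a ~= [99, 99, 99, 99, 99, 99];
    assert(a.length == 9 && b.length == 3);   // b's length is untouched either way

    immutable sameBlock = a.ptr is b.ptr;

    a[0] = 1;
    // Case 1 (no reallocation): b sees the write.
    // Case 2 (reallocation): b still looks at the old block and reads 55.
    assert(b[0] == (sameBlock ? 1 : 55));
    return sameBlock;
}

void main()
{
    writeln(appendAndCheckSharing() ? "extended in place" : "reallocated");
}
```

Either way `b.length` stays 3, which is why the `assert(b[0] == 1)` in the original snippet "could fail": it only holds in the first case.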

November 11, 2020
On Wednesday, 11 November 2020 at 10:17:09 UTC, zack wrote:
> I am new to D. Appending to an array can lead to reallocation, that's clear. But why is the "reference" b not updated to the new location, so that it still points to the "old" memory? Why isn't b changed as well when array a is reallocated and the old data becomes invalid/freed?
>
> auto a = [55,10,20];
> auto b = a;
> a ~= [99,99,99,99,99,99];
> a[0] = 1;
> assert(b[0] == 1); // could fail
>
> (similar to p.103-104 in "The D Programming language")

The short answer is 'because that's how we've chosen to define it'. A more involved answer is that changing every reference is prohibitively expensive - it would require the equivalent of a GC collection on every reallocation, as references to the array could exist anywhere in the program, be that on the stack, heap, even on other threads. That's the performance side of it.

Another weighty argument is 'can you make it behave the other way?' Currently, that's simple: use a pointer to a slice (T[]*) and share that around. I don't see how I would get the current behavior if reallocation caused reassignment of all references (though admittedly I haven't thought too much about it).
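
A minimal sketch of the pointer-to-slice approach, assuming both names live in the same scope. The function name `appendViaSharedSlice` is just for illustration.

```d
import std.stdio : writeln;

// Hypothetical helper: appends through `a` and reads through a pointer
// to the very same slice variable.
int[] appendViaSharedSlice()
{
    int[] a = [55, 10, 20];
    int[]* b = &a;   // b points at the slice variable itself, not at its elements

    a ~= [99, 99, 99, 99, 99, 99];   // may reallocate; `a` itself is updated
    a[0] = 1;

    assert((*b).length == 9);    // b sees the new length
    assert((*b)[0] == 1);        // and the new contents, reallocation or not
    assert((*b).ptr is a.ptr);
    return *b;                   // copies the pointer/length pair out
}

void main()
{
    writeln(appendViaSharedSlice());
}
```

The usual caveat applies: a pointer to a local slice must not outlive that local. For longer-lived sharing, the slice could instead be a field of a class or of a heap-allocated struct.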

Next up: does b, ending on the same element as a, refer to the whole length of a (i.e. should b's length be reassigned when a is reallocated?), or is it a slice only referencing the first three elements? Either choice is going to be unexpected in some cases.
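
The ambiguity shows up directly in code. In this small sketch (the function name `copyEqualsSubslice` is made up), a plain copy and an explicit slice of the first three elements end up as the same pointer/length pair, so the runtime has no way to tell which one was "meant" to follow `a`.

```d
import std.stdio : writeln;

// Hypothetical helper: returns true if a plain copy of the slice and an
// explicit sub-slice are indistinguishable.
bool copyEqualsSubslice()
{
    auto a = [55, 10, 20];
    auto b = a;           // intent: "the whole array"
    auto c = a[0 .. 3];   // intent: "just the first three elements"

    // Same pointer, same length; nothing records the difference in intent.
    return b.ptr is c.ptr && b.length == c.length && b is c;
}

void main()
{
    writeln(copyEqualsSubslice());
}
```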

All in all, there are many reasons to choose the behavior D has chosen. There are some drawbacks, but I feel it's the right choice.

--
  Simen
November 11, 2020
Alright, thanks for sharing these thoughts and arguments!
November 11, 2020
On Wednesday, 11 November 2020 at 13:30:16 UTC, Simen Kjærås wrote:
> The short answer is 'because that's how we've chosen to define it'. A more involved answer is that changing every reference is prohibitively expensive - it would require the equivalent of a GC collection on every reallocation, as references to the array could exist anywhere in the program, be that on the stack, heap, even on other threads. That's the performance side of it.

No... Not true. But either way, D and Golang only have simple windows onto memory rather than dynamic array ADTs. That is bad for correctness, static analysis and ownership modelling. A simple choice, plausible for libraries, but bad for application code. If D is going to support non-GC code well, it has to change this. C++ got this right btw, where slices can only decrease in size.

Then again, extending arrays is generally a bad idea for performance in all languages... So try to avoid increasing size after initial building.
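
If the final size is known up front, one way to avoid repeated growth is to reserve the space before building. This is only a sketch: the function name `buildSquares` is invented for illustration, and it relies on the built-in `reserve`, which returns the resulting capacity.

```d
import std.stdio : writeln;

// Hypothetical helper: builds an array of n squares after reserving room once.
int[] buildSquares(size_t n)
{
    int[] a;
    immutable cap = a.reserve(n);   // ask for room for n elements up front
    assert(cap >= n);               // the runtime may hand back more than requested

    foreach (i; 0 .. n)
        a ~= cast(int)(i * i);      // these appends fit in the reserved capacity
    return a;
}

void main()
{
    writeln(buildSquares(5));
}
```

Any slice taken after the `reserve` call keeps aliasing the same block for as long as `a` only grows within that capacity.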