Two "references" to dynamic array, why not change both when reallocating? - D Programming Language Discussion Forum

November 11, 2020

Posted by zack

I am new to D. Appending to an array can lead to reallocation, that's clear. But why is the "reference" b not updated to the new location, so that it still points to the "old" memory? Why is b not also changed when array a is reallocated and the old data becomes invalid or is freed?

    auto a = [55,10,20];
    auto b = a;
    a ~= [99,99,99,99,99,99];
    a[0] = 1;
    assert(b[0] == 1); // could fail

(similar to p. 103-104 in "The D Programming Language")

November 11, 2020

Posted by Mike Parker
in reply to zack

On Wednesday, 11 November 2020 at 10:17:09 UTC, zack wrote:
> I am new to D. Appending to an array can lead to reallocation, that's clear. But why is the "reference" b not changed accordingly to the new position and still points to "old" memory? Why is b not also changed when reallocating array a and the old data getting invalid/freed?
>
> auto a = [55,10,20];
> auto b = a;
> a ~= [99,99,99,99,99,99];
> a[0] = 1;
> assert(b[0] == 1); // could fail
>
> (similar to p.103-104 in "The D Programming language")

`b` is not a "reference" to `a`. Consider that under the hood, an array is a pointer/length pair. `b = a` means that `b` is initialized such that `b.length == a.length` and `b.ptr == a.ptr`. Now when you append to `a`, one of two things can happen:

1. there's no reallocation (the append happens in place), in which case `a.length != b.length` and `a.ptr == b.ptr`.
2. there's a reallocation, in which case `a.length != b.length` and `a.ptr != b.ptr`.

And when you append to `b`, it can also reallocate and `a` will not be affected.
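
As a minimal sketch of those two cases (the helper name `stillShared` is made up for illustration, and which case you hit depends on the runtime's allocation details):

```d
/// Appends `extra` to `a`, then reports whether `a` still starts at the
/// same address as `b`. Hypothetical helper, for illustration only.
bool stillShared(ref int[] a, const int[] b, const int[] extra)
{
    a ~= extra;               // may extend in place or reallocate
    return a.ptr is b.ptr;    // true in case 1, false in case 2
}

void main()
{
    auto a = [55, 10, 20];
    auto b = a;               // copies the pointer/length pair, not the elements
    assert(b.ptr is a.ptr && b.length == a.length);

    const sameBlock = stillShared(a, b, [99, 99, 99, 99, 99, 99]);

    // The lengths differ in both cases: only `a` grew.
    assert(a.length == 9 && b.length == 3);

    // Whether a write through `a` is visible through `b` depends on the case.
    a[0] = 1;
    assert(b[0] == (sameBlock ? 1 : 55));
}
```

The last assert is written to hold in either case, which is exactly why the original `assert(b[0] == 1)` "could fail".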

November 11, 2020

Posted by Simen Kjærås
in reply to zack

On Wednesday, 11 November 2020 at 10:17:09 UTC, zack wrote:
> I am new to D. Appending to an array can lead to reallocation, that's clear. But why is the "reference" b not changed accordingly to the new position and still points to "old" memory? Why is b not also changed when reallocating array a and the old data getting invalid/freed?
>
> auto a = [55,10,20];
> auto b = a;
> a ~= [99,99,99,99,99,99];
> a[0] = 1;
> assert(b[0] == 1); // could fail
>
> (similar to p.103-104 in "The D Programming language")

The short answer is 'because that's how we've chosen to define it'. A more involved answer is that changing every reference is prohibitively expensive - it would require the equivalent of a GC collection on every reallocation, as references to the array could exist anywhere in the program, be that on the stack, heap, even on other threads. That's the performance side of it.

Another weighty argument is 'can you make it behave the other way?' Currently, that's simple: use a pointer to a slice (`T[]*`) and share that around. I don't see how I would get the current behavior if reallocation caused reassignment of all references (though admittedly I haven't thought too much about it).
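
A rough sketch of the pointer-to-slice approach, assuming plain `@system` code where the pointed-to slice variable outlives the pointer (`appendShared` is a made-up name):

```d
/// Appends through a pointer to a slice, so every holder of that pointer
/// sees the new pointer/length pair. Hypothetical helper for illustration.
void appendShared(int[]* handle, const int[] extra)
{
    *handle ~= extra;   // updates the one slice variable that everyone shares
}

void main()
{
    auto a = [55, 10, 20];
    int[]* b = &a;      // b points at the variable `a`, not at a's elements

    appendShared(b, [99, 99, 99, 99, 99, 99]);
    a[0] = 1;

    // b always sees the current state of a, reallocation or not.
    assert((*b).length == 9);
    assert((*b)[0] == 1);
}
```

Here there is only one pointer/length pair, so a reallocation cannot leave a second copy behind.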

Next up: does b, ending on the same element as a, refer to the whole length of a (i.e. should b's length be reassigned when a is reallocated?), or is it a slice only referencing the first three elements? Either choice is going to be unexpected in some cases.
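
To illustrate that ambiguity, here is a small sketch (the name `sameView` is hypothetical) showing that a copy of the whole array and a slice of its first three elements are the same value as far as the language is concerned:

```d
/// Returns true when both slices have the same pointer and the same length.
/// Hypothetical helper for illustration.
bool sameView(const int[] x, const int[] y)
{
    return x is y;   // for slices, `is` compares pointer and length
}

void main()
{
    auto a = [55, 10, 20];
    auto whole = a;               // "the whole array"
    auto firstThree = a[0 .. 3];  // "just the first three elements"

    // Nothing distinguishes the two intentions.
    assert(sameView(whole, firstThree));

    // A shorter slice is a different value.
    assert(!sameView(whole, a[0 .. 2]));
}
```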

All in all, there are many reasons to choose the behavior D has chosen. There are some drawbacks, but I feel it's the right choice.

--
  Simen

November 11, 2020

Posted by zack
in reply to Simen Kjærås

Alright, thanks for sharing these thoughts and arguments!

November 11, 2020

Posted by Ola Fosheim Grøstad
in reply to Simen Kjærås

On Wednesday, 11 November 2020 at 13:30:16 UTC, Simen Kjærås wrote:
> The short answer is 'because that's how we've chosen to define it'. A more involved answer is that changing every reference is prohibitively expensive - it would require the equivalent of a GC collection on every reallocation, as references to the array could exist anywhere in the program, be that on the stack, heap, even on other threads. That's the performance side of it.

No... Not true. But either way, D and Golang only have simple windows onto memory rather than dynamic array ADTs. That is bad for correctness, static analysis and ownership modelling. A simple choice, plausible for libraries, but bad for application code. If D is going to support non-GC code well, it has to change this. C++ got this right btw, where slices can only decrease in size.

Then again, extending arrays is generally a bad idea for performance in all languages... So try to avoid increasing size after initial building.
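
One way to follow that advice in D, sketched under the assumption that the final size is known up front, is to call the built-in `reserve` once before the build loop (`buildSquares` is a made-up example function):

```d
/// Builds the first `n` squares, reserving capacity once before appending.
/// Hypothetical example function.
int[] buildSquares(size_t n)
{
    int[] result;
    result.reserve(n);              // request room for n elements up front
    assert(result.capacity >= n);   // the runtime may grant more than requested

    foreach (i; 0 .. n)
        result ~= cast(int)(i * i); // appends fit within the reserved capacity

    return result;
}

void main()
{
    auto squares = buildSquares(5);
    assert(squares == [0, 1, 4, 9, 16]);
}
```

After the initial build, treating the array as fixed-size avoids the aliasing surprise from the original question.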
