Jump to page: 1 212  
Page
Thread overview
DIP 1025--Dynamic Arrays Only Shrink, Never Grow--Community Review Round 1
Nov 11
bachmeier
Nov 11
Dennis
Nov 11
Exil
Nov 11
Dennis
Nov 11
Dukc
Nov 11
Dukc
Nov 11
ixid
Nov 11
jmh530
Nov 11
jmh530
Nov 11
bachmeier
Nov 11
IGotD-
Nov 11
Dennis
Nov 11
IGotD-
Nov 11
jmh530
Nov 12
Jab
Nov 12
Jab
Nov 12
jmh530
Nov 11
Uknown
Nov 11
Uknown
Nov 11
Uknown
Nov 11
Meta
Nov 12
Meta
Nov 11
Basile B.
Nov 11
Basile B.
Nov 12
Suleyman
Nov 12
IGotD-
Nov 12
Tim
Nov 12
Dukc
Nov 12
Jab
Nov 12
bachmeier
Nov 12
jmh530
Nov 12
jmh530
Nov 12
Jab
November 11
This is the feedback thread for the first round of Community Review for DIP 1025, "Dynamic Arrays Only Shrink, Never Grow":

https://github.com/dlang/DIPs/blob/1b525ec4c914c06bc286c1a6dc93bf1533ee56e4/DIPs/DIP1025.md

All review-related feedback on and discussion of the DIP should occur in this thread. The review period will end at 11:59 PM ET on November 25, or when I make a post declaring it complete.

At the end of Round 1, if further review is deemed necessary, the DIP will be scheduled for another round of Community Review. Otherwise, it will be queued for the Final Review and Formal Assessment.

Anyone intending to post feedback in this thread is expected to be familiar with the reviewer guidelines:

https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md

*Please stay on topic!*

Thanks in advance to all who participate.
November 11
On Monday, November 11, 2019 3:27:26 AM MST Mike Parker via Digitalmars-d wrote:
> This is the feedback thread for the first round of Community Review for DIP 1025, "Dynamic Arrays Only Shrink, Never Grow":
>
> https://github.com/dlang/DIPs/blob/1b525ec4c914c06bc286c1a6dc93bf1533ee56e 4/DIPs/DIP1025.md
>
> All review-related feedback on and discussion of the DIP should occur in this thread. The review period will end at 11:59 PM ET on November 25, or when I make a post declaring it complete.
>
> At the end of Round 1, if further review is deemed necessary, the DIP will be scheduled for another round of Community Review. Otherwise, it will be queued for the Final Review and Formal Assessment.
>
> Anyone intending to post feedback in this thread is expected to be familiar with the reviewer guidelines:
>
> https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md
>
> *Please stay on topic!*
>
> Thanks in advance to all who participate.

Ouch. This would be a _huge_ breaking change. The ability to grow dynamic arrays is used all over the place. It's also one of the great benefits of D's dynamic array IMHO. The fact that you append to a dynamic array without caring where it came from can be incredibly useful, and if you really care about checking whether it has the capacity to grow or whether it's managed by the GC, we have functions to check for that.

How would this interact with stuff like assumeSafeAppend? There are code bases that take advantage of the combination of assumeSafeAppend so that they don't have to worry about managing allocations (since ~= takes care of it all), but they can still reuse the dynamic array and thus avoid reallocations if the GC-managed memory block that the dynamic array is a slice of is large enough to hold whatever data is appended. It looks to me like change would break the code in all such code bases.

As for Appender, it's designed around the idea that you're going to build an array and then use it rather than you're going to grow it or shrink it as needed, so it doesn't work anywhere near as well when dealing with an array shrinking, let alone growing and shrinking. It also does not work for the use case where you're passing dynamic arrays around and then appending to them, since then you don't have access to the Appender, just the dynamic array. Also, the ability to append to a dynamic array without worrying about whether it allocates or not such that the array can grow in place when it can (thus avoiding unnecessary allocations) but still have reallocations occur when it needs to is something that isn't going to work with Appender. In my experience, code operating on strings in particular takes advantage of being able to slice and append to dynamic arrays with impunity - which works particularly well with immutable, since then you don't have to worry about any of the characters being mutated. Changing such code to use Appender or ~ would be far more complicated and is likely to cause a lot more unnecessary allocations to occur.

Also, I fail to see how getting rid of the ability to append and reallocate dynamic arrays really helps with managing memory when the memory that the dynamic array is a slice of is not allocated by the GC. Dynamic arrays aren't reference-counted. You can have slices throughout the code which all refer to the same memory even if they never do an append operation. All of the tools that you'd need to use in terms of scope or pure or coding conventions or whatever to know whether a slice of that memory exists anywhere (and thus whether it's safe to free it) would be the same whether it can be appended to or not. And if you want to disable the ability to append to a dynamic array, you can always just mark the code as @nogc. Why do programmers who _do_ want to use the GC in their code need to be punished by making dynamic array worse, when anyone who doesn't want to use the GC can already explicitly add restrictions forbidding it?

Honestly, this seems like a prime case for how trying to make D work better with code that doesn't want to use the GC is making D worse for code that does want to use the GC. I really, really, really hope that this change does not make it into the language. From where I sit, it would make the language hands down worse. In fact, as far as usability goes, it would make D's strings _worse_ than C++'s strings. Sure, we'd still be able to get substrings more efficiently than C++ can, but not being able to append to a string makes it a pretty terrible string. At least C++'s strings can do that efficiently.

IMHO, where D's dynamic arrays and strings sit right now as far as memory allocation goes is the sweet spot. They're efficient at both growing and shrinking. If I wanted a language that was catering to the non-GC case over the GC case, I'd use C++. Improvements to D that make it work better with @nogc code without compromising code that uses the GC seems like a good idea to me, but this very much compromises the language for code that uses the GC. _Please_ don't do this.

- Jonathan M Davis



November 11
On Monday, 11 November 2019 at 10:27:26 UTC, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1025, "Dynamic Arrays Only Shrink, Never Grow":

The rationale is severely lacking:

Any issue related to the GC is completely orthogonal to any issues with memory safety.
The first example is buggy, but is @system because it uses free, and thus the onus is on the user.
The second example is well defined because ~= copies if the capacity is exceeded[1].


The prior work section lists no references and does not present any logical connection to the ideas presented in the rationale.


The breaking changes are massive and not well argued for.
The suggested workaround of

1) a = a ~ b;

in place of

2) a ~= b;

is functional identical, thus any issues present in 2 are also present in 1. The suggestion that doing so in a loop would create more garbage is false as it is functionally identical.



The author should revisit the core assumptions made and rewrite the DIP accordingly with references for prior art. If the issue in example one is considered bad enough to warrant doing something about then that issue should be tacked specifically, starting perhaps with a druntime / GC option to later statistics/log when an unmanaged slice is appended to.

[1]:

void main()
{
    import std.stdio;
    enum { dead, alive }
    foreach (i;0 .. 10000)
    {
        int[] cat = new int[i+1];
        cat[i] = alive;
    	writeln(cat[i]);
        int[] b = cat;
        b ~= 1;
        b[i] = dead;
    	writeln(cat[i]);
        if (i % 1000)
        {
            import core.memory;
            GC.collect();
        }
    }
}
November 11
On Monday, 11 November 2019 at 10:27:26 UTC, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1025, "Dynamic Arrays Only Shrink, Never Grow":
>
> https://github.com/dlang/DIPs/blob/1b525ec4c914c06bc286c1a6dc93bf1533ee56e4/DIPs/DIP1025.md
>
> All review-related feedback on and discussion of the DIP should occur in this thread. The review period will end at 11:59 PM ET on November 25, or when I make a post declaring it complete.
>
> At the end of Round 1, if further review is deemed necessary, the DIP will be scheduled for another round of Community Review. Otherwise, it will be queued for the Final Review and Formal Assessment.
>
> Anyone intending to post feedback in this thread is expected to be familiar with the reviewer guidelines:
>
> https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md
>
> *Please stay on topic!*
>
> Thanks in advance to all who participate.

In my homebrew D-lite language, I used T[~] as the built-in type for "an array that carries capacity information, owns memory (T[auto~] but nevermind that) and can be appended to." This seemed to work well. I definitely think that if we deprecate ~, such a type should be built-in, because this is vital functionality. appender is just too weird a type (it's a weird class-struct hybrid??) to carry this functionality.

Simple things must be easy. We don't have to import std.array for T[], we shouldn't have to import it for T[~].

November 11
On Monday, 11 November 2019 at 11:17:26 UTC, Jonathan M Davis wrote:

> Honestly, this seems like a prime case for how trying to make D work better with code that doesn't want to use the GC is making D worse for code that does want to use the GC. I really, really, really hope that this change does not make it into the language. From where I sit, it would make the language hands down worse. In fact, as far as usability goes, it would make D's strings _worse_ than C++'s strings. Sure, we'd still be able to get substrings more efficiently than C++ can, but not being able to append to a string makes it a pretty terrible string. At least C++'s strings can do that efficiently.
>
> IMHO, where D's dynamic arrays and strings sit right now as far as memory allocation goes is the sweet spot. They're efficient at both growing and shrinking. If I wanted a language that was catering to the non-GC case over the GC case, I'd use C++. Improvements to D that make it work better with @nogc code without compromising code that uses the GC seems like a good idea to me, but this very much compromises the language for code that uses the GC. _Please_ don't do this.

I'll move on from D if this change is incorporated. It's a terrible idea that would break code that shouldn't be broken. It would also be a sign that D is effectively giving up on GC in an attempt to compete with Rust, and as such is no longer suited for general purpose programming.

November 11
I'll repeat what I said in the draft review, because I don't know whether my feedback wasn't noticed. In addition, I'll be more explicit.

In no practical way we can implement this. The breakage would most likely dwarf breaking autodecoding.

However, the above only applies to breaking appending to slices in general. I agree that if the slice is not either:

1. GC-allocated
2. referring to a static array.
3. referring to an initialization value of an array of `static` storage class

...then the chances of somebody appending to it are low enough that breaking might be viable. Even then, somebody might append to a mallocated slice by GC if there is another reference to the original slice lying around, but for me that sounds rare enough that finding memory leaks could justify the breakage.

I see two alternatives. We can simply discourage growing slices with the garbage collector. A transition switch would be introduced, but no removing `~` or `slice.length += 2` totally, or at least not for many years. We also could introduce a new appending operator, say >< (along with ><=), which behaves just like ~ but refuses to work with slices, and encourage to use that istead of the present append operator.

The other alternative is that slice growing by GC would check where the array is referring to. First, if it refers to null, the static memory area or the stack, check passed. Second, if it refers to anything the GC controls, check passed. Otherwise, it's assumed it's pointing to mallocated memory, or just corrupt - program terminated with an unrecoverable error. For @system code, the compiler could forgo these checks with the same rules as array bounds checks can be forgoed. Growing a slice of mallocated memory where the check is omitted would be undefined behaviour.


November 11
On Monday, 11 November 2019 at 10:27:26 UTC, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1025, "Dynamic Arrays Only Shrink, Never Grow":

This will make for stronger typing and is therefore a-good-thing.

November 11
On Monday, November 11, 2019 4:33:00 AM MST Dukc via Digitalmars-d wrote:
> I'll repeat what I said in the draft review, because I don't know whether my feedback wasn't noticed. In addition, I'll be more explicit.
>
> In no practical way we can implement this. The breakage would most likely dwarf breaking autodecoding.
>
> However, the above only applies to breaking appending to slices in general. I agree that if the slice is not either:
>
> 1. GC-allocated
> 2. referring to a static array.
> 3. referring to an initialization value of an array of `static`
> storage class
>
> ...then the chances of somebody appending to it are low enough that breaking might be viable. Even then, somebody might append to a mallocated slice by GC if there is another reference to the original slice lying around, but for me that sounds rare enough that finding memory leaks could justify the breakage.
>
> I see two alternatives. We can simply discourage growing slices with the garbage collector. A transition switch would be introduced, but no removing `~` or `slice.length += 2` totally, or at least not for many years. We also could introduce a new appending operator, say >< (along with ><=), which behaves just like ~ but refuses to work with slices, and encourage to use that istead of the present append operator.
>
> The other alternative is that slice growing by GC would check where the array is referring to. First, if it refers to null, the static memory area or the stack, check passed. Second, if it refers to anything the GC controls, check passed. Otherwise, it's assumed it's pointing to mallocated memory, or just corrupt - program terminated with an unrecoverable error. For @system code, the compiler could forgo these checks with the same rules as array bounds checks can be forgoed. Growing a slice of mallocated memory where the check is omitted would be undefined behaviour.

IMHO, the ability to append to a dynamic array without caring whether it's backed by GC-allocated memory, malloc-ed memory, stack allocated memory, or whatever is a huge benefit. Right now, you can pass a dynamic array that isn't backed by the GC to a function, and it all just works even if the function needs to do a reallocation to make the array large enough. Doing so is not necessarily a bug and allows you to use non-GC allocated memory where appropriate while still being able to grow the array, and it does so without the code having to care about what kind of memory is currently backing the dynamic array. And if the code _does_ care for some reason, the array's capacity property will tell you whether the array can be grown in-place, and multiple functions in core.memory can be used to determine whether the dynamic array is a slice of GC-allocated memory.

- Jonathan M Davis



November 11
On Monday, 11 November 2019 at 10:27:26 UTC, Mike Parker wrote:
> This is the feedback thread for the first round of Community Review for DIP 1025, "Dynamic Arrays Only Shrink, Never Grow":

«The author is unaware of any slices in other languages that don't also carry a .capacity property to keep track of how much the slice can grow before it must be reallocated.»

C++20 span:

https://en.cppreference.com/w/cpp/container/span

Maybe you can achieve binary compatibility, somehow.

November 11
On Monday, 11 November 2019 at 11:51:09 UTC, Jonathan M Davis wrote:
>
> IMHO, the ability to append to a dynamic array without caring whether it's backed by GC-allocated memory, malloc-ed memory, stack allocated memory, or whatever is a huge benefit.

Yeah... And I definitely think all of these should continue to exist with the possible exception of malloced memory.

> Right now, you can pass a dynamic array that isn't backed by the GC to a function, and it all just works even if the function needs to do a reallocation to make the array large enough. Doing so is not necessarily a bug and allows you to use non-GC allocated memory where appropriate while still being able to grow the array, and it does so without the code having to care about what kind of memory is currently backing the dynamic array.

But chances are that if you grow a malloced slice bu GC, you cause a memory leak, and that's what Walter is trying to solve. On the other hand, there should be a way for the caller to explicitly allow a function to grow a GC-allocated slice, without touching the function, and my idea does not solve that. I dunno...


« First   ‹ Prev
1 2 3 4 5 6 7 8 9 10 11