August 09, 2009 T[new] | ||||
---|---|---|---|---|
| ||||
D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array; The new expression: new T[10] will return a T[new]. T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only. Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity; The usual array operations will work on T[new] as well as T[]. Doing this change will: 1. fix many nasties at the edges of array semantics 2. make arrays implementable on .net 3. make clear in function signatures if the function can resize the array or not |
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Yay. What will happen with slices over a T[new] when the underlying T[new] must be moved due to resizing? The behavior needs to be at least well specified, if not well defined.
Walter Bright wrote:
> D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old:
>
> T[] slice;
>
> syntax. Resizeable arrays will be declared as:
>
> T[new] array;
>
> The new expression:
>
> new T[10]
>
> will return a T[new].
>
> T[new] will implicitly convert to T[], but not the other way.
>
> slice.length will become read-only.
>
> Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:
>
> size_t length;
> T* ptr;
> size_t capacity;
>
> The usual array operations will work on T[new] as well as T[].
>
> Doing this change will:
>
> 1. fix many nasties at the edges of array semantics
>
> 2. make arrays implementable on .net
>
> 3. make clear in function signatures if the function can resize the array or not
|
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | That's great. I think this would finally fix some of the more pressing issues of D. I have a question: will T[new] a; allocate something? Or will the allocation of the hidden library array object delayed until the length is set to non null? If the library array object is not allocated, will a.ptr, a.capacity and a.length simply return null? |
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brad Roberts | Brad Roberts wrote:
> Yay. What will happen with slices over a T[new] when the underlying T[new] must
> be moved due to resizing? The behavior needs to be at least well specified, if
> not well defined.
Great question! It's no different from what happens with any container when its contents change and there are references to the old content.
What won't happen is the slice won't be pointing to unallocated memory, thanks to the gc. Resizing the T[new] will not cause the old contents to be deleted.
The slice will either point to the old content, if the resize caused a move, or the new content, if it was resized in place. So it will still be implementation-defined behavior.
But the likelihood of getting caught by such behavior is much less likely with T[new], as one can clearly see in the code where the resizeables are, rather than the current situation where any slice could be resized by anyone.
I think that taking a slice of a T[new], then resizing the T[new], will be rare (or at least should be). Normally, a T[new] will be built, and when it is all done, it is converted to a T[] and there is no longer any way to resize it.
|
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to grauzone | grauzone wrote: > will > > T[new] a; > > allocate something? Nope. > Or will the allocation of the hidden library array object delayed until the length is set to non null? Yes. > If the library array object is not allocated, will a.ptr, a.capacity and a.length simply return null? Yes. Clearly, those properties will have to be functions under the hood. So T[new] operations will be a bit slower than for slices. For faster indexing, you'd probably want to do: auto slice = a[]; and then operate on the slice. |
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | By the way, T[new] is Andrei's idea, as well as it being his idea to make it a reference type. |
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright: I like this general proposal, it sounds like something that can improve D a little. There are many other things that have to be improved in D, but baby steps are enough to go somewhere :-) >Slices will retain the old: > T[] slice; > syntax. Resizeable arrays will be declared as: > T[new] array; Such syntaxes have to be chosen wisely. I don't fully understand that syntax. And maybe I don't fully like it. I think the default (simpler) syntax has to be the most flexible and safer construct, that is resizable arrays. Slices can be seen as an optimization, so they can have a bit longer syntax. > Under the hood, a T[new] will be a single pointer to a library defined > type. This library defined type will likely contain three properties: > size_t length; > T* ptr; > size_t capacity; Weren't you suggestion to use the start pointer - end pointer instead? > 2. make arrays implementable on .net I don't care of such thing. dotnet already has C# and C# is probably better than D, and it's similar anyway. So I don't think people will use D on dotnet. So even if creating a D for dotnet can be positive, I don't want D2 to change its design to allow a better implementation on dotnet. The question by Brad Roberts looks important, it has to find some kind of answer: >What will happen with slices over a T[new] when the underlying T[new] must be moved due to resizing?< Bye, bearophile |
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright:
> Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:
I have another question: are there some performance penalities in using such arrays for normal random access operations?
Bye,
bearophile
|
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote:
> Brad Roberts wrote:
>> Yay. What will happen with slices over a T[new] when the underlying
>> T[new] must
>> be moved due to resizing? The behavior needs to be at least well
>> specified, if
>> not well defined.
>
> Great question! It's no different from what happens with any container when its contents change and there are references to the old content.
>
> What won't happen is the slice won't be pointing to unallocated memory, thanks to the gc. Resizing the T[new] will not cause the old contents to be deleted.
>
> The slice will either point to the old content, if the resize caused a move, or the new content, if it was resized in place. So it will still be implementation-defined behavior.
>
> But the likelihood of getting caught by such behavior is much less likely with T[new], as one can clearly see in the code where the resizeables are, rather than the current situation where any slice could be resized by anyone.
>
> I think that taking a slice of a T[new], then resizing the T[new], will be rare (or at least should be). Normally, a T[new] will be built, and when it is all done, it is converted to a T[] and there is no longer any way to resize it.
As expected, but like I said.. this needs to be clear in the spec. Because something like this is just confusing (and yes, I know it's not new behavior):
auto a = new int[10];
a[0] = 1;
auto s1 = a[0..1];
a.length = <some value large enough to force a move>;
auto s2 = a[0..1];
s2[0] = 2;
assert(s1[0] == s2[1]); // fail
Later,
Brad
|
August 09, 2009 Re: T[new] | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | bearophile wrote:
> Walter Bright:
>> Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:
>
> I have another question: are there some performance penalities in using such arrays for normal random access operations?
One extra indirection, usually nearby.
Andrei
|
Copyright © 1999-2021 by the D Language Foundation