October 20, 2009
On 2009-10-18 17:05:39 -0400, Walter Bright <newshound1@digitalmars.com> said:

> The purpose of T[new] was to solve the problems T[] had with passing T[] to a function and then the function resizes the T[]. What happens with the original?
> 
> The solution we came up with was to create a third array type, T[new], which was a reference type.
> 
> Andrei had the idea that T[new] could be dispensed with by making a "builder" library type to handle creating arrays by doing things like appending, and then delivering a finished T[] type. This is similar to what std.outbuffer and std.array.Appender do, they just need a bit of refining.
> 
> The .length property of T[] would then become an rvalue only, not an lvalue, and ~= would no longer be allowed for T[].
> 
> We both feel that this would simplify D, make it more flexible, and remove some awkward corner cases like the inability to say a.length++.
> 
> What do you think?

I never liked T[new] much from the beginning, so good riddance. :-)

But seriously, disallowing '~=' but not '~'? I can already see the newbies:

Q: Why can't I write '~=' as a shortcut for '~' on a slice? It works fine for every other operator, everywhere else.
A: Because appending is not very efficient.
Q: So '~' is more efficient than '~='?
A: Hmm, no. They're both equally inefficient, but...

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

October 20, 2009
On 2009-10-19 10:14:02 -0400, dsimcha <dsimcha@yahoo.com> said:

> == Quote from downs (default_357-line@yahoo.de)'s article
>> Walter Bright wrote:
>>> 
>>> Probably not. But you can rewrite:
>>> 
>>> a ~= stuff;
>>> 
>>> as:
>>> 
>>> a = a ~ stuff;
>>> 
>>> to make it work.
>> Is there any reason the first can't be a short-hand for the second?
> 
> Devil's advocate because I somewhat agree with you:  a = a ~ stuff; is so
> inefficient that it should be ugly.

I'd call that a premature optimization of the programmer's behavior. :-)
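
For reference, a minimal sketch of what the longhand form does, assuming current D semantics in which binary ~ always allocates a fresh array. The helper name appendByCopy is made up for illustration:

```d
// Longhand append: build a new array and hand it back.
int[] appendByCopy(int[] a, int value)
{
    return a ~ value; // binary ~ never writes into a's memory
}

void main()
{
    int[] a = [1, 2, 3];
    int[] original = a;             // second reference to the same memory

    a = appendByCopy(a, 4);         // same effect as: a = a ~ 4;

    assert(a == [1, 2, 3, 4]);
    assert(original == [1, 2, 3]);  // the old data is untouched
    assert(a.ptr !is original.ptr); // a now refers to a different block
}
```

The copy on every call is where the cost comes from, whichever way the operation is spelled.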

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

October 20, 2009
Andrei Alexandrescu wrote:
> grauzone wrote:
>> Andrei Alexandrescu wrote:
>>> Don wrote:
>>>> Walter Bright wrote:
>>>>> The purpose of T[new] was to solve the problems T[] had with passing T[] to a function and then the function resizes the T[]. What happens with the original?
>>>>>
>>>>> The solution we came up with was to create a third array type, T[new], which was a reference type.
>>>>>
>>>>> Andrei had the idea that T[new] could be dispensed with by making a "builder" library type to handle creating arrays by doing things like appending, and then delivering a finished T[] type. This is similar to what std.outbuffer and std.array.Appender do, they just need a bit of refining.
>>>>>
>>>>> The .length property of T[] would then become an rvalue only, not an lvalue, and ~= would no longer be allowed for T[].
>>>>>
>>>>> We both feel that this would simplify D, make it more flexible, and remove some awkward corner cases like the inability to say a.length++.
>>>>>
>>>>> What do you think?
>>>>
>>>> Since no one else seems to have said it: The fact that you're both willing to let it go, after having already invested a lot of time in it, is a good sign for the language. Well done.
>>>
>>> I'm relieved that somebody mentioned that :o). As soon as we gave up on T[new], people started to sell it to us. We should preemptively post about eliminating feature plans before actually implementing them.
>>>
>>> By the way: implementation of @property has been canceled.
>>
>> Yeah, let's just keep the language in the broken state it is, because we can't think of a better solution.
> 
> Silly me, I was thinking the humor was all too obvious.

It was only a joke? That's a relief.

> Andrei
October 20, 2009
On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright <newshound1@digitalmars.com> wrote:

> The purpose of T[new] was to solve the problems T[] had with passing T[] to a function and then the function resizes the T[]. What happens with the original?
>
> The solution we came up with was to create a third array type, T[new], which was a reference type.
>
> Andrei had the idea that T[new] could be dispensed with by making a "builder" library type to handle creating arrays by doing things like appending, and then delivering a finished T[] type. This is similar to what std.outbuffer and std.array.Appender do, they just need a bit of refining.
>
> The .length property of T[] would then become an rvalue only, not an lvalue, and ~= would no longer be allowed for T[].
>
> We both feel that this would simplify D, make it more flexible, and remove some awkward corner cases like the inability to say a.length++.
>
> What do you think?

At the risk of sounding like bearophile -- I've proposed 2 solutions in the past for this that *don't* involve creating a T[new] type.

1. Store the allocated length in the GC structure, then only allow appending when the length of the array being appended matches the allocated length.

2. Store the allocated length at the beginning of the array, and use a bit in the array length to determine if it starts at the beginning of the block.

The first solution has space concerns, and the second has lots more concerns, but can help in the case of having to do a GC lookup to determine if a slice can be appended (you'd still have to lock the GC to do an actual append or realloc).  I prefer the first solution over the second.
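
To make the first solution concrete, here is a toy model of the check, not actual GC code. Block, View and tryAppendInPlace are hypothetical names, and a real implementation would keep the length in the GC's own bookkeeping and take the GC lock:

```d
// Toy stand-in for one GC block; `used` plays the role of the stored allocated length.
struct Block
{
    int[] storage; // the whole block (its capacity)
    size_t used;   // how many elements have been handed out so far
}

// A slice into a Block, covering elements [start, start + length).
struct View
{
    Block* block;
    size_t start;
    size_t length;
}

// Extend in place only if the view ends exactly at block.used and room remains.
// Returns false when the caller would have to reallocate instead.
bool tryAppendInPlace(ref View v, int value)
{
    auto b = v.block;
    immutable endsAtUsed = (v.start + v.length == b.used);
    if (!endsAtUsed || b.used == b.storage.length)
        return false;
    b.storage[b.used] = value;
    ++b.used;
    ++v.length;
    return true;
}

void main()
{
    auto blk = new Block(new int[8], 3);
    auto whole  = View(blk, 0, 3);
    auto prefix = View(blk, 0, 2);
    auto stale  = whole;                   // copy taken before any append

    assert(!tryAppendInPlace(prefix, 99)); // would overwrite element 2
    assert(tryAppendInPlace(whole, 4));    // ends at used, so it may grow
    assert(!tryAppendInPlace(stale, 5));   // no longer ends at used
    assert(blk.used == 4 && whole.length == 4);
}
```

Only the view that ends where the used data ends can grow in place, so a second reference can never overwrite elements that another slice already owns.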

I like the current behavior *except* for appending.  Most of the time it does what you want, and the syntax is beautiful.

In regards to disallowing x ~= y, I'd propose you at least make it equivalent to x = x ~ y instead of removing it.

-Steve
October 20, 2009
On Tue, Oct 20, 2009 at 6:25 AM, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright <newshound1@digitalmars.com> wrote:
>
>> The purpose of T[new] was to solve the problems T[] had with passing T[] to a function and then the function resizes the T[]. What happens with the original?
>>
>> The solution we came up with was to create a third array type, T[new], which was a reference type.
>>
>> Andrei had the idea that T[new] could be dispensed with by making a "builder" library type to handle creating arrays by doing things like appending, and then delivering a finished T[] type. This is similar to what std.outbuffer and std.array.Appender do, they just need a bit of refining.
>>
>> The .length property of T[] would then become an rvalue only, not an lvalue, and ~= would no longer be allowed for T[].
>>
>> We both feel that this would simplify D, make it more flexible, and remove some awkward corner cases like the inability to say a.length++.
>>
>> What do you think?
>
> At the risk of sounding like bearophile -- I've proposed 2 solutions in the past for this that *don't* involve creating a T[new] type.
>
> 1. Store the allocated length in the GC structure, then only allow appending when the length of the array being appended matches the allocated length.
>
> 2. Store the allocated length at the beginning of the array, and use a bit in the array length to determine if it starts at the beginning of the block.
>
> The first solution has space concerns, and the second has lots more concerns, but can help in the case of having to do a GC lookup to determine if a slice can be appended (you'd still have to lock the GC to do an actual append or realloc).  I prefer the first solution over the second.
>
> I like the current behavior *except* for appending.  Most of the time it does what you want, and the syntax is beautiful.
>
> In regards to disallowing x ~= y, I'd propose you at least make it equivalent to x = x ~ y instead of removing it.

If you're going to do ~= a lot then you should convert to the dynamic array type.
If you're not going to do ~= a lot, then you can afford to write out x = x ~ y.

The bottom line is that it just doesn't make sense to append onto a "view" type.  It's really a kind of constness.  Having a view says the underlying memory locations you are looking at are fixed.  It doesn't make sense to imply there's an operation that can change those memory locations (other than shrinking the window to view fewer of them).
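
A small sketch of the "view" idea using ordinary D slices (the variable names are only illustrative): a slice shares memory with the array it came from, and shrinking the window changes what the slice sees without touching the underlying data.

```d
void main()
{
    int[] data = [10, 20, 30, 40, 50];

    int[] view = data[1 .. 4];   // window onto elements 1, 2 and 3
    view[0] = 99;                // writes through to the shared memory
    assert(data[1] == 99);

    view = view[0 .. 2];         // shrink the window to fewer elements
    assert(view == [99, 30]);
    assert(data.length == 5);    // the underlying array is unchanged
    assert(view.ptr is data.ptr + 1);
}
```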

--bb
October 20, 2009
On Tue, 20 Oct 2009 11:10:20 -0400, Bill Baxter <wbaxter@gmail.com> wrote:

> On Tue, Oct 20, 2009 at 6:25 AM, Steven Schveighoffer
> <schveiguy@yahoo.com> wrote:
>> On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright
>> <newshound1@digitalmars.com> wrote:
>>
>>> The purpose of T[new] was to solve the problems T[] had with passing T[]
>>> to a function and then the function resizes the T[]. What happens with the
>>> original?
>>>
>>> The solution we came up with was to create a third array type, T[new],
>>> which was a reference type.
>>>
>>> Andrei had the idea that T[new] could be dispensed with by making a
>>> "builder" library type to handle creating arrays by doing things like
>>> appending, and then delivering a finished T[] type. This is similar to what
>>> std.outbuffer and std.array.Appender do, they just need a bit of refining.
>>>
>>> The .length property of T[] would then become an rvalue only, not an
>>> lvalue, and ~= would no longer be allowed for T[].
>>>
>>> We both feel that this would simplify D, make it more flexible, and remove
>>> some awkward corner cases like the inability to say a.length++.
>>>
>>> What do you think?
>>
>> At the risk of sounding like bearophile -- I've proposed 2 solutions in the
>> past for this that *don't* involve creating a T[new] type.
>>
>> 1. Store the allocated length in the GC structure, then only allow appending
>> when the length of the array being appended matches the allocated length.
>>
>> 2. Store the allocated length at the beginning of the array, and use a bit
>> in the array length to determine if it starts at the beginning of the block.
>>
>> The first solution has space concerns, and the second has lots more
>> concerns, but can help in the case of having to do a GC lookup to determine
>> if a slice can be appended (you'd still have to lock the GC to do an actual
>> append or realloc).  I prefer the first solution over the second.
>>
>> I like the current behavior *except* for appending.  Most of the time it
>> does what you want, and the syntax is beautiful.
>>
>> In regards to disallowing x ~= y, I'd propose you at least make it
>> equivalent to x = x ~ y instead of removing it.
>
> If you're going to do ~= a lot then you should convert to the dynamic
> array type.
> If you're not going to do ~= a lot, then you can afford to write out x = x ~ y.
>
> The bottom line is that it just doesn't make sense to append onto a
> "view" type.  It's really a kind of constness.  Having a view says the
> underlying memory locations you are looking at are fixed.  It doesn't
> make sense to imply there's an operation that can change those memory
> locations (other than shrinking the window to view fewer of them).

Having the append operation extend into already allocated memory is an optimization.  In this case, it's an optimization that can corrupt memory.

If we can make append extend into already allocated memory *and* not cause corruption, I don't see the downside.  And then there is one less array type to deal with (and to write functions for, etc.).

Besides, I think Andrei's LRU solution is better than mine (and pretty much in line with it).

I still think having an Appender object or struct is a worthwhile thing; the "pre-allocate array then set length to zero" model is a hack at best.
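
As a rough sketch of the builder style, this is how it looks with std.array.appender as shipped in present-day Phobos (the API at the time of this thread may differ). The function collectEvens is a made-up example:

```d
import std.array : appender;

// Build the result in an Appender, then hand back a finished int[].
int[] collectEvens(const int[] input)
{
    auto app = appender!(int[])();
    app.reserve(input.length);   // optional: pre-allocate an upper bound
    foreach (x; input)
        if (x % 2 == 0)
            app.put(x);
    return app.data;             // the finished slice
}

void main()
{
    assert(collectEvens([1, 2, 3, 4, 6]) == [2, 4, 6]);
    assert(collectEvens([]).length == 0);
}
```

The appender keeps its own capacity bookkeeping, so the loop does not depend on any length-resetting trick.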

-Steve
October 20, 2009
Steven Schveighoffer wrote:
> I still think having an Appender object or struct is a worthwhile thing; the "pre-allocate array then set length to zero" model is a hack at best.

Would that work with Andrei's append cache at all? Setting the length to zero and then appending is like taking a slice of length 0 and then appending.

Maybe introduce a write/readable .capacity property, that magically accesses the cache/GC?

> -Steve
October 20, 2009
On Tue, Oct 20, 2009 at 8:50 AM, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> On Tue, 20 Oct 2009 11:10:20 -0400, Bill Baxter <wbaxter@gmail.com> wrote:
>
>> On Tue, Oct 20, 2009 at 6:25 AM, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>>>
>>> On Sun, 18 Oct 2009 17:05:39 -0400, Walter Bright <newshound1@digitalmars.com> wrote:
>>>
>>>> The purpose of T[new] was to solve the problems T[] had with passing T[] to a function and then the function resizes the T[]. What happens with the original?
>>>>
>>>> The solution we came up with was to create a third array type, T[new], which was a reference type.
>>>>
>>>> Andrei had the idea that T[new] could be dispensed with by making a "builder" library type to handle creating arrays by doing things like appending, and then delivering a finished T[] type. This is similar to what std.outbuffer and std.array.Appender do, they just need a bit of refining.
>>>>
>>>> The .length property of T[] would then become an rvalue only, not an lvalue, and ~= would no longer be allowed for T[].
>>>>
>>>> We both feel that this would simplify D, make it more flexible, and remove some awkward corner cases like the inability to say a.length++.
>>>>
>>>> What do you think?
>>>
>>> At the risk of sounding like bearophile -- I've proposed 2 solutions in the past for this that *don't* involve creating a T[new] type.
>>>
>>> 1. Store the allocated length in the GC structure, then only allow appending when the length of the array being appended matches the allocated length.
>>>
>>> 2. Store the allocated length at the beginning of the array, and use a bit in the array length to determine if it starts at the beginning of the block.
>>>
>>> The first solution has space concerns, and the second has lots more concerns, but can help in the case of having to do a GC lookup to determine if a slice can be appended (you'd still have to lock the GC to do an actual append or realloc).  I prefer the first solution over the second.
>>>
>>> I like the current behavior *except* for appending.  Most of the time it does what you want, and the syntax is beautiful.
>>>
>>> In regards to disallowing x ~= y, I'd propose you at least make it equivalent to x = x ~ y instead of removing it.
>>
>> If you're going to do ~= a lot then you should convert to the dynamic array type.
>> If you're not going to do ~= a lot, then you can afford to write out x = x ~ y.
>>
>> The bottom line is that it just doesn't make sense to append onto a "view" type.  It's really a kind of constness.  Having a view says the underlying memory locations you are looking at are fixed.  It doesn't make sense to imply there's an operation that can change those memory locations (other than shrinking the window to view fewer of them).
>
> Having the append operation extend into already allocated memory is an optimization.  In this case, it's an optimization that can corrupt memory.
>
> If we can make append extend into already allocated memory *and* not cause corruption, I don't see the downside.  And then there is one less array type to deal with (and to write functions for, etc.).
>
> Besides, I think Andrei's LRU solution is better than mine (and pretty much in line with it).
>
> I still think having an Appender object or struct is a worthwhile thing; the "pre-allocate array then set length to zero" model is a hack at best.

But you still have the problem Andrei posted.  Code like this:

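void func(int[] x)
{
    x ~= 3;     // may extend in place or move x to a new block
    x[0] = 42;  // reaches the caller's memory only if x did not move
}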

It'll compile and maybe run just fine, but there's no way to know whether the caller will see the 42 or not. Unpredictable behavior like that is a breeding ground for subtle bugs.
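
A runnable sketch of the same problem, assuming a current compiler and runtime (behavior in the 2009 runtime differed). The helper callerSeesWrite is hypothetical and only reports what the caller observes:

```d
import std.stdio : writeln;

void func(int[] x)
{
    x ~= 3;     // may or may not move x to a new block
    x[0] = 42;  // lands in the caller's memory only if x did not move
}

// Returns true if the caller observes the write made inside func.
// Expects a non-empty array.
bool callerSeesWrite(int[] a)
{
    a[0] = 0;
    func(a);
    return a[0] == 42;
}

void main()
{
    // Whole array: the answer depends on how much spare capacity the block has.
    int[] a = [1, 2, 3];
    writeln("caller sees 42 (whole array): ", callerSeesWrite(a));

    // Prefix of a larger array: appending in place would overwrite b[2],
    // so today's runtime reallocates and the caller's data stays intact.
    int[] b = new int[3];
    assert(!callerSeesWrite(b[0 .. 2]));
    assert(b.length == 3);      // the caller's length never changes
}
```

The first result is deliberately printed rather than asserted: it is exactly the part that cannot be predicted from the source code alone.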

Perhaps that potential for bugs can be reduced by turning off the LRU stuff in debug builds and making ~= always reallocate there. Since, as you said, it's an optimization, it makes sense to turn it on only in release or maybe optimized builds.

To Andrei, do you really feel comfortable trying to explain this in your book?  It seems like it will be difficult to explain that ~= is sometimes efficient for appending but not necessarily if you're working with a lot of arrays because it actually keeps this cache under the hood that may or may not remember the actual underlying capacity of the array you're appending to, so you should probably use ArrayBuilder if you can, despite the optimization.

--bb
October 20, 2009
grauzone wrote:
> Steven Schveighoffer wrote:
>> I still think having an Appender object or struct is a worthwhile thing; the "pre-allocate array then set length to zero" model is a hack at best.
> 
> Would that work with Andrei's append cache at all? Setting the length to zero and then appending is like taking a slice of length 0 and then appending.
> 
> Maybe introduce a write/readable .capacity property, that magically accesses the cache/GC?

For my money, I'd get rid of that trick:

a.length = 1000;
a.length = 0;
for (...) a ~= x;
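
For comparison, a sketch of the same intent without the length trick, assuming the reserve function and .capacity property that today's D runtime provides (neither was available when this was posted). The function name fill is made up:

```d
// Pre-allocate room for n elements, then append.
int[] fill(size_t n)
{
    int[] a;
    a.reserve(n);              // ask for capacity up front; length stays 0
    foreach (i; 0 .. n)
        a ~= cast(int) i;      // appends go into the reserved space
    return a;
}

void main()
{
    int[] a;
    a.reserve(1000);
    assert(a.length == 0);
    assert(a.capacity >= 1000); // reserve may round up

    auto filled = fill(1000);
    assert(filled.length == 1000);
    assert(filled[999] == 999);
}
```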


Andrei
October 20, 2009
Bill Baxter wrote:
> To Andrei, do you really feel comfortable trying to explain this in
> your book?  It seems like it will be difficult to explain that ~= is
> sometimes efficient for appending but not necessarily if you're
> working with a lot of arrays because it actually keeps this cache
> under the hood that may or may not remember the actual underlying
> capacity of the array you're appending to, so you should probably use
> ArrayBuilder if you can, despite the optimization.

I guess I'll try and let you all know.

Andrei