December 20, 2006
Andrei Alexandrescu (See Website For Email) wrote:
> Don Clugston wrote:
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Similarly, let's say that a group of revolutionaries convinces Walter (as I understand happened in case of using "length" and "$" inside slice expressions, which is a shame and an absolute disaster that must be undone at all costs) to implement "auto"
>>
>> This off-hand remark worries me. I presume that you mean being able to reference the length of a string, from inside the slice? (rather than simply the notation).
>> And the problem being that it requires a sliceable entity to know its length? Or is the problem more serious than that?
>> It's worrying because any change would break an enormous amount of code.
> 
> It would indeed break an enormous amount of code, but "all costs" includes "enormous costs". :o) A reasonable migration path is to deprecate them soon and make them illegal over the course of one year.
> 
> A small book could be written on just how bad language design is using "length" and "$" to capture slice size inside a slice expression. I managed to write two lengthy emails to Walter about them, and just barely got started. Long story short, "length" introduces a keyword through the back door, effectively making any use of "length" anywhere unrecommended and highly fragile. Using "$" is a waste of symbolic real estate to serve a narrow purpose; the semantics isn't naturally generalized to its logical conclusion; and the choice of symbol itself as a reminiscent of Perl's regexp is at best dubious ("#" would have been vastly better as it has count connotation in natural language, and making it into an operator would have fixed the generalization issue). As things stand now, the rules governing the popping up of "length" and "$" constitute a sudden boo-boo on an otherwise carefully designed expression landscape.
> 

Are you suggesting either / both:

slice = array[x .. array.length];
slice2 = array2[y .. #];

?

Since length and $ are pretty easily grep-able in the context of slice syntax, perhaps it's not a "huge" issue if they were deprecated now and then dropped over the span of a couple of 1.0.x releases or so (instead of a year)?
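To illustrate the fragility being referred to (this is my understanding of the current behavior, where the implicit 'length' silently wins inside the brackets):

void fragile()
{
    int length = 2;
    char[] s = "hello";
    auto t = s[0 .. length];  // inside the [], "length" means s.length (5),
                              // not the local variable -- assuming the implicit
                              // length really does shadow locals, t is "hello"
                              // rather than the intended "he"
}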

>> These issues you're raising seem to be far too fundamental to be fixed in the next few days, casting grave doubts on whether a D1.0 release on Jan 1 is a good idea.
> 
> The lvalue/rvalue issue is fundamental. I'm not in the position to assess whether it's a maker or breaker of D 1.0.
> 

Since one of the main drivers for 1.0 by Jan. 1, 2007 is to encourage / solidify library development, and since library design could be affected in a large way by this issue, I'd say it's best to figure this out before releasing 1.0.

> The "length"/"$" issue is not fundamental the same way that C's declaration syntax, Java's throw specifications, C++'s use of "<" and ">" for templates, and Mao Zedong's refusal to use a toothbrush are not fundamental. It will "just" go down in history as a huge embarrassment and a good resource for cheap shooters and naysayers. If I understand its genesis, it will also be a canonical example of why design by committee is bad.
> 

If indeed it will be an embarrassment, better to take care of this sooner (pre-1.0) rather than later, IMHO.

Thanks,

- Dave
December 20, 2006
Andrei Alexandrescu (See Website For Email) wrote on 2006-12-20:
> Thomas Kuehne wrote:
>>> We then discussed another solution that I won't bore you with, as it's so wrong it hurts. My current thoughts navigate around two possible solutions. One is to make the storage part of the template parameters:
>>>
>>> template ident(S T) {
>>>    S T ident(S T e) { return e; }
>>> }
>>>
>>> When two adjacent symbols appear in a template parameter list, they unambiguously denote a storage class followed by a type. So "S" can bind to things like "in", "inout" etc., while "T" can bind to types.
>> 
>> Unambiguously?
>> 
>> template Templ_1(int i) {
>> }
>> 
>> Is "int" now a type or a storage class?
>
> It's a type because the symbol "int" is already bound as a keyword. There's no way (to the best of my knowledge) to specify two adjacent non-keyword symbols in a template parameter list.

enum S{ FOO }
template Templ(S T) { }
mixin Templ!(S.FOO) bar;

Do you consider S a keyword here?
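To spell out the ambiguity as I see it:

enum S { FOO }

// today:               T is a value parameter of type S, so Templ!(S.FOO) is legal
// under the proposal:  "S T" would have to be read as storage class + type,
//                      yet S is an ordinary user symbol, not a keyword
template Templ(S T) { }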

Thomas


December 20, 2006
Andrei Alexandrescu (See Website For Email) wrote:
> Let me illustrate further why ident is important and what solution we should have for it. Consider C's response to ident:
> 
> #define IDENT(e) (e)
> 
> ...
>
> ...leading to the following implementation of ident:
> 
> auto ident(auto x) {
>   return x;
> }

I don't get it.

Why is it necessary (or even desirable) for functions to return lvalues?

I can see how it'd be an interesting trick, and I can appreciate the experimental curiosity about how the language (and the implementation) should cope with the explicit handling of lvalues.

But I can't think of a real-world use case.

Are there languages where this is currently possible? How do they implement it? And, much more importantly, what do people use it for?

--benji
December 20, 2006
On Wed, 20 Dec 2006 06:24:28 -0800, Andrei Alexandrescu (See Website For Email) wrote:


> A small book could be written on just how bad language design is using "length" and "$" to capture slice size inside a slice expression. I managed to write two lengthy emails to Walter about them, and just barely got started.

Please share your thoughts here if you can too.

> Long story short, "length" introduces a keyword through the back door, effectively making any use of "length" anywhere unrecommended and highly fragile.

There is no argument from me on that score.

> Using "$" is a waste of symbolic real estate to serve a narrow purpose;

By that do you mean that the symbol "$" could be better utilized elsewhere in the language?

> the semantics isn't naturally generalized to its logical conclusion;

I have no idea what that statement means. The semantics of *what*? Define "naturally generalized" in absolute terms without recourse to opinion. What is the "logical conclusion" you talk of?

> and the choice of symbol itself as a reminiscent of Perl's regexp is at best dubious ("#" would have been vastly better as it has count connotation in natural language

The concept was to have a very short symbol to represent the array's current element count. A number of different symbols were put forward, "#" being one of them. Walter rejected that one because it was already used for something else - the 'special token sequences' construct, such as "#line". Also symbols that consisted of identifier characters all have the same problem as "length" does. I favoured "$" because it was a single-character symbol and is already used in similar concepts inside regular expression syntaxes.
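(For reference, the 'special token sequence' construct in question looks like this; the file name is just an example:)

#line 42 "some_module.d"  // from here on, the compiler reports errors and
                          // __LINE__/__FILE__ as if the next line were line 42
                          // of some_module.d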

Although the exact symbol that is 'finally' decided upon is not a burning issue for me, there would have to be a very solid argument for the provable superiority of any one symbol over the rest. And currently, "$" and "#" are equivalent in my mind.

> and making it into an operator would have fixed the generalization issue

Would this lead to an opLength() method available for overloading?

> As things stand now, the rules governing the popping up of "length" and "$" constitute a sudden boo-boo on an otherwise carefully designed expression landscape.

Still sounds like an opinion and not a fact, in my opinion ;-)

-- 
Derek
December 20, 2006
Derek Parnell wrote:
> The concept was to have a very short symbol to represent the array's
> current element count. A number of different symbols were put forward, "#"
> being one of them. Walter rejected that one because it was already used for
> something else - the 'special token sequences' construct, such as "#line".
> Also symbols that consisted of identifier characters all have the same
> problem as "length" does. I favoured "$" because it was a single-character
> symbol and is already used in similar concepts inside regular expression
> syntaxes.

The "[..$]" syntax is also present in ColdC and its relatives (including my Bovis), so it was familiar to me from the beginning.  (That said I still harbor thoughts that $ could be used for other things... but honestly, I think the syntax would be unambiguous: a lone $ as the right hand side of a slice expression should easily enough be distinguishable from a $ anywhere followed by something, like an identifier.)

Which leads me to another thought.  One other operator that ColdC and family possesses is the @ for list splicing.  Useless sample ColdC:

# var foo, bar, result;
#
# foo = {1, 2, 3};
# bar = {4, 5, 6};
# result = {@foo, @bar};

The 'result' variable now equals {1, 2, 3, 4, 5, 6}.  Perhaps we could get something similar in D, also using @, and then I think @ would possibly be a logical choice for denoting "until end" in slices.  I also think an 'opLength' operator is not a bad idea, but it might be best just to require a .length property of some kind be present, since presumably any class that exposes itself to slicing would likely also have a length concept of some kind.  (Contrary examples welcome.)
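For instance, something along these lines (a hypothetical sketch -- whether "$" would map to an opLength() or to a plain .length property is exactly the open question):

class Ring
{
    private int[] items;

    // any type that exposes slicing almost certainly has a length concept;
    // this could just as well be a hypothetical opLength()
    size_t length() { return items.length; }

    int[] opSlice(size_t i, size_t j) { return items[i .. j]; }
}

// hypothetically, r[1 .. $] could then be rewritten as r.opSlice(1, r.length)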

-- Chris Nicholson-Sauls
December 20, 2006
Benji Smith wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Let me illustrate further why ident is important and what solution we should have for it. Consider C's response to ident:
>>
>> #define IDENT(e) (e)
>>
>  > ...
>  >
>> ...leading to the following implementation of ident:
>>
>> auto ident(auto x) {
>>   return x;
>> }
> 
> I don't get it.
> 
> Why is it necessary (or even desirable) for functions to return lvalues?

For functions, I essentially agree.  There aren't many use cases for it, and for the few niche ones a pointer will often suffice.  But for methods of classes on the other hand, it can be valuable at times.
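(By the pointer workaround I mean something like this throwaway sketch:)

int* at(int[] a, size_t i)
{
    return &a[i];      // hand back a pointer instead of an lvalue
}

void demo()
{
    int[] arr = [1, 2, 3];
    *at(arr, 1) = 42;  // arr is now [1, 42, 3]
}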

> Are there languages where this is currently possible?

C++, by returning a reference.

-- Chris Nicholson-Sauls
December 20, 2006
Chris Nicholson-Sauls wrote:
> The "[..$]" syntax is also present in ColdC and its relatives (including my Bovis), so it was familiar to me from the beginning.  (That said I still harbor thoughts that $ could be used for other things... but honestly, I think the syntax would be unambiguous: a lone $ as the right hand side of a slice expression should easily enough be distinguishable from a $ anywhere followed by something, like an identifier.)
> 

FWIW $ is not only used for the RHS of a slice

char[] str;
str[$/2..$];	// 2nd half of array
str[$-1];	// last element in array
str[$-5..$];	// last 5 things in array
str[$-10..10];	// um... well... you get the idea

> Which leads me to another thought.  One other operator that ColdC and family possesses is the @ for list splicing.  Useless sample ColdC:
> 
> # var foo, bar, result;
> #
> # foo = {1, 2, 3};
> # bar = {4, 5, 6};
> # result = {@foo, @bar};
> 
> The 'result' variable now equals {1, 2, 3, 4, 5, 6}.

I would think this would be the same thing.

auto foo = [1,2,3];
auto bar = [4,5,6];
auto result = foo ~ bar;

am I missing something?
> 
> -- Chris Nicholson-Sauls
December 20, 2006
BCS wrote:
> Chris Nicholson-Sauls wrote:
>> The "[..$]" syntax is also present in ColdC and its relatives (including my Bovis), so it was familiar to me from the beginning.  (That said I still harbor thoughts that $ could be used for other things... but honestly, I think the syntax would be unambiguous: a lone $ as the right hand side of a slice expression should easily enough be distinguishable from a $ anywhere followed by something, like an identifier.)
>>
> 
> FWIW $ is not only used for the RHS of a slice
> 
> char[] str;
> str[$/2..$];    // 2nd half of array
> str[$-1];    // last element in array
> str[$-5..$];    // last 5 things in array
> str[$-10..10];    // um... well... you get the idea

Ack.  Having never used anything quite like that before, I guess I had assumed the $ only had meaning as I described above.  Still, it could be possible.

>> Which leads me to another thought.  One other operator that ColdC and family possesses is the @ for list splicing.  Useless sample ColdC:
>>
>> # var foo, bar, result;
>> #
>> # foo = {1, 2, 3};
>> # bar = {4, 5, 6};
>> # result = {@foo, @bar};
>>
>> The 'result' variable now equals {1, 2, 3, 4, 5, 6}.
> 
> I would think this would be the same thing.
> 
> auto foo = [1,2,3];
> auto bar = [4,5,6];
> auto result = foo ~ bar;
> 
> am I missing something?

Not really, no.  But consider:

# ColdC                          D
#
# result = {@foo, 0, @bar};      result = foo ~ [0] ~ bar;
# result = {42, @someFunc()};    result = [42] ~ someFunc();
# result = {@foo, 1, @foo, 2};   result = foo ~ [1] ~ foo ~ [2];
# result = {3, 6, @myConst, 9};  result = [3, 6] ~ myConst.dup ~ [9];

It becomes part of the literal syntax, which makes things cleaner in more elaborate cases.  Just something I enjoy over there that I wouldn't mind seeing from time to time over here.  :)

-- Chris Nicholson-Sauls
December 20, 2006
Chris Nicholson-Sauls wrote:
> BCS wrote:
>> Chris Nicholson-Sauls wrote:
>>> The "[..$]" syntax is also present in ColdC and its relatives (including my Bovis), so it was familiar to me from the beginning.  (That said I still harbor thoughts that $ could be used for other things... but honestly, I think the syntax would be unambiguous: a lone $ as the right hand side of a slice expression should easily enough be distinguishable from a $ anywhere followed by something, like an identifier.)
>>>
>>
>> FWIW $ is not only used for the RHS of a slice
>>
>> char[] str;
>> str[$/2..$];    // 2nd half of array
>> str[$-1];    // last element in array
>> str[$-5..$];    // last 5 things in array
>> str[$-10..10];    // um... well... you get the idea
> 
> Ack.  Having never used anything quite like that before, I guess I had assumed the $ only had meaning as I described above.  Still, it could be possible.

These are very possible and are at times very useful. Though the last example is perhaps pushing it in the latter department ;).
But yeah, $ is valid inside any [] pair, whether as (part of) an index or as (part of) either side of a slice.

>>> Which leads me to another thought.  One other operator that ColdC and family possesses is the @ for list splicing.  Useless sample ColdC:
>>>
>>> # var foo, bar, result;
>>> #
>>> # foo = {1, 2, 3};
>>> # bar = {4, 5, 6};
>>> # result = {@foo, @bar};
>>>
>>> The 'result' variable now equals {1, 2, 3, 4, 5, 6}.
>>
>> I would think this would be the same thing.
>>
>> auto foo = [1,2,3];
>> auto bar = [4,5,6];
>> auto result = foo ~ bar;
>>
>> am I missing something?
> 
> Not really, no.  But consider:
> 
> # ColdC                          D
> #
> # result = {@foo, 0, @bar};      result = foo ~ [0] ~ bar;

You don't need to surround a non-array element with [] when concatenating it with an array of that element type:
                                   result = foo ~ 0 ~ bar;
will work just fine.

> # result = {42, @someFunc()};    result = [42] ~ someFunc();

                                   result = 42 ~ someFunc();

> # result = {@foo, 1, @foo, 2};   result = foo ~ [1] ~ foo ~ [2];

                                   result = foo ~ 1 ~ foo ~ 2;

> # result = {3, 6, @myConst, 9};  result = [3, 6] ~ myConst.dup ~ [9];

                                   result = [3, 6] ~ myConst ~ 9;

The first []s are still needed, unless you add parentheses to group the 6 with myConst (and possibly the 9), but the last ones are unnecessary. Also, .dup is completely useless here (presuming myConst is an array) since ~ always allocates. (Note that repeated ~s in the same expression only allocate once, though, at least in DMD.)
December 21, 2006
Chris Nicholson-Sauls wrote:
> BCS wrote:
>> Chris Nicholson-Sauls wrote:
> Not really, no.  But consider:
> 
> # ColdC                          D
> #
> # result = {@foo, 0, @bar};      result = foo ~ [0] ~ bar;
> # result = {42, @someFunc()};    result = [42] ~ someFunc();
> # result = {@foo, 1, @foo, 2};   result = foo ~ [1] ~ foo ~ [2];
> # result = {3, 6, @myConst, 9};  result = [3, 6] ~ myConst.dup ~ [9];
> 
> It becomes part of the literal syntax, which makes things cleaner in more elaborate cases.  Just something I enjoy over there that I wouldn't mind seeing from time to time over here.  :)

I think that syntax is a little more attractive than ~ for some cases. It makes it clearer that you're building a list.  We don't say [] ~ 1 ~ 2 ~ 3 to make the array [1,2,3], after all.  But for that reason (because it's generalizing array literals) I think it should use [] instead of {}.  So

  result = [@foo, 0, @bar];

To me that certainly does make it clearer that I'm making a list out of lists.
Having @ as a special symbol just for this kind of bothers me, though.  And how would it work for user-defined types?  I guess it could just be turned into the equivalent opCat calls...
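Roughly the rewrite I'm imagining (the @ form is of course hypothetical syntax):

void example()
{
    auto foo = [1, 2, 3];
    auto bar = [4, 5, 6];

    // hypothetical:   auto result = [@foo, 0, @bar];
    // could be lowered to something like:
    auto result = foo ~ [0] ~ bar;               // built-in arrays
    // ...or, for a user-defined type, the equivalent opCat calls:
    // auto result = foo.opCat([0]).opCat(bar);
}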

--bb