May 11, 2009
On Sat, 09 May 2009 19:15:59 -0400, Derek Parnell <derek@psych.ward> wrote:

> On Sat, 09 May 2009 11:43:09 -0500, Andrei Alexandrescu wrote:
>
>> Consider:
>>
>> uint fun();
>> int gun();
>> ...
>> int[] a = new int[5];
>> a[fun] = gun;
>>
>> Which should be evaluated first, fun() or gun()? It's a rather arbitrary
>> decision. C/C++ don't even define an order. Python chooses
>> left-to-right, EXCEPT for assignment, which is right-hand side first.
>> Lisp and C# choose consistent left-to-right. I don't like exceptions and
>> I'd like everything to be left-to-right. However, this leads to some odd
>> cases. Consider this example in TDPL:
>>
>> import std.stdio, std.string;
>>
>> void main() {
>>    uint[string] dic;
>>    foreach (line; stdin.byLine) {
>>      string[] words = split(strip(line));
>>      foreach (word; words) {
>>        if (word in dic) continue; // nothing to do
>>        uint newID = dic.length;
>>        dic[word] = newID;
>>        writeln(newID, '\t', word);
>>      }
>>    }
>> }
>>
>> If we want to get rid of newID, we'd write:
>>
>>        writeln(dic.length, '\t', word);
>>        dic[word] = dic.length;
>>
>> by the Python rule, and
>>
>>        writeln(dic.length, '\t', word);
>>        dic[word] = dic.length - 1;
>>
>> by the C# rule.
>>
>> What's best?
>
> I'm sure about 'best', but I'd prefer the Python method.

Think you meant 'not sure' :)

>
> The example is similar to ...
>
>     array = array ~ array.length;
>
> in as much as the result of the assignment is that the array length
> changes, but here it is easier to see that the pre-assignment length is
> being used by the RHS.
>
> In COBOL-like syntax ...
>
>    move dic.length to dic[word].
>
> it is also more obvious what the coder's intentions were.
>
> In assembler-like syntax (which is what eventually gets run, of course) ...
>
>    mov regA, dic.length
>    mov dic[word], regA
>
> It just seems counter-intuitive that the target expression's side-effects
> should influence the source expression.
>

This reasoning makes the most sense, but let's leave COBOL out of it :)

I vote for the Python method too.  It's how my brain sees the expression.

Also consider this:

uint len;

mydic[x] = len = mydic.length;

Now, it's even more obvious that len = mydic.length should be immune to the effects of mydic[x].  Longer chained assignment expressions seem like they would make the problem even harder to understand if it's all evaluated left to right.  You may even make code more bloated because of it.

For example:

mydic[x] = mydic[y] = mydic[z] = mydic.length;

If evaluating right to left, this looks like:

1. calculate mydic.length, store it in register A.
2. lookup mydic[z], if it doesn't exist, add it.  Store register A to it.
3. lookup mydic[y], if it doesn't exist, add it.  Store register A to it.
4. ditto for mydic[x]

If evaluating left to right, this looks like:

1. lookup mydic[x], if it doesn't exist, add it.  Store a reference to it on the stack.
2. lookup mydic[y], if it doesn't exist, add it.  Store a reference to it on the stack.
3. lookup mydic[z], if it doesn't exist, add it.  Store the reference to it in register B.
4. calculate mydic.length, store it in register A.  Store the result in the reference pointed to by register B.
5. pop register B from the stack, store register A to the value it references.
6. Repeat step 5.

Two extra steps, and I have to use a stack.  Maybe 3 chained assignments would be easy to store without a stack, but try 10 chained assignments.

I'd think the compiler code to evaluate right to left would be simpler also, because you can reduce the expression at every assignment.
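
As a rough illustration of the two step lists above, here is a sketch in D that spells out each order with explicit temporaries, so nothing depends on what a particular compiler does with the chained assignment. The function names chainRightToLeft and chainLeftToRight are made up for this example, and the left-to-right version looks each slot up again by key instead of keeping references on a stack, which is a simplification.

```d
import std.stdio;

// Right to left: compute the value once, then store it into each slot.
uint chainRightToLeft()
{
    uint[string] mydic;
    uint a = cast(uint) mydic.length; // step 1: length is still 0
    mydic["z"] = a;                   // step 2
    mydic["y"] = a;                   // step 3
    mydic["x"] = a;                   // step 4
    return mydic["x"];
}

// Left to right: every slot is created before the length is read.
uint chainLeftToRight()
{
    uint[string] mydic;
    mydic["x"] = 0;                   // step 1: slot exists now
    mydic["y"] = 0;                   // step 2
    mydic["z"] = 0;                   // step 3
    uint a = cast(uint) mydic.length; // step 4: length is already 3
    mydic["z"] = a;                   // rest of step 4
    mydic["y"] = a;                   // step 5
    mydic["x"] = a;                   // step 6
    return mydic["x"];
}

void main()
{
    writeln("right to left stores ", chainRightToLeft()); // 0
    writeln("left to right stores ", chainLeftToRight()); // 3
}
```

The two orders store different values (0 versus 3 here), so the choice is visible to the programmer and is not only a code-generation detail.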

-Steve
May 11, 2009
On Mon, 11 May 2009 07:34:38 -0400, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> For example:
>
> mydic[x] = mydic[y] = mydic[z] = mydic.length;
>
> If evaluating right to left, this looks like:
>
> 1. calculate mydic.length, store it in register A.
> 2. lookup mydic[z], if it doesn't exist, add it.  Store register A to it.
> 3. lookup mydic[y], if it doesn't exist, add it.  Store register A to it.
> 4. ditto for mydic[x]
>
> If evaluating left to right, this looks like:
>
> 1. lookup mydic[x], if it doesn't exist, add it.  Store a reference to it on the stack.
> 2. lookup mydic[y], if it doesn't exist, add it.  Store a reference to it on the stack.
> 3. lookup mydic[z], if it doesn't exist, add it.  Store the reference to it in register B.
> 4. calculate mydic.length, store it in register A.  Store the result in the reference pointed to by register B.
> 5. pop register B from the stack, store register A to the value it references.
> 6. Repeat step 5.
>
> Two extra steps, and I have to use a stack.  Maybe 3 chained assignments would be easy to store without a stack, but try 10 chained assignments.
>
> I'd think the compiler code to evaluate right to left would be simpler also, because you can reduce the expression at every assignment.

BTW, I'm curious to know how Java does this...

-Steve
May 11, 2009
Michel Fortin wrote:

>           arra[i++] = arrb[j]; // how can the compiler issue an
>           error for this?


assert( &i != &j);

-manfred

May 11, 2009
On Mon, 11 May 2009 08:20:07 -0400, Manfred Nowak <svv1999@hotmail.com> wrote:

> Michel Fortin wrote:
>
>>           arra[i++] = arrb[j]; // how can the compiler issue an
>>           error for this?
>
>
> assert( &i != &j);
>
> -manfred

That is not a compiler error, it is an inserted runtime error.
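
To illustrate why the check lands at run time: whether the two ref parameters alias depends on the caller, so the body of func alone cannot decide it. A minimal sketch in D, where sameVariable is a hypothetical helper name and not anything from the thread:

```d
import std.stdio;

// Hypothetical helper: true when both ref parameters name the same variable.
bool sameVariable(ref int i, ref int j)
{
    return &i is &j; // compares the addresses of the two referents
}

void main()
{
    int a, b;
    writeln(sameVariable(a, b)); // false: two distinct variables
    writeln(sameVariable(a, a)); // true: both parameters alias a
}
```

The same function body gives both answers, which is why an assert on the addresses can only fire once the program runs.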

-Steve
May 11, 2009
Michel Fortin Wrote:

> On 2009-05-11 05:49:01 -0400, Georg Wrede <georg.wrede@iki.fi> said:
> 
> > Andrei Alexandrescu wrote:
> >> Consider:
> >> 
> >> uint fun();
> >> int gun();
> >> ...
> >> int[] a = new int[5];
> >> a[fun] = gun;
> >> 
> >> Which should be evaluated first, fun() or gun()?
> > 
> > arra[i] = arrb[i++];
> > 
> > arra[i++] = arrb[i];
> > 
> > I'm not sure that such dependences are good code.
> > 
> > By stating a definite order between lvalue and rvalue, you would actually encourage this kind of code.
> 
> Well, I agree with you that we shouldn't encourage this kind of code. But leaving it undefined (as in C) isn't a good idea because even if it discourages people from relying on it, it also makes any well tested code potentially buggy when switching compiler.
> 
> You could simply make it an error in the language to avoid that being written in the first place. But even then you can't catch all the cases statically. For instance, two different pointers or references can alias the same value, as in:
> 
> 	int i;
> 	func(i, i);
> 
> 	void func(ref int i, ref int j)
> 	{
> 		arra[i++] = arrb[j]; // how can the compiler issue an error for this?
> 	}

D2 could have no ordering guarantees, and simply give an error when reordering could affect impure operations. Flow analysis could relax this rule a bit. Local primitives that have not escaped are immune to side effects affecting other variables.
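
A small sketch of the distinction such a rule would rely on, written in D with explicit temporaries so that nothing depends on the compiler's actual evaluation order. The names calls, nextId, twice, indexFirst and valueFirst are invented for the example.

```d
import std.stdio;

int calls; // mutable module-level state (hypothetical)

// Impure: every call changes `calls`, so call order is observable.
int nextId() { return calls++; }

// Pure: the result depends only on the argument, so order cannot matter.
int twice(int x) pure { return 2 * x; }

// Emulates a[nextId()] = nextId() with the index evaluated first.
int[2] indexFirst()
{
    calls = 0;
    int[2] a;
    int index = nextId(); // 0
    int value = nextId(); // 1
    a[index] = value;
    return a;             // [1, 0]
}

// The same statement with the right-hand side evaluated first.
int[2] valueFirst()
{
    calls = 0;
    int[2] a;
    int value = nextId(); // 0
    int index = nextId(); // 1
    a[index] = value;
    return a;             // [0, 0]
}

void main()
{
    writeln(indexFirst());          // [1, 0]
    writeln(valueFirst());          // [0, 0]
    writeln(twice(3) + twice(4));   // 14 in either order
}
```

With the impure nextId the two orders disagree, while the operands built from twice give the same sum either way.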

> So even if you make it an error for the obvious cases, you still need to define the evaluation order for the ones the compiler can't catch.
> 
> And, by the way, I don't think we should make it an error even for the so-called obvious cases. Deciding what's obvious and what is not is going to complicate the rules more than necessary.
> 
> 
> -- 
> Michel Fortin
> michel.fortin@michelf.com
> http://michelf.com/
> 

May 11, 2009
Consider that mathematically speaking, an array is a function. And an assignment to an array element actually changes the function.

A[i] = E;

is actually the same as

A = A[E/i];,

where the right-hand side reads: "A where i yields E" (notation not to be confused with division). It is formally defined:

A[E/i][j] == E    (if i == j)
             A[j] (if i != j).

Of course, there are no side-effects in mathematics, but I believe it's beneficial to try to keep as many well-known mathematical identities (like that one) valid in the face of chaos.

So your first example would then be equivalent to

a = a[gun/fun];,

which still leaves the question of side-effect evaluation order. The second example would read:

dic = dic[dic.length/word];,

which would suggest using the old dic.length.
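
The identity can be sketched in D as a functional update. withEntry is a hypothetical helper standing in for the A[E/i] notation; it copies the table rather than mutating it, which is only meant to mirror the mathematics, not to suggest an implementation.

```d
import std.stdio;

// Models A[E/i]: a copy of `a` in which key `i` yields `e`.
uint[string] withEntry(uint[string] a, string i, uint e)
{
    uint[string] result;
    foreach (key, value; a)
        result[key] = value; // keep every existing mapping
    result[i] = e;           // key i now yields e
    return result;
}

void main()
{
    uint[string] dic;
    // dic = dic[dic.length/word]: the arguments are evaluated before the
    // call, so the length seen is the one before the new entry exists.
    dic = withEntry(dic, "hello", cast(uint) dic.length);
    dic = withEntry(dic, "world", cast(uint) dic.length);
    writeln(dic["hello"], " ", dic["world"]); // 0 1
}
```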

-- 
Michiel Helvensteijn

May 11, 2009
Steven Schveighoffer wrote:
> On Mon, 11 May 2009 08:20:07 -0400, Manfred Nowak <svv1999@hotmail.com> wrote:
> 
>> Michel Fortin wrote:
>>
>>>           arra[i++] = arrb[j]; // how can the compiler issue an
>>>           error for this?
>>
>>
>> assert( &i != &j);
>>
>> -manfred
> 
> That is not a compiler error, it is an inserted runtime error.

Besides, it's just a particular case. Generally you can't tell modularly whether two expressions change the same variable.

Andrei
May 11, 2009
Jason House wrote:
> Michel Fortin Wrote:
> 
>> On 2009-05-11 05:49:01 -0400, Georg Wrede <georg.wrede@iki.fi> said:
>>
>>> Andrei Alexandrescu wrote:
>>>> Consider:
>>>>
>>>> uint fun();
>>>> int gun();
>>>> ...
>>>> int[] a = new int[5];
>>>> a[fun] = gun;
>>>>
>>>> Which should be evaluated first, fun() or gun()?
>>> arra[i] = arrb[i++];
>>>
>>> arra[i++] = arrb[i];
>>>
>>> I'm not sure that such dependences are good code.
>>>
>>> By stating a definite order between lvalue and rvalue, you would actually encourage this kind of code.
>>
>> Well, I agree with you that we shouldn't encourage this kind of code. But leaving it undefined (as in C) isn't a good idea because even if it discourages people from relying on it, it also makes any well tested code potentially buggy when switching compiler.
> 
> D2 could have no ordering guarantees, and simply give an error when
> reordering could affect impure operations. Flow analysis could relax
> this rule a bit. Local primitives that have not escaped are immune to
> side effects affecting other variables.
> 
>> So even if you make it an error for the obvious cases, you still need to define the evaluation order for the ones the compiler can't catch.


C didn't define it for good reason. It should not be used, period.

Defining it in any way, or forbidding it, both mean that the compiler writer has to write lines of code to *try* to analyse it somehow. D is not a language for the infantile (even if I strongly advocate its use in language education), so we don't have to make this a bicycle with assist-wheels.

Walter's time is better spent on things that give more reward and take less of his time. And Andrei's, too.
May 11, 2009
Georg Wrede wrote:
> Andrei Alexandrescu wrote:
>> Consider:
>>
>> uint fun();
>> int gun();
>> ...
>> int[] a = new int[5];
>> a[fun] = gun;
>>
>> Which should be evaluated first, fun() or gun()? 
> 
> arra[i] = arrb[i++];
> 
> arra[i++] = arrb[i];
> 
> I'm not sure that such dependences are good code.
> 
> By stating a definite order between lvalue and rvalue, you would actually encourage this kind of code.

By not stating it, I introduce a gratuitous nonportability.

Andrei
May 11, 2009
Steven Schveighoffer wrote:
> For example:
> 
> mydic[x] = mydic[y] = mydic[z] = mydic.length;


I distinctly remember Walter discouraging chained assignments in the docs, already in the very early versions of D.