December 23, 2013
On 12/21/2013 6:21 AM, bearophile wrote:
> aldanor:
>
>> So should this considered a bug then and be filed?
>
> I think so. I have a related EnhancementRequest open, but I have to
> close it down or modify it...
>
> Bye,
> bearophile

int mightUpdate(int& x)
{
    ...
    return x;
}

{
    ...
    a[mightUpdate(i)] = mightUpdate(i);
    ...
}

Is this also a bug?  How would the compiler know whether to emit a diagnostic for this case?

Dave

December 24, 2013
On 12/23/2013 12:39 PM, David Held wrote:
> On 12/21/2013 6:21 AM, bearophile wrote:
>> aldanor:
>>
>>> So should this considered a bug then and be filed?
>>
>> I think so. I have a related EnhancementRequest open, but I have to
>> close it down or modify it...
>>
>> Bye,
>> bearophile
>
> int mightUpdate(int& x)
> {
>     ...
>     return x;
> }
>
> {
>     ...
>     a[mightUpdate(i)] = mightUpdate(i);
>     ...
> }
>
> Is this also a bug?  How would the compiler know whether to emit a diagnostic for this case?
>
> Dave
>
>
Yes, that should be an error.  And the compiler could notice that the function parameter values are allowed to be modified, and forbid such use.  As it should.
After all, who's going to know which call would modify the value of i?  Of course, it could be protocol that the functions included within the statement are all called before the local values are calculated, and that if the same syntax appears twice, the final value of the syntax is the value used...but that, while reasonable for the compiler, is hell on people reading the code.

Note that one can make several different plausible arguments as to the order in which the code should be evaluated.  When such is possible, the code should be an error.  And when it relies on rarely used rules, it should probably be an error.  (E.g., if you define that the location into which values will be stored must be evaluated before the expression to be stored is evaluated, you have a case that would be well defined, but it's a "rare event" kind of rule, and unless there's good reason, it should be an error.)

-- 
Charles Hixson

December 24, 2013
On Mon, Dec 23, 2013 at 04:12:05PM -0800, Charles Hixson wrote:
> On 12/23/2013 12:39 PM, David Held wrote:
[...]
> >int mightUpdate(int& x)
> >{
> >    ...
> >    return x;
> >}
> >
> >{
> >    ...
> >    a[mightUpdate(i)] = mightUpdate(i);
> >    ...
> >}
> >
> >Is this also a bug?  How would the compiler know whether to emit a diagnostic for this case?
> >
> >Dave
> >
> >
> Yes, that should be an error.  And the compiler could notice that the function parameter values are allowed to be modified, and forbid such use.  As it should.  After all, who's going to know which call would modify the value of i?  Of course, it could be protocol that the functions included within the statement are all called before the local values are calculated, and that if the same syntax appears twice, the final value of the syntax is the value used...but that, while reasonable for the compiler, is hell on people reading the code.
> 
> Note that one can make several different plausible arguments as to the order in which the code should be evaluated.  When such is possible, the code should be an error.  And when it relies on rarely used rules, it should probably be an error.  (E.g., if you define that the location into which values will be stored must be evaluated before the expression to be stored is evaluated, you have a case that would be well defined, but it's a "rare event" kind of rule, and unless there's good reason, it should be an error.)
[...]

Agreed. Note that introducing assignment into the mix may not help matters, but complicate them even more. For example:

	int x=1, y=0;
	writeln((y = ++x) + (++y--) * (x=y)); // what does this print?

Or worse yet:

	int func1(int x) { ... }
	int func2(int x) { ... }
	int func3(int x) { ... }

	int function(int)[] funcptrs = [ &func1, &func2, &func3 ];

	int x=0, y=0;
	y += funcptrs[++x--](y = --x++); // what does this do?

The only place I can see where you'd even *want* to write code like this is in an entry to the IOCCC. Make it refuse to compile, I say.


T

-- 
"Holy war is an oxymoron." -- Lazarus Long
December 24, 2013
On 12/23/2013 4:12 PM, Charles Hixson wrote:
> On 12/23/2013 12:39 PM, David Held wrote:
>>[...]
>> int mightUpdate(int& x)
>> {
>>     ...
>>     return x;
>> }
>>
>> {
>>     ...
>>     a[mightUpdate(i)] = mightUpdate(i);
>>     ...
>> }
>>
>> Is this also a bug?  How would the compiler know whether to emit a
>> diagnostic for this case?
>>
> Yes, that should be an error.  And the compiler could notice that the
> function parameter values are allowed to be modified, and forbid such
> use.  As it should.
> [...]

Ok, let's make it more interesting...

void undefinedBehavior(int[] a, int b)
{
    int x = f(b);
    int y = g(b);
    a[mightUpdate(a[x])] = mightUpdate(a[y]);
}

Now, depending on b, f(), and g(), you might have the same as above, or you might not.  Should this be illegal, or allowed?

Dave

December 24, 2013
On Tuesday, 24 December 2013 at 07:08:49 UTC, David Held wrote:
> Ok, let's make it more interesting...

The compiler is only supposed to error out on stuff that is actually illegal according to the language.

What you are doing is not *illegal*, it just produces un-specified behavior. The compiler has no obligation to error out on it, and may not even be *allowed* to. At best, it might be able to detect something wrong, and be allowed to warn you about it.

AFAIK, Walter is not exactly for documenting such behavior, but just for enforcing that all compilers adhere to the same behvior. EG: the "wrong" behavior will be the same on all platforms. This way, if your code works on a machine, it *should* work just fine on another.
December 24, 2013
On 12/24/2013 02:37 PM, monarch_dodra wrote:
> On Tuesday, 24 December 2013 at 07:08:49 UTC, David Held wrote:
>> Ok, let's make it more interesting...
>
> The compiler is only supposed to error out on stuff that is actually
> illegal according to the language.
>
> What you are doing is not *illegal*, it just produces un-specified
> behavior.  ...

The behaviour is specified modulo evaluation order. An implementation will just pick some evaluation order, it is not allowed to produce arbitrary behaviour. In any case, it is a given that strict left-to-right evaluation order will eventually be required, so I don't think that it makes a lot of sense to discuss whether or not to error out.
December 24, 2013
On 12/24/2013 01:48 AM, H. S. Teoh wrote:
> Agreed. Note that introducing assignment into the mix may not help
> matters, but complicate them even more. For example:
>
> 	int x=1, y=0;
> 	writeln((y = ++x) + (++y--) * (x=y)); // what does this print?
> ...

'y-- is not an lvalue'.

Assuming we add parens as follows:

    int x=1, y=0;
    writeln((y = ++x) + ((++y)--) * (x=y));

We should get 8 according to strict left to right evaluation rules. Furthermore, we should have the value 2 stored in both x and y.


> Or worse yet:
>
> 	int func1(int x) { ... }
> 	int func2(int x) { ... }
> 	int func3(int x) { ... }
>
> 	int function(int)[] funcptrs = [ &func1, &func2, &func3 ];
>
> 	int x=0, y=0;
> 	y += funcptrs[++x--](y = --x++); // what does this do?
>

'x-- is not an lvalue'
'x++ is not an lvalue'

Assuming we run the following code instead:

    int func1(int x) { return 1*x; }
    int func2(int x) { return 2*x; }
    int func3(int x) { return 3*x; }

    int function(int)[] funcptrs = [ &func1, &func2, &func3 ];

    int x=0, y=0;
    y += funcptrs[(++x)--](y = (--x)++)

We should be left in a state where x contains 0 and y contains -2 according to strict left to right evaluation rules.

> The only place I can see where you'd even*want*  to write code like this
> is in an entry to the IOCCC. Make it refuse to compile, I say.

I don't think it makes sense to add some arbitrary rules to the compiler implementation just to ban some code that nobody writes anyway and potentially some perfectly fine code as well.
December 24, 2013
On Tue, Dec 24, 2013 at 05:15:15PM +0100, Timon Gehr wrote:
> On 12/24/2013 01:48 AM, H. S. Teoh wrote:
> >Agreed. Note that introducing assignment into the mix may not help matters, but complicate them even more. For example:
> >
> >	int x=1, y=0;
> >	writeln((y = ++x) + (++y--) * (x=y)); // what does this print?
> >...
> 
> 'y-- is not an lvalue'.

D'oh, should've tested my code before posting it. Next time I'm gonna write all code snippets as unittests... :-P  (OK, kidding.)


> Assuming we add parens as follows:
> 
>     int x=1, y=0;
>     writeln((y = ++x) + ((++y)--) * (x=y));
> 
> We should get 8 according to strict left to right evaluation rules. Furthermore, we should have the value 2 stored in both x and y.

What is "strict left to right evaluation", since * has to be evaluated before +? Or you consider all side-effects to be evaluated first (from left to right), and then the expression?


> >Or worse yet:
> >
> >	int func1(int x) { ... }
> >	int func2(int x) { ... }
> >	int func3(int x) { ... }
> >
> >	int function(int)[] funcptrs = [ &func1, &func2, &func3 ];
> >
> >	int x=0, y=0;
> >	y += funcptrs[++x--](y = --x++); // what does this do?
> >
> 
> 'x-- is not an lvalue'
> 'x++ is not an lvalue'
> 
> Assuming we run the following code instead:
> 
>     int func1(int x) { return 1*x; }
>     int func2(int x) { return 2*x; }
>     int func3(int x) { return 3*x; }
> 
>     int function(int)[] funcptrs = [ &func1, &func2, &func3 ];
> 
>     int x=0, y=0;
>     y += funcptrs[(++x)--](y = (--x)++)
> 
> We should be left in a state where x contains 0 and y contains -2 according to strict left to right evaluation rules.
> 
> >The only place I can see where you'd even*want*  to write code like this is in an entry to the IOCCC. Make it refuse to compile, I say.
> 
> I don't think it makes sense to add some arbitrary rules to the compiler implementation just to ban some code that nobody writes anyway and potentially some perfectly fine code as well.

A potential issue is that forcing a particular evaluation order may be detrimental to the optimizer, e.g., according to strict left-to-right evaluation (if I understand you correctly), then to evaluate this:

	z = (y = ++x) + ((++y)--) * (x=y)

you'd have to compile it into the equivalent of:

	y = ++x;	// i.e., ++x; y = x;
	tmp1 = y;
	tmp2 = ++y;	// i.e., ++y; tmp2 = y;
	y--;
	x=y;
	z = tmp1 + tmp2 * x; // i.e., tmp3 = tmp2*x; tmp3 += tmp1; z = tmp3;

Since the order can't be permuted without changing the semantics, the codegen would have to load x and y multiple times as well as introduce temporaries in order to evaluate the final expression, and the optimizer wouldn't be able to do anything about it.

Of course, this is probably not *that* big of a deal if normal expressions without side-effects can be optimized as usual -- it'd be the user's own fault for writing unoptimizable code. Code with side-effects are already difficult to optimize anyway, so it probably doesn't make that big of a difference.


T

-- 
"No, John.  I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
December 24, 2013
On 12/24/2013 06:46 PM, H. S. Teoh wrote:
> ...
>
> What is "strict left to right evaluation", since * has to be evaluated
> before +? Or you consider all side-effects to be evaluated first (from
> left to right), and then the expression?
>

It is about the order operands are evaluated in.

Eg. if you have a binary expression, first evaluate the left operand and then the right operand (and finally compute the result).

>...
>
> A potential issue is that forcing a particular evaluation order may be
> detrimental to the optimizer, e.g., according to strict left-to-right
> evaluation (if I understand you correctly), then to evaluate this:
>
> 	z = (y = ++x) + ((++y)--) * (x=y)
>
> you'd have to compile it into the equivalent of:
>
> 	y = ++x;	// i.e., ++x; y = x;
> 	tmp1 = y;
> 	tmp2 = ++y;	// i.e., ++y; tmp2 = y;
> 	y--;
> 	x=y;
> 	z = tmp1 + tmp2 * x; // i.e., tmp3 = tmp2*x; tmp3 += tmp1; z = tmp3;
>
> Since the order can't be permuted without changing the semantics, the
> codegen would have to load x and y multiple times as well as introduce
> temporaries in order to evaluate the final expression, and the optimizer
> wouldn't be able to do anything about it.
>...

Yes, but in this case we wouldn't want the optimizer to do anything other than this. The case where unspecified order helps is when the optimizer cannot prove that two evaluation orders yield the same result (even though it is the case), in which case it can freely choose the more performant of the two. I don't think this is the right trade-off for D.

> Of course, this is probably not *that* big of a deal if normal
> expressions without side-effects can be optimized as usual -- it'd be
> the user's own fault for writing unoptimizable code. Code with
> side-effects are already difficult to optimize anyway, so it probably
> doesn't make that big of a difference.
>
>
> T
>

1 2
Next ›   Last »