August 05, 2014
This has already been stated by others, but I just wanted to pile on - I agree with Walter's definition of assert.

> 2. Semantic change.
> The proposal changes the meaning of assert(), which will result in breaking existing code.  Regardless of philosophizing about whether or not the code was "already broken" according to some definition of assert, the fact is that shipping programs that worked perfectly well before may no longer work after this change.



Disagree.
Assert (as I always understood it) means 'this must be true, or my program
is broken.'  In -release builds the explicit explosion on a triggered
assert is skipped, but any code after a non-true assert is, by definition,
broken.  And by broken I mean the fundamental constraints of the program
are violated, and so all bets are off on it working properly.
A shipping program that 'worked perfectly well' but has unfulfilled asserts
is broken - either the asserts are not actually true constraints, or the
broken path just hasn't been hit yet.

Looking at the 'breaking' example:

assert(x != 1);
if (x == 1) {
    ...
}

If the if is optimized out, this will change from existing behaviour.  But it is also obviously (to me at least) broken code already.  The assert says that x cannot be 1 at this point in the program; if it ever is, then there is an error in the program... and yet it continues as if the program were still valid.  If x could be 1, then the assert is invalid here.  And this code will already behave differently between -release and non-release builds, which is another kind of broken.
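
To make the difference concrete, here is a minimal, self-contained sketch (my own example, not from the proposal) showing how such code already diverges between build modes:

    import std.stdio;

    void check(int x)
    {
        assert(x != 1);  // non-release build: aborts here when x == 1
        if (x == 1)
        {
            // Dead code if the assert is a true constraint; under the
            // proposal the optimizer may delete this branch in -release.
            writeln("x was 1 after all");
        }
    }

    void main()
    {
        check(1);  // debug build: AssertError; -release: prints the message,
                   // or possibly nothing if the branch is optimized away
    }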


> 3a. An alternate statement of the proposal is literally "in release mode, assert expressions introduce undefined behavior into your code if the expression is false".

This statement seems fundamentally true to me of asserts already, regardless of whether they are used for optimizations.  If your assert fails, and you have turned off 'blow up on assert' then your program is in an undefined state.  It is not that the assert introduces the undefined behaviour, it is that the assert makes plain an expectation of the code and if that expectation is false the code will have undefined behaviour.



> 3b. Since assert is such a widely used feature (with the original semantics, "more asserts never hurt"), the proposal will inject a massive amount of undefined behavior into existing code bases, greatly increasing the probability of experiencing problems related to undefined behavior.

I actually disagree with the 'more asserts never hurt' statement.  Exactly because asserts get compiled out in release builds, I do not find them very useful/desirable.  If they worked as optimization hints I might actually use them more.

And there will be no injection of undefined behaviour - the undefined behaviour is already there if the asserted constraints are not valid.


> Maybe if the yea side was consulted, they might easily agree to an alternative way of achieving the improved optimization goal, such as creating a new function that has the proposed semantics.

Prior to this (incredibly long) discussion, I was not aware people had a different interpretation of assert.  To me, this 'new semantics' is precisely what I always thought assert was, and the proposal is just leveraging it for some additional optimizations.  So from my standpoint, adding a new function would make sense to support this 'existing' behaviour that others seem to rely on - assert is fine as is, if the definition of 'is' is what I think it is.


August 05, 2014
(limited connectivity for me)

For some perspective, recently gcc and clang have introduced optimizations based on undefined behavior in C/C++. The undefined behavior has been interpreted by modern optimizers as "these cases will never happen". This has wound up breaking a significant amount of existing code. There have been a number of articles about these, with detailed explanations about how they come about and the new, more correct, way to write code.

The emerging consensus is that the code breakage is worth it for the performance gains. That said, I do hear what people are saying about potential code breakage and agree that we need to address this properly.
August 05, 2014
On 08/05/2014 08:18 PM, Jeremy Powers via Digitalmars-d wrote:
>
> And there will be no injection of undefined behaviour - the undefined
> behaviour is already there if the asserted constraints are not valid.

Well, no. http://en.wikipedia.org/wiki/Undefined_behavior
August 05, 2014
On Tue, Aug 05, 2014 at 11:18:46AM -0700, Jeremy Powers via Digitalmars-d wrote:
> This has already been stated by others, but I just wanted to pile on - I agree with Walter's definition of assert.
> 
> 2. Semantic change.
> > The proposal changes the meaning of assert(), which will result in breaking existing code.  Regardless of philosophizing about whether or not the code was "already broken" according to some definition of assert, the fact is that shipping programs that worked perfectly well before may no longer work after this change.
> 
> Disagree.
> Assert (as I always understood it) means 'this must be true, or my
> program is broken.'  In -release builds the explicit explosion on a
> triggered assert is skipped, but any code after a non-true assert is,
> by definition, broken.  And by broken I mean the fundamental
> constraints of the program are violated, and so all bets are off on it
> working properly.  A shipping program that 'worked perfectly well' but
> has unfulfilled asserts is broken - either the asserts are not
> actually true constraints, or the broken path just hasn't been hit
> yet.

Exactly. I think part of the problem is that people have been using assert with the wrong meaning. In my mind, 'assert(x)' doesn't mean "abort if x is false in debug mode but silently ignore in release mode", as some people apparently think it means. To me, it means "at this point in the program, x is true".  It's that simple.

Now if it turns out that x actually *isn't* true, then you have a contradiction in your program logic, and therefore, by definition, your program is invalid, which means any subsequent behaviour is undefined. If you start with an axiomatic system where the axioms contain a contradiction, then any results you derive from the system will be meaningless, since a contradiction vacuously proves everything. Similarly, any program behaviour that follows a false assertion is undefined, because one of the "axioms" (i.e., assertions) introduces a contradiction to the program logic.


> Looking at the 'breaking' example:
> 
> assert(x != 1);
> if (x == 1) {
>     ...
> }
> 
> If the if is optimized out, this will change from existing behaviour. But it is also obviously (to me at least) broken code already.  The assert says that x cannot be 1 at this point in the program; if it ever is, then there is an error in the program... and yet it continues as if the program were still valid.  If x could be 1, then the assert is invalid here.  And this code will already behave differently between -release and non-release builds, which is another kind of broken.

Which is what Walter has been saying: the code is *already* broken, and is invalid by definition, so it makes no difference what the optimizer does or doesn't do. If your program has an array overrun bug that writes garbage to an unrelated variable, then you can't blame the optimizer for producing a program where the unrelated variable acquires a different garbage value from before. The optimizer only guarantees (in theory) consistent program behaviour if the program is valid to begin with. If the program is invalid, all bets are off as to what its "optimized" version does.


> > 3a. An alternate statement of the proposal is literally "in release mode, assert expressions introduce undefined behavior into your code if the expression is false".
> 
> This statement seems fundamentally true to me of asserts already, regardless of whether they are used for optimizations.  If your assert fails, and you have turned off 'blow up on assert' then your program is in an undefined state.  It is not that the assert introduces the undefined behaviour, it is that the assert makes plain an expectation of the code and if that expectation is false the code will have undefined behaviour.

I agree.


> > 3b. Since assert is such a widely used feature (with the original semantics, "more asserts never hurt"), the proposal will inject a massive amount of undefined behavior into existing code bases, greatly increasing the probability of experiencing problems related to undefined behavior.
> >
> 
> I actually disagree with the 'more asserts never hurt' statement. Exactly because asserts get compiled out in release builds, I do not find them very useful/desirable.  If they worked as optimization hints I might actually use them more.
> 
> And there will be no injection of undefined behaviour - the undefined behaviour is already there if the asserted constraints are not valid.

And if people are using asserts in ways that differ from what they are intended to be (expressions that must be true if the program logic has been correctly implemented), then their programs are already invalid by definition. Why should it be the compiler's responsibility to guarantee consistent behaviour of invalid code?


> > Maybe if the yea side was consulted, they might easily agree to an alternative way of achieving the improved optimization goal, such as creating a new function that has the proposed semantics.
> >
> 
> Prior to this (incredibly long) discussion, I was not aware people had a different interpretation of assert.  To me, this 'new semantics' is precisely what I always thought assert was, and the proposal is just leveraging it for some additional optimizations.  So from my standpoint, adding a new function would make sense to support this 'existing' behaviour that others seem to rely on - assert is fine as is, if the definition of 'is' is what I think it is.

Yes, the people using assert as a kind of "check in debug mode but ignore in release mode" should really be using something else instead, since that's not what assert means. I'm honestly astounded that people would actually use assert as some kind of non-release-mode-check instead of the statement of truth that it was meant to be.

Furthermore, I think Walter's idea to use asserts as a source of optimizer hints is a very powerful concept that may turn out to be a revolutionary feature in D. It could very well develop into the answer to my long search for a way of declaring identities in user-defined types that allow high-level optimizations by the optimizer, thus allowing user-defined types to be on par with built-in types in optimizability. Currently, the compiler is able to optimize x+x+x+x into 4*x if x is an int, for example, but it can't if x is a user-defined type (e.g. BigInt), because it can't know if opBinary was defined in a way that obeys this identity. But if we can assert that this holds for the user-defined type, e.g., BigInt, then the compiler can make use of that axiom to perform such an optimization.  This would then allow code to be written in more human-readable forms, and still maintain optimal performance, even where user-defined types are involved.

While manually-written code generally doesn't need this kind of optimization (instead of writing x+x+x+x, just write 4*x to begin with), this becomes an important issue with generic code and metaprogramming. The generic version of the code may very well be w+x+y+z, which cannot be reduced to n*x, so when you instantiate that code for the case where w==x==y==z, you have to pay the penalty of genericity. But we can eliminate this cost if we can tell the compiler that when w==x==y==z, then w+x+y+z == 4*x. Then we don't have to separately implement this special case in order to achieve optimal performance, but we will be able to continue using the generic, more maintainable code.
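
To make the generic case concrete, here is a minimal sketch using std.bigint (the function name sum4 is mine, purely for illustration):

    import std.bigint : BigInt;

    // Generic sum: for int, the optimizer is free to rewrite w+x+y+z as
    // 4*w when the arguments are known to be equal; for BigInt it cannot,
    // because it has no way to know that opBinary!"+" obeys that identity.
    T sum4(T)(T w, T x, T y, T z)
    {
        return w + x + y + z;
    }

    void main()
    {
        auto a = BigInt(42);
        auto r1 = sum4(3, 3, 3, 3);  // built-in type: identity is known
        auto r2 = sum4(a, a, a, a);  // user-defined type: three full additions
    }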

Something like this will require much more development of Walter's core concept than currently proposed, of course, but the current proposal is an important step in this direction, and I fully support it.


T

-- 
There are 10 kinds of people in the world: those who can count in binary, and those who can't.
August 05, 2014
>> And there will be no injection of undefined behaviour - the undefined
>> behaviour is already there if the asserted constraints are not valid.
>
> Well, no. http://en.wikipedia.org/wiki/Undefined_behavior

Well, yes: Undefined behaviour in the sense that the writer of the program has not defined it.

A program is written with certain assumptions about the state at certain points.  An assert can be used to explicitly state those assumptions, and halt the program (in non-release) if the assumptions are invalid.  If the state does not match what the assert assumes it to be, then any code relying on that state is invalid, and what it does has no definition given by the programmer.

(And here I've circled back to assert==assume... all because I assume what
assert means)

If the state that is being checked could ever actually be valid, then it is not a job for an assert - use some other validation.


August 05, 2014
On Tue, Aug 05, 2014 at 11:35:14AM -0700, Walter Bright via Digitalmars-d wrote:
> (limited connectivity for me)
> 
> For some perspective, recently gcc and clang have introduced optimizations based on undefined behavior in C/C++. The undefined behavior has been interpreted by modern optimizers as "these cases will never happen". This has wound up breaking a significant amount of existing code. There have been a number of articles about these, with detailed explanations about how they come about and the new, more correct, way to write code.

And I'd like to emphasize that code *should* have been written in this new, more correct way in the first place. Yes, it's a pain to have to update legacy code, but where would progress be if we were continually hampered by the fear of breaking what was *already* broken to begin with?


> The emerging consensus is that the code breakage is worth it for the performance gains. That said, I do hear what people are saying about potential code breakage and agree that we need to address this properly.

The way I see it, we need to educate D users to use 'assert' with the proper meaning, and to replace all other usages with alternatives (perhaps a Phobos function that does what they want without the full implications of assert -- i.e., "breaking" behaviour like influencing the optimizer, etc.). Once reasonable notice and time has been given, I'm all for introducing optimizer hinting with asserts. I think in the long run, this will turn out to be an important, revolutionary development not just in D, but in programming languages in general.
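
As a rough illustration, such a helper might look like this (a sketch only: the name debugCheck is hypothetical, and it is gated on -debug rather than -release for simplicity):

    // Hypothetical helper: purely a debug-mode check, with none of the
    // optimizer implications assert would carry under the proposal.
    // The lazy parameter ensures the condition is not even evaluated
    // when the debug block is compiled out.
    void debugCheck(lazy bool condition, string msg = "debug check failed",
                    string file = __FILE__, size_t line = __LINE__)
    {
        debug
        {
            if (!condition)
                throw new Error(msg, file, line);
        }
    }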


T

-- 
Ignorance is bliss... until you suffer the consequences!
August 05, 2014
> Furthermore, I think Walter's idea to use asserts as a source of
> optimizer hints is a very powerful concept that may turn out to be a
> revolutionary feature in D. It could very well develop into the answer
> to my long search for a way of declaring identities in user-defined
> types that allow high-level optimizations by the optimizer, thus
> allowing user-defined types to be on par with built-in types in
> optimizability.

The answer to your search is "term rewriting macros (with side-effect and alias analysis)" as introduced by Nimrod. Watch my talk. ;-)

'assume' is not nearly powerful enough for this and in no way "revolutionary".
August 05, 2014
On Tuesday, 5 August 2014 at 19:14:57 UTC, H. S. Teoh via Digitalmars-d wrote:
>T
>
>--
>Ignorance is bliss... until you suffer the consequences!

(sic!)
August 05, 2014
On 8/5/14, 3:55 PM, H. S. Teoh via Digitalmars-d wrote:
> On Tue, Aug 05, 2014 at 11:18:46AM -0700, Jeremy Powers via Digitalmars-d wrote:

> Furthermore, I think Walter's idea to use asserts as a source of
> optimizer hints is a very powerful concept that may turn out to be a
> revolutionary feature in D.

LLVM already has it. It's not revolutionary:

http://llvm.org/docs/LangRef.html#llvm-assume-intrinsic

By the way, I think Walter said "assert can potentially be used to make optimizations", not "Oh, I just had an idea! We could use assert to optimize code". I think the code already does this. Of course, we would have to look at the source code to find out...

By the way, most of the time on this list I hear "we could use this and that feature to allow better optimizations", and yet no optimizations are ever implemented. Look at all those pure @safe nothrow const annotations you have to write, and yet you don't get any performance boost from them.
August 05, 2014
On Tuesday, 5 August 2014 at 18:57:40 UTC, H. S. Teoh via Digitalmars-d wrote:
> Exactly. I think part of the problem is that people have been using
> assert with the wrong meaning. In my mind, 'assert(x)' doesn't mean
> "abort if x is false in debug mode but silently ignore in release mode",
> as some people apparently think it means. To me, it means "at this point
> in the program, x is true".  It's that simple.

A language construct with such a meaning is useless as a safety feature. If I first have to prove that the condition is true before I can safely use an assert, I don't need the assert anymore, because I've already proved it. If it is intended to be an optimization hint, it should be implemented as a pragma, not as a prominent feature meant to be widely used. (But I see that you have a different use case, see my comment below.)

> The optimizer only guarantees (in theory)
> consistent program behaviour if the program is valid to begin with. If
> the program is invalid, all bets are off as to what its "optimized"
> version does.

There is a difference between invalid and undefined: A program is invalid ("buggy") if it doesn't do what its programmer intended, while "undefined" is a matter of the language specification. The (wrong) behaviour of an invalid program need not be undefined, and often isn't in practice.

An optimizer may only transform code in a way that keeps the resulting code semantically equivalent. This means that if the original "unoptimized" program is well-defined, the optimized one will be too.

> Yes, the people using assert as a kind of "check in debug mode but
> ignore in release mode" should really be using something else instead,
> since that's not what assert means. I'm honestly astounded that people
> would actually use assert as some kind of non-release-mode-check instead
> of the statement of truth that it was meant to be.

Well, when this "something else" is introduced, it will need to replace almost every existing instance of "assert", as the latter must only be used if it is proven that the condition is always true. To name just one example, it cannot be used in range `front` and `popFront` methods to assert that the range is not empty, unless there is an additional non-assert check directly before it.
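
For instance, the idiom in question looks roughly like this (my own minimal range, not Phobos code):

    struct IntRange
    {
        int[] data;

        @property bool empty() const { return data.length == 0; }

        @property int front()
        {
            // Under the proposed semantics this becomes an optimizer axiom,
            // so it can no longer double as a debug-mode safety check
            // without a separate, always-on validation before it.
            assert(!empty, "front called on an empty range");
            return data[0];
        }

        void popFront()
        {
            assert(!empty, "popFront called on an empty range");
            data = data[1 .. $];
        }
    }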

>
> Furthermore, I think Walter's idea to use asserts as a source of
> optimizer hints is a very powerful concept that may turn out to be a
> revolutionary feature in D. It could very well develop into the answer
> to my long search for a way of declaring identities in user-defined
> types that allow high-level optimizations by the optimizer, thus
> allowing user-defined types to be on par with built-in types in
> optimizability. Currently, the compiler is able to optimize x+x+x+x into
> 4*x if x is an int, for example, but it can't if x is a user-defined
> type (e.g. BigInt), because it can't know if opBinary was defined in a
> way that obeys this identity. But if we can assert that this holds for
> the user-defined type, e.g., BigInt, then the compiler can make use of
> that axiom to perform such an optimization.  This would then allow code
> to be written in more human-readable forms, and still maintain optimal
> performance, even where user-defined types are involved.

This is a very powerful feature indeed, but to be used safely, the compiler needs to be able to detect invalid uses reliably at compile time. This is currently not the case:

    void onlyOddNumbersPlease(int n) {
        assert(n % 2);
    }

    void foo() {
        onlyOddNumbersPlease(42);    // shouldn't compile, but does
    }

It would be great if this were possible. In the example of `front` and `popFront`, programs that call these methods on a range that could theoretically be empty wouldn't compile. This might be useful for optimization, but above that it's useful for verifying correctness.

Unfortunately this is not what has been suggested (nor what was evidently intended from the beginning)...