August 05, 2014
On Tuesday, 5 August 2014 at 18:35:32 UTC, Walter Bright wrote:
> (limited connectivity for me)
>
> For some perspective: recently, gcc and clang have introduced optimizations based on undefined behavior in C/C++. Modern optimizers interpret undefined behavior as "these cases will never happen". This has wound up breaking a significant amount of existing code. There have been a number of articles about these breakages, with detailed explanations of how they come about and of the new, more correct way to write code.
>
> The emerging consensus is that the code breakage is worth it for the performance gains.
>
> That said, I do hear what people are saying about potential code breakage and agree that we need to address this properly.

Well, then at least we agree that some kind of tradeoff is being made here if the definition of assert is changed: a potential performance gain versus a pile of negatives.

In my estimation, the upside is small and the tradeoff comes nowhere close to being justified. If performance is a top goal, there are many other things that could be done with smaller (or zero) downsides. To give just one example, adding a forceinline attribute would be a great help to anyone trying to micro-optimize their code.
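To illustrate (a hypothetical sketch; D has no such attribute today, so the spelling below is invented):

    // Hypothetical: a placeholder UDA standing in for a real,
    // compiler-recognized forceinline attribute, which D lacks.
    enum forceinline;

    @forceinline int squaredDistance(int x1, int y1, int x2, int y2)
    {
        immutable dx = x2 - x1;
        immutable dy = y2 - y1;
        // A small, hot helper like this is exactly what one would want
        // guaranteed inlining for; the attribute would not change semantics.
        return dx * dx + dy * dy;
    }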

And of course, adding this as a new function instead of redefining an existing one would eliminate the code breakage and the C compatibility issues. The undefined behavior issue would remain, but at least defining assume as @system would alleviate the @safe issue somewhat (there could still be leaks due to inlining) and make it clearer to users that it's a dangerous feature. It would also make it more searchable for code-auditing purposes.
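For concreteness, a minimal sketch of what such a separate function might look like (the name, signature, and lowering are my assumptions, not an actual druntime API):

    // Hypothetical sketch, not an actual API: assume() as a distinct
    // @system primitive, leaving assert() untouched.
    void assume(bool condition) @system pure nothrow @nogc
    {
        version (assert)
            assert(condition); // checked builds: behaves exactly like assert
        // In -release builds, a real implementation would lower to an
        // optimizer intrinsic (e.g. llvm.assume in LDC) that marks the
        // !condition path unreachable; a false condition is therefore
        // undefined behavior, which is why the function is @system.
    }

Being an ordinary symbol, it would also be trivial to grep for during an audit, unlike a redefined assert.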

Anyway, if I have at least made you and others aware of all the downsides, and of the ways this proposal contradicts D's stated goals, then I guess I have done my part on this issue.
August 05, 2014
On Tuesday, 5 August 2014 at 20:50:06 UTC, Jeremy Powers via Digitalmars-d wrote:
>>
>>
>>> Well, yes: Undefined behaviour in the sense
>>>
>> "And there will be no injection of undefined behaviour
>>                                    ^~~~~~~~~~~~~~~~~~~
>>                                    conventional sense
>>
>>
>>  - the undefined behaviour is already there if the asserted constraints
>>        ^~~~~~~~~~~~~~~~~~~
>>        altered sense
>>
>> are not valid."
>>
>
>
> I still don't quite see your point.  Perhaps I should have said:  In the
> case where an asserted constraint is not met, the program is invalid.
>  Being invalid, it has undefined behaviour if it continues.
>
> From another:
>
>> There is a difference between invalid and undefined: A program is invalid
>> ("buggy"), if it doesn't do what it's programmer intended, while
>> "undefined" is a matter of the language specification. The (wrong)
>> behaviour of an invalid program need not be undefined, and often isn't in
>> practice.
>>
>
> I disagree with this characterization.  Something can be buggy, not doing
> what the programmer intended, while still being a perfectly valid program.  You
> can write wrong logic that is valid/reasonable in the context of the program.
>
> Invalid in this case means the programmer has explicitly stated certain
> constraints must hold, and such constraints do not hold.  So if you
> continue with the program in the face of invalid constraints, you have no
> guarantee what will happen - this is what I mean by 'undefined'.

You're using a nonstandard definition of undefined behavior. Undefined behavior has a precise meaning, that's why Timon linked the wiki article for you.

The regular definition of assert does not involve any undefined behavior, only the newly proposed one.
August 05, 2014
>
> You're using a nonstandard definition of undefined behavior. Undefined behavior has a precise meaning, that's why Timon linked the wiki article for you.
>
> The regular definition of assert does not involve any undefined behavior, only the newly proposed one.
>

But the 'newly proposed one' is the definition that I have been using all along.  The 'regular' definition of assert that you claim is what I see as the redefinition - it is a definition based on the particular implementation of assert in other languages, not on the conceptual idea of assert as I understand it (and as it appears to be intended in D).

This appears to be the root of the argument, and has been circled repeatedly... it's not my intent to restart another round of discussion on that well-traveled ground; I just wanted to state my support for the definition as I understand it.


August 05, 2014
On Tuesday, 5 August 2014 at 18:19:00 UTC, Jeremy Powers via Digitalmars-d wrote:
> This has already been stated by others, but I just wanted to pile on - I
> agree with Walter's definition of assert.
>
> 2. Semantic change.
>> The proposal changes the meaning of assert(), which will result in
>> breaking existing code.  Regardless of philosophizing about whether or not
>> the code was "already broken" according to some definition of assert, the
>> fact is that shipping programs that worked perfectly well before may no
>> longer work after this change.
>
>
>
> Disagree.
> Assert (as I always understood it) means 'this must be true, or my program
> is broken.'  In -release builds the explicit explosion on a triggered
> assert is skipped, but any code after a non-true assert is, by definition,
> broken.  And by broken I mean the fundamental constraints of the program
> are violated, and so all bets are off on it working properly.
> A shipping program that 'worked perfectly well' but has unfulfilled asserts
> is broken - either the asserts are not actually true constraints, or the
> broken path just hasn't been hit yet.

This is the 'already broken' argument, which I mentioned in the quote above.

This kind of change could never be made in C or C++; too much legacy code depends on the old behavior. Perhaps D can still afford the breakage because it is a young language. That is a strength of young languages. If you believe the upside justifies the breakage in this case, by all means, just say so and accept the consequences. Don't try to escape responsibility by retroactively redefining previously working code as broken :)

>
> Looking at the 'breaking' example:
>
> assert(x!=1);
> if (x==1) {
>  ...
> }
>
> If the if is optimized out, this will change from existing behaviour.  But
> it is also obviously (to me at least) broken code already.  The assert says
> that x cannot be 1 at this point in the program, if it ever is then there
> is an error in the program.... and then it continues as if the program were
> still valid.  If x could be one, then the assert is invalid here.  And this
> code will already behave differently between -release and non-release
> builds, which is another kind of broken.

Not everything that breaks will be as obvious as that. Things can get much hairier when assumptions propagate across multiple levels of inlining.

Also, some code purposely uses that pattern. It is (or rather, was) valid for a different use case of assert.
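For example, the defensive-programming use of that pattern looks roughly like this (an illustrative sketch):

    void process(int x)
    {
        assert(x != 1); // debug builds: fail fast so the bug gets noticed
        if (x == 1)     // release builds: recover instead of misbehaving
        {
            x = 0;      // fall back to a known-safe value
        }
        // ... continue with x ...
    }

If assert is allowed to feed the optimizer, that recovery branch is precisely the code that gets deleted.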

>
> 3a. An alternate statement of the proposal is literally "in release mode,
>> assert expressions introduce undefined behavior into your code if the
>> expression is false".
>>
>
> This statement seems fundamentally true to me of asserts already,
> regardless of whether they are used for optimizations.  If your assert
> fails, and you have turned off 'blow up on assert' then your program is in
> an undefined state.  It is not that the assert introduces the undefined
> behaviour, it is that the assert makes plain an expectation of the code and
> if that expectation is false the code will have undefined behaviour.
>

This is not the standard definition of undefined behavior.

With regular assert, execution remains well defined. If you want to know what happens in release mode when the assert condition is not satisfied, all you need to do is read the source code.

With assume, if the condition is not satisfied, there is no way to know what will happen: _anything_ can happen, up to and including formatting your hard drive. That's true undefined behavior.
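To make the operational difference concrete (a sketch; assume() here is the hypothetical primitive under discussion, not an existing function):

    void f(int x)
    {
        assert(x > 0); // debug: throws AssertError if false;
                       // -release: compiles to nothing, execution continues
        // Either way, what happens next is fully determined by the source.
    }

    void g(int x)
    {
        assume(x > 0); // hypothetical: the optimizer may now rewrite the
                       // surrounding code under the premise that x > 0 holds
        // If x <= 0 at runtime, g's behavior is undefined: the compiled
        // code may bear no resemblance to the source.
    }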

>
> 3b. Since assert is such a widely used feature (with the original
>> semantics, "more asserts never hurt"), the proposal will inject a massive
>> amount of undefined behavior into existing code bases, greatly increasing
>> the probability of experiencing problems related to undefined behavior.
>>
>
> I actually disagree with the 'more asserts never hurt' statement.  Exactly
> because asserts get compiled out in release builds, I do not find them very
> useful/desirable.  If they worked as optimization hints I might actually
> use them more.
>
> And there will be no injection of undefined behaviour - the undefined
> behaviour is already there if the asserted constraints are not valid.

This uses your own definition of UB again; it isn't true under the regular definition.

>  Maybe if the yea side were consulted, they might easily agree to an
>> alternative way of achieving the improved optimization goal, such as
>> creating a new function that has the proposed semantics.
>>
>
> Prior to this (incredibly long) discussion, I was not aware people had a
> different interpretation of assert.  To me, this 'new semantics' is
> precisely what I always thought assert was, and the proposal is just
> leveraging it for some additional optimizations.  So from my standpoint,
> adding a new function would make sense to support this 'existing' behaviour
> that others seem to rely on - assert is fine as is, if the definition of
> 'is' is what I think it is.

That doesn't mean you don't have buggy asserts in your own code, or buggy asserts in libraries you use (or in the standard library), so you are not immune to the undefined behavior.
August 05, 2014
> But the 'newly proposed one' is the definition that I have been using all
> along.

+1. Until this came up, I didn't know another definition existed.

> The 'regular' definition of assert that you claim is what I see as
> the redefinition - it is a definition based on the particular
> implementation of assert in other languages, not on the conceptual idea of
> assert as I understand it (and as it appears to be intended in D).

In my view, it's also a redefinition of -release. My view is influenced by Common Lisp. If you want speed, you test your program, and then when you feel comfortable, set the optimization levels to get as much speed as possible. If you want safety and debugging, set the optimization levels accordingly. I was always under the impression that -release was a statement to the compiler that "I've tested the program, make it run as fast as possible, and let me worry about any remaining bugs."
August 05, 2014
On Tuesday, 5 August 2014 at 22:25:59 UTC, Jeremy Powers via Digitalmars-d wrote:
>>
>> You're using a nonstandard definition of undefined behavior. Undefined
>> behavior has a precise meaning, that's why Timon linked the wiki article
>> for you.
>>
>> The regular definition of assert does not involve any undefined behavior,
>> only the newly proposed one.
>>
>
> But the 'newly proposed one' is the definition that I have been using all
> along.

OK, but my point was you were using a different definition of undefined behavior. We can't communicate if we aren't using the same meanings of words.

> The 'regular' definition of assert that you claim is what I see as
> the redefinition - it is a definition based on the particular
> implementation of assert in other languages, not on the conceptual idea of
> assert as I understand it (and as it appears to be intended in D).

The 'regular' definition of assert has been used in C, in C++, and (AFAIK) for more than 10 years in D. If you want to change it, you need a good justification. I'm not saying such a justification doesn't exist; maybe it does, but I have not seen it.

>
> This appears to be the root of the argument, and has been circled
> repeatedly... it's not my intent to restart another round of discussion on
> that well-traveled ground; I just wanted to state my support for the
> definition as I understand it.

I disagree. I don't think the fact that some people already had the new definition in mind before is really all that relevant. That's in the past. This is all about the pros and cons of changing it now and for the future.
August 06, 2014
On 8/3/2014 7:31 PM, John Carter wrote:
> Compiler users always blame the optimizer long before they blame their crappy code.
>
> Watching the gcc mailing list over the years, those guys bend over backwards to
> prevent that happening.
>
> But since an optimization has to be based on additional hard information, they
> have, with every new version of gcc, used that information both for warnings and
> optimization.

Recent optimization improvements in gcc and clang have also broken existing code that has worked fine for decades.

In particular, overflow checks often now get optimized out, because the check relied on what is, pedantically, undefined behavior.

This is why D has added the core.checkedint module, to have overflow checks that are guaranteed to work.
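For example (a sketch against core.checkedint; adds() sets its flag when signed addition overflows):

    import core.checkedint : adds;

    // The classic C check `if (a + b < a)` relies on signed overflow,
    // which is undefined behavior in C, so optimizers may delete it.
    // adds() instead reports the overflow through a flag the optimizer
    // must preserve.
    int addOrSaturate(int a, int b)
    {
        bool overflow = false;
        immutable sum = adds(a, b, overflow);
        return overflow ? int.max : sum;
    }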

Another optimization that has broken existing code is the removal of dead assignments. This has broken crypto code that would overwrite passwords after using them. It's also why D now has volatileStore() and volatileLoad(), if only someone would pull them.
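The motivating pattern looks roughly like this (a sketch; since the functions had not been merged at the time, the import location is a guess):

    import core.bitop : volatileStore; // assumed location, not yet merged

    void scrubPassword(ubyte[] buf)
    {
        // A plain `buf[] = 0;` is a dead store the optimizer may remove,
        // because buf is never read again. volatileStore forces each
        // write to actually happen.
        foreach (ref b; buf)
            volatileStore(&b, 0);
    }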

I.e. silent breakage of existing, working code is hardly unknown in the C/C++ world.

August 06, 2014
On 8/3/2014 7:26 PM, Tove wrote:
> It is possible, just not as a default enabled warning.
>
> Some compilers offer optimization diagnostics that can be enabled by a
> switch. I'm quite fond of those, as going through a list of
> compiler-highlighted failed/successful optimizations is much faster than
> checking the asm output after every new compiler version or minor code
> refactoring.
>
> In my experience, it actually works fine in huge projects: even if there
> are false positives, you can analyse what changed from the previous
> version, and ignore modules which you know are not performance critical.

If you build dmd in debug mode and then run it with -O --c, it will give you a list of all the data-flow transformations it performs.

But the list is a blizzard of output on non-trivial programs.

August 06, 2014
On 8/3/2014 4:24 PM, Martin Krejcirik wrote:
> Couldn't this new assert behaviour be introduced as a new optimization
> switch? Say -Oassert? It would be off by default and would work both
> in debug and release mode.

It could, but every one of these:

1. doubles the time it takes to test dmd; it doesn't take many of these to render dmd untestable

2. adds confusion to most programmers as to what switch does what

3. adds complexity, i.e. bugs

4. interactions between optimization switches often exhibit emergent behavior, which is extremely hard to test for

August 06, 2014
On 8/3/2014 4:51 PM, Mike Farnsworth wrote:
> This all seems to have a very simple solution, to use something like: expect()

I see code coming that looks like:

   expect(x > 2);  // must be true
   assert(x > 2);  // check that it is true

All I can think of is, shoot me now :-)