July 28, 2014
Daniel Murphy:

> Sure, we could add a new construct (ie 'assume') to the language, and use that to pass information.  But are they really different?

Conflating the two concepts with a single name is a mistake that will cause troubles. Different features need different names or different syntax.

Bye,
bearophile
July 28, 2014
"bearophile"  wrote in message news:sfgfebbskhatxqlzthpr@forum.dlang.org...

> Conflating the two concepts with a single name is a mistake that will cause troubles. Different features need different names or different syntax.

Did you read the rest of my post?  Assert already fits this feature, we just don't take advantage of it like we could. 

July 28, 2014
On Monday, 28 July 2014 at 15:52:23 UTC, John Colvin wrote:
>> If asserts were used as optimization constraints
>
> all available code is fair game as optimisation constraints. What you are asking for is a special case for `assert` such that the optimiser is blind to it.

Yes, because they are not to have side-effects. If asserts change behaviour, then they have side-effects.

You add "asserts" as a weak form for verification of partial correctness. That is based on the probability of not repeating the same mistake twice being less than specifying it once implicitly in the executed code. Then you test it with limited coverage of potential inputs.

That means:

1. you don't know if the specified program is correct
2. you don't know if the partial verification (which is weak) is correct
3. you don't know if 2. contradicts 1.

Then you remove the verification test in release. If you then continue to assume that 2. is true then you are potentially worse off than just having 1. You basically introduce contradictions by having a high probability of using facts that are not represented in the specified program (unless you have formally proven the facts to hold, no partial testing can cover this).

> Why should the situation be different if I use the builtin `assert` instead?

No difference. Except with a builtin you can get additional debugger support, so that you don't have to recompile when following a complicated trace that trigger asserts.
July 28, 2014
On Monday, 28 July 2014 at 19:16:24 UTC, Ola Fosheim Grøstad wrote:
>> Why should the situation be different if I use the builtin `assert` instead?
>
> No difference. Except with a builtin you can get additional debugger support, so that you don't have to recompile when following a complicated trace that trigger asserts.

Well, actually, that is not correct. There are other advantages too.

If the compiler guarantees that asserts don't have side effects (on optimization) and aren't affected by the optimizer then you are more likely to see the same behaviour in debug and release builds even with heavy optimization turned on (such as various math approximations). E.g. you can verify with exact math in the assert() that the fast-math approximations the optimizer makes on running code are within the acceptable tolerance level.
July 28, 2014
On Monday, 28 July 2014 at 16:32:46 UTC, Daniel Murphy wrote:
> Yes.  I think you'll find advanced optimizers are already doing this sort of thing to your code, when it decides code is unreachable.

The problem is that the code is not unreachable when you compile with -release.

> Sure, we could add a new construct (ie 'assume') to the language, and use that to pass information.  But are they really different?

The difference is that if you assume(), then you have to formally prove it to hold. That means it cannot be allowed to contradict the program code.

It is very important to keep the provided facts and the derived properties separate.

> Assert says 'the program is in error if this is not true'.

enforce() says: always runtime check this (?)
assert() says: this ought to hold when I do testing
assume() says: I have proven this to hold
prove() says: don't generate code until you can prove this to hold

> -release says 'compile my program as if it has no errors'.

-release says: compile my program without runtime checks for correctness
-no-assume might say: ignore all assumptions, I don't trust myself
-assume-is-assert(): turn assumptions into runtime checks
etc.

> These two combined give the compiler a huge amount of power.

Yes, if used carefully, but the language and/or compiler have to treat user provided facts differently than derived knowledge that originates within the executable code. assert() is not really for providing facts, it is for testing them. So the threshold for providing an assertion is low if it is completely free of side effects (in release), but the threshold for providing an assumption should be very high.
July 28, 2014
> assert() says: this ought to hold when I do testing
> assume() says: I have proven this to hold

Just as another example of how these two are different:

You might want to assert() that you have enough headroom in a statically allocated buffer when testing. Say you assert that the buffer is never more than 50% full for normal loads. You obviously don't want the -release build to make such an assumption!

So the content of an assert() is in this regard different from a prove(). It can sometimes be desirable to put too strong constraints on the program when testing, then relax it on release to get that extra headroom for "odd situations".

You can probably make the same argument for tolerances in computations that are approximate/randomized/iterative. Testing for a tight tolerance does not mean that you want the optimizer to assume that the tolerance is always within that bound. You want some extra headroom in release to be sure that it behaves correctly under more difficult circumstances.
July 28, 2014
"Ola Fosheim Grøstad" " wrote in message news:jikanohklijgfyczizmv@forum.dlang.org...

> The problem is that the code is not unreachable when you compile with -release.

Whether or not this is a problem depends on the meaning of assert.

> > Assert says 'the program is in error if this is not true'.
>
> enforce() says: always runtime check this (?)
> assert() says: this ought to hold when I do testing
> assume() says: I have proven this to hold
> prove() says: don't generate code until you can prove this to hold

This might be how you use assert, but it is not the only possible semantics we could define.

> > -release says 'compile my program as if it has no errors'.
>
> -release says: compile my program without runtime checks for correctness
> -no-assume might say: ignore all assumptions, I don't trust myself
> -assume-is-assert(): turn assumptions into runtime checks
> etc.

Same thing.  Yes, disabling assertion checking is currently what -release does, but that doesn't mean it's the only thing it can do.

What I'm proposing is a hybrid of your definitions - in release mode assert becomes assume.

> Yes, if used carefully, but the language and/or compiler have to treat user provided facts differently than derived knowledge that originates within the executable code. assert() is not really for providing facts, it is for testing them. So the threshold for providing an assertion is low if it is completely free of side effects (in release), but the threshold for providing an assumption should be very high.

I'm not convinced they must be treated differently.  If you define assert and release the way I have above it is perfectly valid.

You've argued that it's the wrong way to do things, but all that matters is that the advantages outweigh the downsides.  The optimization possibilities here are enormous.

According to the spec on 'assert(0)': "The optimization and code generation phases of compilation may assume that it is unreachable code.".

So we already have this in the language somewhat - you just need to write it like this:

if (condition)
   assert(0);

instead of:

assert(condition);

It seems like a straightforward extension to make them equivalent for this purpose. 

July 28, 2014
"bearophile"  wrote in message news:tijfjtihtubozpjojgvl@forum.dlang.org...

> > One murky area is that assert(0) is currently used to mean both 'unreachable' and 'unimplemented'.  It's unclear what the compiler is allowed to do with an assert(0) in release mode.
>
> I'd still like to have a halt() in D, and have assert(0) with the same semantics of the other asserts. In many cases when you have a different feature it's good to also give it a different name.

Turns out it's not so murky according to the spec:

The expression assert(0) is a special case; it signifies that it is unreachable code. Either AssertError is thrown at runtime if it is reachable, or the execution is halted (on the x86 processor, a HLT instruction can be used to halt execution). The optimization and code generation phases of compilation may assume that it is unreachable code.

I'm quite happy to accept that using assert(0) as halt can only be relied on in non-release mode. 

July 28, 2014
On 7/28/2014 1:42 AM, bearophile wrote:
> void main() {
>      import std.stdio, core.checkedint;
>
>      ubyte x = 100;
>      ubyte y = 200;
>
>      bool overflow = false;
>      immutable result = muls(x, y, overflow);
>      assert(!overflow);
>      writeln("Product: ", result);
> }
>
>
> Here I am not willing to add an assume(). Here at the call point of muls() x and
> y have a full range of ubytes, so the range analysis tell the compiler their
> product will not overflow, so it can replace the muls() with a regular product
> and leave overflow untouched. This is faster.

Here there are no new language features required. muls() being an intrinsic means the compiler knows about it. The compiler already does data flow analysis on constants, meaning it knows that x is 100, and will replace subsequent instances of x with 100. This is called "constant propagation", and has been a standard feature of compilers for 30 years.

Some optimizers (not including dmd's) go further with this data flow analysis and do ranges. assert()s are mined by the optimizer for information about ranges. ('assume' adds absolutely nothing here.) With this, since the compiler knows about muls(), it can optimize it.

The point being:

    NO NEW LANGUAGE FEATURES ARE NECESSARY

bearophile, I know you are interested in these sorts of things, but you are unaware of routine data flow analysis that compilers have done for decades. I strongly recommend you pick up a compiler book like "Compilers Principles Techniques and Tools" and thumb through the optimization section. I think you'll find it enjoyable and useful.

July 29, 2014
On 7/28/2014 8:53 AM, bearophile wrote:
> Daniel Murphy:
>
>> One murky area is that assert(0) is currently used to mean both 'unreachable'
>> and 'unimplemented'.  It's unclear what the compiler is allowed to do with an
>> assert(0) in release mode.
>
> I'd still like to have a halt() in D, and have assert(0) with the same semantics
> of the other asserts. In many cases when you have a different feature it's good
> to also give it a different name.

The marginal utility of all these proposals is pretty much zero. assert(0) serves just fine.