March 05, 2018
On 03/05/2018 09:55 PM, Walter Bright wrote:
> On 3/5/2018 7:48 AM, Timon Gehr wrote:
>> Again: assert is @safe. Compiler hints are @system. Why should assert give compiler hints?
> 
> Asserts give expressions that must be true. Why not take advantage of them?

Because it's exactly what @safe is not supposed to do. You're trusting the programmer to get their asserts right. Trusting the programmer to get it right is @system.

[...]
> It's the programmer's option to leave those runtime checks in if he wants to.

As far as I understand, Timon only asks for a third option: to simply compile the code as if the asserts weren't there, without assuming that they would pass.

That way you get a speedup from the omitted asserts, but you don't get UB from a mistaken assert. This is not an unreasonable thing to want, is it?

You say that DMD does not currently use assert information, so -release currently does this.

[...]
>> There was no "-check=off" flag before.
> 
> Yes there was, it's the "release" flag.

But the controversial aspect is not implemented. And it will be very surprising if you ever do implement it.

I'm actually pretty shocked that -release is described that way. It makes a point of keeping bounds checks in @safe code. The reason is that it would be unsafe to remove them. What's the point of that when safety is compromised anyway by assuming that asserts would pass?
March 05, 2018
On 05.03.2018 21:55, Walter Bright wrote:
> On 3/5/2018 7:48 AM, Timon Gehr wrote:
>> Again: assert is @safe. Compiler hints are @system. Why should assert give compiler hints?
> 
> Asserts give expressions that must be true.

"Trust the programmer" does not always scale.

> Why not take advantage of them?

For some use cases it might be fine, but not for others, because you can't know whether the program and the assertions are really consistent.

Basically, I think the flags should be:

-check-{assert,invariant,precondition,postcondition,...}={on,off,assume}

E.g.:

$ dmd -check-assert=on test.d     # throw on assertion failure
$ dmd -check-assert=off test.d    # ignore assertions
$ dmd -check-assert=assume test.d # assertions are assumptions for code generation

Then the spec says that "assume" is potentially dangerous and can break @safe-ty guarantees, like -boundscheck=off.
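
To make the three settings concrete, here is a rough hand-written sketch of what each would amount to for one small function. The function names are made up for illustration, and the "assume" variant only approximates what an optimizer would be permitted to do under that setting.

```d
import core.exception : AssertError;

// "on": the condition is checked at run time and a failure is reported.
void storeChecked(size_t a, ref int[2] b) @safe
{
    if (!(a < 2))
        throw new AssertError("a < 2", __FILE__, __LINE__);
    b[a] = 0;
}

// "off": the assertion is ignored; the bounds check on b[a] stays in @safe code.
void storeIgnored(size_t a, ref int[2] b) @safe
{
    b[a] = 0; // still bounds-checked
}

// "assume": a < 2 is taken as a fact, so the bounds check may be dropped.
// Written here as an unchecked access, which is why it has to be @system.
void storeAssumed(size_t a, ref int[2] b) @system
{
    b.ptr[a] = 0; // undefined behavior if a >= 2
}

// Compile with: dmd -unittest -main
unittest
{
    int[2] b = [1, 1];
    storeChecked(1, b);
    assert(b == [1, 0]);
    storeIgnored(0, b);
    assert(b == [0, 0]);
    b = [1, 1];
    storeAssumed(1, b); // in bounds, so well defined
    assert(b == [1, 0]);
}
```

Only the third variant can corrupt memory when the condition turns out to be wrong.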

> See Spec# which based an entire language around that notion:
> 
>   https://en.wikipedia.org/wiki/Spec_Sharp
> ...

Spec# is the opposite of what you claim. It verifies statically that the conditions actually hold. Also, it is type safe. (I.e. no UB.)

> Some possible optimizations based on this are:
> 
> 1. elimination of array bounds checks
> 2. elimination of null pointer checks
> 3. by knowing a variable can take on a limited range of values, a cheaper data type can be substituted
> 4. elimination of checks for 'default' switch values
> 5. elimination of overflow checks
> 
> dmd's optimizer currently does not extract any information from assert's. But why shut the door on that possibility?
> ...

We should not do that, and it is not what I am arguing for. Sorry if that did not come across clearly.

> 
>> But the whole point of having memory safety is to not have UB when the programmer screwed up. Behavior not foreseen by the programmer (a bug) is not the same as behavior unconstrained by the language specification (UB).
> 
> It's the programmer's option to leave those runtime checks in if he wants to.
> ...

My point is that either leaving them in or turning failures into UB are too few options. Also, @safe is a bit of a joke if there is no way to _disable contracts_ without nullifying the guarantees it's supposed to give.

> 
>> 'in'-contracts catch AssertError when being composed. How can the language not be designed to support that?
> 
> That is indeed an issue. It's been proposed that in-contracts throw a different exception, say "ContractException" so that it is not UB when they fail. There's a bugzilla ER on this. (It's analogous to asserts in unittests not having UB after they fail.)
> ...

This is ugly, but I don't think there is a better solution.

> 
>> - I usually don't want UB in programs I am working on. I want the runtime behavior of the programs to be determined by the source code, such that every behavior observed in the wild (intended or unintended) can be traced back to the source code (potentially in a non-deterministic way, e.g. void initialization of an integer constant). This should be the case always, even if I or someone else on my team made a mistake. The @safe D subset is supposed to give this guarantee. What good is @safe if it does not guarantee absence of buffer overrun attacks?
> 
> It guarantees it at the option of the programmer via a command line switch.
> ...

You mean, leave in checks?

> 
>> - Using existing assertions as compiler hints is not necessary. (Without having checked it, I'm sure that LDC/GDC have a more suitable intrinsic for this already.)
>>
>> As far as I can discern, forcing disabled asserts to give compiler hints has no upsides.
> 
> I suspect that if:
> 
>      compiler_hint(i < 10);
> 
> were added, there would be nothing but confusion as to its correct usage vs assert vs enforce. There's already enough confusion about the latter two.

I have never understood why. The use cases of assert and enforce are disjoint.

> In fact, I can pretty much guarantee it will be rarely used correctly.
> ...

Me too, but that's mostly because it will be rarely used.

> 
>> I know. Actually version(assert) assert(...); also works. However, this is too verbose, especially in contracts.
> 
> You can wrap it in a template.
> ...

That won't work for in contracts if they start catching ContractException instead of AssertError. Also, I think we'd actually like to _shorten_ the contract syntax (there is another DIP on this).

For other uses, a function suffices, but I ideally want to keep using standard 'assert'. Everybody already knows what 'assert' means.

> 
>> I'd like a solution that does not require me to change the source code. Ideally, I just want the Java behavior (with reversed defaults).
> 
> But you'll have to change the code to compiler_hint().
> ...

I don't, because I don't want that behavior. Others who want that behavior also should not have to. This should be a compilation switch.

> 
>> (enforce is _completely unrelated_ to the current discussion.)
> 
> It does just what you ask (for the regular assert case).
> ...

No. "enforce" does not document that the programmer thinks that the condition will never fail. It's not a contract. Hence, enforcements are never removed from the executable, because it would not make sense to do so, so they definitely do not fit my requirements.
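
A small sketch of the distinction, with made-up function names: `enforce` guards input that can legitimately be wrong at run time, while `assert` records something the programmer believes can never be false.

```d
import std.conv : to;
import std.exception : enforce;

// Input from outside the program can be wrong, so it is validated with
// enforce. This check must stay in the executable.
int parsePercent(string s)
{
    immutable v = s.to!int; // throws ConvException on malformed input
    enforce(v >= 0 && v <= 100, "percentage out of range");
    return v;
}

// Internal helper: callers are expected to pass an already validated value.
// The assert documents that belief; a failure here is a bug in the program.
int scale(int percent, int total)
{
    assert(percent >= 0 && percent <= 100);
    return total * percent / 100;
}

// Compile with: dmd -unittest -main
unittest
{
    import std.exception : assertThrown;

    assert(scale(parsePercent("50"), 200) == 100);
    assertThrown(parsePercent("150")); // enforce rejects bad input
}
```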

> 
>>> It being UB was my doing, not Mathias'. DIP1006 is not redefining the semantics of what assert does.
>> This is not really about assert semantics, this is about the semantics of "disabling the check".
> 
> It is very much about the semantics of assert.
> 
> 
>> There was no "-check=off" flag before.
> 
> Yes there was, it's the "release" flag.
> ...

Depends on what you mean by "off". :o)

In my book, "-release" is "-check-assert=assume -boundscheck=safeonly". A very strange combination!

> 
>> The DIP uses terminology such as "disable assertions" as opposed to "disable assertion checks (but introduce compiler hints)".
> 
> Yes, the language could be more precise, but I couldn't blame Mathias for that.

I don't understand why that would be necessary. The point of the preliminary DIP review is not to blame the author for the DIP's shortcomings; it's to collect feedback on the DIP to make it better, ideally in an interactive way. It is also quite possible that Mathias was simply unaware of the UB behavior with -release.

> I also disagree with the word "hint", because it implies things like "this branch is more likely to be taken" to guide code generation decisions, rather than "assume X is absolutely always incontrovertibly true and you can bet the code on it".
> 

Makes sense. I'll call them "assumptions" from now on.

March 05, 2018
On 05.03.2018 22:11, Walter Bright wrote:
> On 3/5/2018 11:34 AM, Timon Gehr wrote:
>> On 05.03.2018 11:30, Walter Bright wrote:
>>> The hints will usually not make a significant difference in performance anyway.
> 
> Reasonable people will disagree on what is significant or not.
> ...

My point exactly! Hence, compiler flag.

> ...
>>
>> I did not make the code any more wrong by adding the assertion.
>> Why should I get more UB?
> 
> Because you put in an assert that did not hold, and disabled the check.
> ...

(Maybe let's assume it was not me who did it, to stop the whole silly "you deserve what you got because you made a mistake" notion.)

Again, my question is not about the _mechanics_ of the status quo. I know it very well. It's the rationale that matters.

> 
>> Now we have the following options:
>>
>> - Leave contracts in -- fail performance requirements.
>>
>> - Remove contracts -- fail safety requirements.
>>
>> - Track down all 'assert's, even those in external libraries, and replace them by a custom home-cooked solution that is incompatible with everyone else's -- fail maintainability requirements.
>>
>> To me this situation is ridiculous.
> 
> It's completely under the control of the programmer. I know you disagree with that notion.

I don't. I can manually patch the compiler to add the required flags, replicate the official packaging effort, and make everyone who wants to compile my programs use that version. I just don't want to, as it seems silly.

It would be a lot better if the standard DMD compiler had the flags. Do you disagree that there should be an additional option to ignore contracts completely?

> You can even create your own `myassert` to produce your desired semantics.
> ...

That's the third option above. It's not a practical solution. Putting the flag into a compiler fork is trivial by comparison.

> 
>> FWIW, this is what all contract systems that I'm aware of do, except D, and maybe C asserts in certain implementations (if you want to call that contracts).
> 
> D is better (!).
> 
> (C's asserts are not part of the language, so impart no semantics to the compiler.)
> 

(That's why I said "in certain implementations".)
March 05, 2018
On Monday, 5 March 2018 at 10:30:12 UTC, Walter Bright wrote:
> The idea behind removal of the runtime checks is as a performance optimization done on a debugged program. It's like turning on or off array bounds checking. Many leave asserts and array bounds checking on even in released code to ensure memory safety.
>
> At a minimum, turning it off and on will illuminate just what the checks are costing you.
>
> It's at the option of the programmer.

void safeCode1(int a, ref int[2] b) @safe
{
    assert(a < 2);
    b[a] = 0;
}

So, if I compile this with `-release -O`, the compiler is free to remove the bounds-check, which will cause a buffer overrun if `a > 1`. Ok.

void safeCode2(int a, ref int[2] b) @safe
{
    b[a] = 0;
}

And here the compiler is *not* free to remove the bounds check.

This just feels bad. Adding extra failsafes for my debug program shouldn't make my release program less safe.
March 05, 2018
On 05.03.2018 22:24, ag0aep6g wrote:
> On 03/05/2018 10:11 PM, Walter Bright wrote:
>> On 3/5/2018 11:34 AM, Timon Gehr wrote:
> [...]
>>>       int[] x=[];
>>>       writeln(x[0]); // range violation even with -release
>>>                      // defined behavior even with -boundscheck=off (!)
>>
>> It is not defined behavior with -boundscheck=off.
> 
> Dereferencing null is not defined with -boundscheck=off?

This was my bad. It's not dereferencing null. The compiler is free to assume 0<x.length, which means it is allowed to think that the main function is dead code.

Anyway, a similar point can be made by considering contracts that say that specific values are non-null. They will turn null values into UB even though without them, null dereferences would have been defined to crash.
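
A minimal sketch of such a contract; the function `first` is made up for illustration:

```d
// Without the in-contract, first(null) is a null dereference, which by the
// argument above is defined to crash. If the disabled contract is treated as
// an assumption, the compiler may take p to be non-null, and first(null)
// becomes undefined behavior instead.
int first(int* p) @safe
in
{
    assert(p !is null);
}
do
{
    return *p;
}

// Compile with: dmd -unittest -main
unittest
{
    int x = 42;
    assert(first(&x) == 42);
}
```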
March 06, 2018
On 03/05/2018 11:57 PM, Timon Gehr wrote:
> On 05.03.2018 22:24, ag0aep6g wrote:
>> On 03/05/2018 10:11 PM, Walter Bright wrote:
[...]
>>> It is not defined behavior with -boundscheck=off.
>>
>> Dereferencing null is not defined with -boundscheck=off?
> 
> This was my bad. It's not dereferencing null. The compiler is free to assume 0<x.length, which means it is allowed to think that the main function is dead code.

How is it free to assume that?

This was the full snippet (before I mutilated it in my quote):

----
void main()@safe{
     int[] x=[];
     writeln(x[0]); // range violation even with -release
                    // defined behavior even with -boundscheck=off (!)
}
----

There is no `assert(0<x.length);` in this one. -release doesn't do anything, because there are no contracts, no asserts, and main is @safe. -boundscheck=off just makes it so that the length isn't checked before x.ptr is dereferenced. x.ptr is null, so the code is defined to dereference null, no?

If -boundscheck=off somehow does introduce UB here, we have the weird situation that using `x.ptr[0]` is safer in this scenario than `x[0]`. Because surely `x.ptr[0]` is a null dereference that's not affected by -boundscheck=off, right?
March 06, 2018
On 06.03.2018 00:52, ag0aep6g wrote:
> On 03/05/2018 11:57 PM, Timon Gehr wrote:
>> On 05.03.2018 22:24, ag0aep6g wrote:
>>> On 03/05/2018 10:11 PM, Walter Bright wrote:
> [...]
>>>> It is not defined behavior with -boundscheck=off.
>>>
>>> Dereferencing null is not defined with -boundscheck=off?
>>
>> This was my bad. It's not dereferencing null. The compiler is free to assume 0<x.length, which means it is allowed to think that the main function is dead code.
> 
> How is it free to assume that?
> ...

By Walter's definition. -boundscheck=off makes the compiler assume that all array accesses are within bounds. ("off" is a misleading term.)

> This was the full snippet (before I mutilated it in my quote):
> 
> ----
> void main()@safe{
>       int[] x=[];
>       writeln(x[0]); // range violation even with -release
>                      // defined behavior even with -boundscheck=off (!)
> }
> ----
> 
> There is no `assert(0<x.length);` in this one. -release doesn't do anything, because there are no contracts, no asserts, and main is @safe. -boundscheck=off just makes it so that the length isn't checked before x.ptr is dereferenced.

It's not checked, but the compiler may still assume that it has actually been checked. The story is similar to asserts.

> x.ptr is null, so the code is defined to dereference null, no?
> 
> If -boundscheck=off somehow does introduce UB here, we have the weird situation that using `x.ptr[0]` is safer in this scenario than `x[0]`. Because surely `x.ptr[0]` is a null dereference that's not affected by -boundscheck=off, right?

Yes, I think that's a good point (though it hinges on the assumption that x.ptr[i] is equivalent to *(x.ptr+i), which I'm not sure the specification states explicitly).
March 05, 2018
On 3/5/2018 2:30 PM, John Colvin wrote:
> This just feels bad. Adding extra failsafes for my debug program shouldn't make my release program less safe.

Then use `enforce()`.
March 06, 2018
On 06.03.2018 03:05, Walter Bright wrote:
> On 3/5/2018 2:30 PM, John Colvin wrote:
>> This just feels bad. Adding extra failsafes for my debug program shouldn't make my release program less safe.
> 
> Then use `enforce()`.

That makes no sense at all. Enforce is for conditions that are expected to fail in exceptional circumstances. It does not document the same intent as assert does and it cannot be ignored in release builds.

Anyway, "do not use assert" is not the solution, as I have explained many times now.
March 06, 2018
On 3/6/2018 12:45 AM, Timon Gehr wrote:
> Anyway, "do not use assert" is not the solution, as I have explained many times now.

My interpretation is you want D assert to behave like C assert. C assert and enforce are purely creatures of the library, with semantics defined by their library implementation, and have no effect on the core language.

I recommend creating your own library assert, call it 'check' for example, and give it the semantics you wish. You can even have it expand to nothing for release builds.

Creating library asserts is why D has special support for __FILE__ and __LINE__ like C does, and for the same reasons.
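
A minimal sketch of such a library assert, using the name `check` suggested above. The signature and the default message are assumptions made for illustration, not an existing library function.

```d
import core.exception : AssertError;

// Hypothetical library assert. The caller's location is captured through the
// default arguments __FILE__ and __LINE__. The condition is lazy, so it is not
// even evaluated when assert checks are compiled out (for example, with
// -release), and the call then does nothing.
void check(lazy bool cond, lazy string msg = "check failed",
           string file = __FILE__, size_t line = __LINE__)
{
    version (assert)
    {
        if (!cond)
            throw new AssertError(msg, file, line);
    }
}

// Compile with: dmd -unittest -main
unittest
{
    int[2] b = [1, 1];
    size_t a = 1;
    check(a < 2, "index out of range");
    b[a] = 0;
    assert(b == [1, 0]);

    version (assert)
    {
        bool threw = false;
        try
            check(false, "boom");
        catch (AssertError e)
            threw = true;
        assert(threw);
    }
}
```

Because `check` is an ordinary function, the compiler attaches no special meaning to it. A disabled `check` is simply absent and is not used as an assumption.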