March 07, 2018
On Wednesday, March 07, 2018 09:22:40 Paolo Invernizzi via Digitalmars-d wrote:
> On Wednesday, 7 March 2018 at 09:11:10 UTC, Jonathan M Davis wrote:
> >> So, the request is to just leave assert active as a default in @safe code, like the bounds checks?
> >
> > No. I'm saying that no optimizations should be enabled which introduce potential memory corruption. Assertions should have zero impact on whether code is @safe or not unless the code in the condition or which is generating the message for the assertion is @system, and it's no more reasonable to assume that an assertion is going to pass than it is to assume that bounds checking won't fail. Regardless, the key thing here is that @safe code should be guaranteed to be @safe so long as @trusted code is vetted properly. It should _never_ be possible for the compiler to introduce memory safety issues into @safe code.
> >
> >> So, the reasoning is that UB should not lead to memory corruption, right?
> >
> > The reasoning is that no @safe code should ever have memory corruptions in it unless it calls @trusted code that was incorrectly vetted by the programmer. The compiler is supposed to guarantee that @safe code is @safe just like it's supposed to guarantee that a const variable isn't mutated or that a pure function doesn't access mutable, global variables. And as such, introducing anything into @safe code which could be memory unsafe is a violation of the compiler's responsibility.
>
> Jonathan, I understand your reasoning, but it's not what I'm asking: are we asking for UB that does not lead to memory corruption?

I'm saying that @safe code must not be violated by the compiler. Beyond that I'm not arguing about UB one way or the other. If UB must be disallowed to avoid violating @safe, then it must be disallowed. If some form of UB can be allowed, because it's restricted in a manner that it can't violate @safe but may do something else stupid because the assertion would have failed if it weren't compiled out, I don't care. If an assertion would have failed if it weren't compiled out, then you have a bug regardless, and if the code is buggier because of an optimization, then that's fine with me. You have a bug either way. What isn't fine is for that result to violate @safe, because that would defeat the entire purpose of @safe and make it far, far more difficult to track down @safety problems.

Right now, since no optimizations like Walter has been talking about are done by the compiler, if you have memory corruption, you know that you only have to look at @system and @trusted code to find it, whereas with the unsafe optimizations that Walter is talking about, it then becomes possible that you're going to have to look through the entire program to find the problem. And right now, you can be sure that you don't have @safety problems in @safe code if you use @trusted correctly, whereas with what Walter is talking about, simply adding an assertion could add @safety problems to your code.
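
To make that last point concrete, here is a minimal sketch. The function `pick` is hypothetical and made up for illustration; the comments describe the kind of assertion-based optimization being discussed, not something the compiler is known to do today.

```d
// Hypothetical example; `pick` is not taken from any real code base.
int pick(const int[] arr, size_t i) @safe
{
    // Without -release, a failing assert throws an AssertError.
    // With -release, the check is compiled out.
    assert(i < arr.length);

    // In @safe code this index is bounds-checked even with -release.
    // If the optimizer were allowed to assume that the assertion above
    // holds, it could drop this bounds check, and an out-of-range `i`
    // would then read past the end of the array inside @safe code.
    return arr[i];
}

void main() @safe
{
    int[] data = [10, 20, 30];
    assert(pick(data, 1) == 20);
}
```

The only thing that would change between the safe and the unsafe build here is whether the assertion is treated as a fact the optimizer may rely on.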

- Jonathan M Davis

March 07, 2018
On Wednesday, 7 March 2018 at 11:52:05 UTC, Jonathan M Davis wrote:
> On Wednesday, March 07, 2018 09:22:40 Paolo Invernizzi via Digitalmars-d wrote:
>> On Wednesday, 7 March 2018 at 09:11:10 UTC, Jonathan M Davis wrote:
>> >> So, the request is to just leave assert active as a default in @safe code, like the bounds checks?
>> >
>> > No. I'm saying that no optimizations should be enabled which introduce potential memory corruption. Assertions should have zero impact on whether code is @safe or not unless the code in the condition or which is generating the message for the assertion is @system, and it's no more reasonable to assume that an assertion is going to pass than it is to assume that bounds checking won't fail. Regardless, the key thing here is that @safe code should be guaranteed to be @safe so long as @trusted code is vetted properly. It should _never_ be possible for the compiler to introduce memory safety issues into @safe code.
>> >
>> >> So, the reasoning is that UB should not lead to memory corruption, right?
>> >
>> > The reasoning is that no @safe code should ever have memory corruptions in it unless it calls @trusted code that was incorrectly vetted by the programmer. The compiler is supposed to guarantee that @safe code is @safe just like it's supposed to guarantee that a const variable isn't mutated or that a pure function doesn't access mutable, global variables. And as such, introducing anything into @safe code which could be memory unsafe is a violation of the compiler's responsibility.
>>
>> Jonathan, I understand your reasoning, but it's not what I'm asking: are we asking for UB that does not lead to memory corruption?
>
> I'm saying that @safe code must not be violated by the compiler. Beyond that I'm not arguing about UB one way or the other.

And that's clear.

> If UB must be disallowed to avoid violating @safe, then it must be disallowed.

And how would we do this, in practice I mean?

> If some form of UB can be allowed, because it's restricted in a manner that it can't violate @safe but may do something else stupid because the assertion would have failed if it weren't compiled out, I don't care.

And that's the original question: are we asking for UB that does not lead to memory corruption?

> If an assertion would have failed if it weren't compiled out, then you have a bug regardless, and if the code is buggier because of an optimization, then that's fine with me. You have a bug either way.

Agreed.

> What isn't fine is for that result to violate @safe, because that would defeat the entire purpose of @safe and make it far, far more difficult to track down @safety problems.

So, see above, the original question, agreed.

> Right now, since no optimizations like Walter has been talking about are done by the compiler, if you have memory corruption, you know that you only have to look at @system and @trusted code to find it, whereas with the unsafe optimizations that Walter is talking about, it then becomes possible that you're going to have to look through the entire program to find the problem.

Or you can just turn on assertions, right?

If we have corrupted memory in release, there's a bug, somewhere, in the logic or in the implementation of the logic.
As you said, we must audit @system and @trusted code, and we can imagine using static checkers or some strange beast like that.
But while doing that, I think the most common practice is to keep running the code with assertions on, do you agree?

> And right now, you can be sure that you don't have @safety problems in @safe code if you use @trusted correctly, whereas with what Walter is talking about, simply adding an assertion could add @safety problems to your code.

Nope, not adding an assertion, but having the process in UB state.
And we are back again to the original question.

/Paolo


March 07, 2018
On Wednesday, 7 March 2018 at 08:58:50 UTC, Paolo Invernizzi wrote:
> Just to understand, otherwise, if the assert is removed and it does not hold, you are in UB,

You're not. Just let the compiler treat the code as if the asserts weren't there. If the resulting code has UB, it won't compile, because @safe code is statically checked to not have UB.

> so the request is to guarantee memory safety in a UB state, right?

I don't think anyone is asking for that. The request is for no UB in @safe code.

March 07, 2018
On Wednesday, March 07, 2018 13:24:19 Paolo Invernizzi via Digitalmars-d wrote:
> > Right now, since no optimizations like Walter has been talking about are done by the compiler, if you have memory corruption, you know that you only have to look at @system and @trusted code to find it, whereas with the unsafe optimizations that Walter is talking about, it then becomes possible that you're going to have to look through the entire program to find the problem.
>
> Or you can just turn on assertions, right?
>
> If we have corrupted memory in release, there's a bug, somewhere,
> in the logic or in the implementation of the logic.
> As you said, we must audit @system and @trusted code, and we can
> imagine using static checkers or some strange beast like that.
> But while doing that, I think the most common practice is to
> keep running the code with assertions on, do you agree?

That would make assertions a lot worse to use, because then they would be in production code slowing it down. Also, as it stands, -release is not supposed to violate @safe. To do that, you have to use -boundscheck=off to turn off bounds checking. That was a very purposeful design decision, because we did not want -release to violate @safe, and if the compiler is allowed to add optimizations which are unsafe based on assertions, then that completely destroys the ability to have @safe code with -release. And if we were going to do that, why did we leave array bounds checking on with -release?

Assertions are to help debug code, and most people disable them in production with zero expectation that that's going to result in their @safe code suddenly becoming unsafe.

It's a huge change if -release makes code unsafe, and IMHO, doing so would make assertions an immediate code smell and would mean that assertions should never be used except by folks who are willing to leave them in all the time. I don't see how it can be argued that allowing assertions to introduce unsafe behavior into @safe code is not in complete violation of what @safe is supposed to do and guarantee.

It's already bad enough that we talk about how memory safe D code is when @safe isn't the default, but to then completely bypass @safe like this seems unconscionable to me. That would be like deciding that we're now going to introduce const_cast and mutable into the language and allow const's guarantees to be violated whenever the programmer feels like it. @safe needs to actually be guaranteed to be @safe or it's worthless.

- Jonathan M Davis

March 07, 2018
On Wednesday, 7 March 2018 at 13:32:37 UTC, ag0aep6g wrote:
> On Wednesday, 7 March 2018 at 08:58:50 UTC, Paolo Invernizzi wrote:
>> Just to understand, otherwise, if the assert is removed and it does not hold, you are in UB,
>
> You're not. Just let the compiler treat the code as if the asserts weren't there. If the resulting code has UB, it won't compile, because @safe code is statically checked to not have UB.
>
>> so the request is to guarantee memory safety in a UB state, right?
>
> I don't think anyone is asking for that. The request is for no UB in @safe code.

Are we asking to statically check things like:

Assign Expressions [1]
Undefined Behavior:
  if the lvalue and rvalue have partially overlapping storage
  if the lvalue and rvalue's storage overlaps exactly but the types are different

Is that doable, in practice?

[1] https://dlang.org/spec/expression.html#assign_expressions

/Paolo


March 07, 2018
On Wednesday, March 07, 2018 14:01:31 Paolo Invernizzi via Digitalmars-d wrote:
> On Wednesday, 7 March 2018 at 13:32:37 UTC, ag0aep6g wrote:
> > On Wednesday, 7 March 2018 at 08:58:50 UTC, Paolo Invernizzi wrote:
> >> Just to understand, otherwise, if the assert is removed and it does not hold, you are in UB,
> >
> > You're not. Just let the compiler treat the code as if the asserts weren't there. If the resulting code has UB, it won't compile, because @safe code is statically checked to not have UB.
> >
> >> so the request is to guarantee memory safety in a UB state, right?
> >
> > I don't think anyone is asking for that. The request is for no UB in @safe code.
>
> Are we asking to statically check things like:
>
> Assign Expressions [1]
> Undefined Behavior:
>    if the lvalue and rvalue have partially overlapping storage
>    if the lvalue and rvalue's storage overlaps exactly but the types are different
>
> Is that doable, in practice?
>
> [1] https://dlang.org/spec/expression.html#assign_expressions

In places where the compiler can statically check things, it does. In the places where it can't, it either introduces runtime checks (e.g. array bounds checking), or it treats the code as @system, forcing the programmer to ensure that the code is @safe, since the compiler can't determine whether it is or not. Either way, we then get the guarantee that @safe code is memory safe so long as @trusted is used correctly.
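
As a rough sketch of those three cases (all function names here are hypothetical, chosen only for illustration):

```d
// Hypothetical names: `first`, `rawRead` and `checkedRead` are illustrative.

// Case 1: the compiler inserts a runtime check. Indexing a slice is
// allowed in @safe code because an out-of-range index is caught by the
// bounds check instead of corrupting memory.
int first(int[] a) @safe
{
    return a[0];
}

// Case 2: the compiler can't verify raw pointer indexing, so this
// function cannot be marked @safe and has to stay @system.
int rawRead(int* p, size_t i) @system
{
    return p[i];
}

// Case 3: the programmer vets the unsafe operation by hand and marks the
// wrapper @trusted. If the manual check below were wrong, this is where
// a memory corruption bug would have to be looked for.
int checkedRead(int[] a, size_t i) @trusted
{
    if (i >= a.length)
        return 0; // assumed fallback value for this sketch
    return rawRead(a.ptr, i);
}

void main() @safe
{
    int[] data = [1, 2, 3];
    assert(first(data) == 1);
    assert(checkedRead(data, 2) == 3);
    assert(checkedRead(data, 5) == 0);
}
```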

- Jonathan M Davis

March 07, 2018
On Wednesday, 7 March 2018 at 13:55:11 UTC, Jonathan M Davis wrote:
> On Wednesday, March 07, 2018 13:24:19 Paolo Invernizzi via Digitalmars-d wrote:
>> [...]
>
> That would make assertions a lot worse to use, because then they would be in production code slowing it down. Also, as it stands, -release is not supposed to violate @safe. To do that, you have to use -boundscheck=off to turn off bounds checking. That was a very purposeful design decision, because we did not want -release to violate @safe, and if the compiler is allowed to add optimizations which are unsafe based on assertions, then that completely destroys the ability to have @safe code with -release. And if we were going to do that, why did we leave array bounds checking on with -release?
>
> [...]

Jonathan, I understand your point, but still I can't find an answer to clarify my doubts.

Are we asking for no UB in @safe code?
Are we asking for UB in @safe code but constrained to no memory corruptions?

/Paolo

March 07, 2018
On Wednesday, March 07, 2018 14:08:35 Paolo Invernizzi via Digitalmars-d wrote:
> On Wednesday, 7 March 2018 at 13:55:11 UTC, Jonathan M Davis wrote:
> > On Wednesday, March 07, 2018 13:24:19 Paolo Invernizzi via Digitalmars-d wrote:
> >> [...]
> >
> > That would make assertions a lot worse to use, because then they would be in production code slowing it down. Also, as it stands, -release is not supposed to violate @safe. To do that, you have to use -boundscheck=off to turn off bounds checking. That was a very purposeful design decision, because we did not want -release to violate @safe, and if the compiler is allowed to add optimizations which are unsafe based on assertions, then that completely destroys the ability to have @safe code with -release. And if we were going to do that, why did we leave array bounds checking on with -release?
> >
> > [...]
>
> Jonathan, I understand your point, but still I can't find an answer to clarify my doubts.
>
> Are we asking for no UB in @safe code?
> Are we asking for UB in @safe code but constrained to no memory corruptions?

@safe is all about guaranteeing memory safety. That's its entire job. No more, no less. What happens with UB beyond that is irrelevant. If satisfying the requirement that @safe code be memory safe means that UB cannot be allowed in @safe code, then UB cannot be allowed in @safe code. If there is some form of UB that is constrained enough that it's guaranteed that it can't violate memory safety, then I don't see any reason why it can't be in @safe code any more than it can't be in @system code, because it's not violating the guarantees that @safe is intended to provide - that the code is memory safe.

Other language rules may make UB illegal or explicitly allow it for one reason or another (e.g. it's supposed to be guaranteed that function arguments are evaluated left-to-right, though I'm not sure if that's ever been implemented like it's supposed to be), but in the case of @safe, it's all about what's memory safe. And what is or isn't allowed with regards to UB in @safe therefore has to be a function of what is required to guarantee that the code is memory safe.

- Jonathan M Davis

March 07, 2018
On 03/07/2018 03:01 PM, Paolo Invernizzi wrote:
> On Wednesday, 7 March 2018 at 13:32:37 UTC, ag0aep6g wrote:
[...]
>> I don't think anyone is asking for that. The request is for no UB in @safe code.
> 
> Are we asking to statically check things like:
> 
> Assign Expressions [1]
> Undefined Behavior:
>    if the lvalue and rvalue have partially overlapping storage
>    if the lvalue and rvalue's storage overlaps exactly but the types are different

If it can't be guaranteed that some code has defined behavior, then it's not allowed in an @safe function (or it should not be allowed). We are not asking for all valid code to be @safe.

Guaranteeing no UB is exactly @safe's purpose. The spec says: "Safe functions are functions that are statically checked to exhibit no possibility of undefined behavior." [1]
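
A small sketch of what "statically checked" means in practice, using `__traits(compiles, ...)` to ask the compiler what it accepts. The lambdas are illustrative only and are never called:

```d
// Illustrative only: compile-time checks of what @safe accepts.
void main() @safe
{
    // Pointer arithmetic can move a pointer outside valid memory,
    // so the compiler rejects it in @safe code.
    static assert(!__traits(compiles, () @safe { int* p; p++; }));

    // The same code is accepted in @system, where avoiding UB is the
    // programmer's responsibility.
    static assert(__traits(compiles, () @system { int* p; p++; }));

    // Slice indexing is accepted in @safe, because a runtime bounds
    // check keeps its behavior defined.
    static assert(__traits(compiles, (int[] a) @safe { return a[0]; }));
}
```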

> Is that doable, in practice?

If you think that's not doable, what do you think @safe should aim for?


[1] https://dlang.org/spec/function.html#function-safety

March 07, 2018
On 03/07/2018 03:01 PM, Paolo Invernizzi wrote:
> Are we asking to statically check things like:
> 
> Assign Expressions [1]
> Undefined Behavior:
>    if the lvalue and rvalue have partially overlapping storage
>    if the lvalue and rvalue's storage overlaps exactly but the types are different

A simple way to get overlapping storage is with a union. Unfortunately, DMD accepts this:

----
struct S
{
    union
    {
        int i;
        byte b;
        float f;
        struct
        {
            byte b2;
            align(1) int i2;
        }
    }
}

void main() @safe
{
    S s;
    s.i = s.b; /* Partially overlapping, different types. */
    s.f = s.i; /* Exactly overlapping, different types. */
    s.i = s.i2; /* Partially overlapping, same type. */
}
----

I've filed an issue:
https://issues.dlang.org/show_bug.cgi?id=18568

If you have more examples of UB in @safe functions, don't hesitate to file them as bugs.