July 24, 2015
On 7/24/15 3:35 PM, Jonathan M Davis wrote:
> On Friday, 24 July 2015 at 14:07:46 UTC, Steven Schveighoffer wrote:
>> On 7/23/15 11:58 PM, Jonathan M Davis wrote:
>>> On Friday, 24 July 2015 at 03:02:30 UTC, Steven Schveighoffer wrote:
>>>> Basically, if we say this is undefined behavior, then inout is
>>>> undefined behavior.
>>>
>>> inout is done by the compiler. It knows that it's safe to cast the
>>> return type to mutable (or immutable), because it knows that the return
>>> value was either the argument that it passed in or something constructed
>>> within the function and thus safe to cast. The compiler knows what's
>>> going on, so it can ensure that it doesn't violate the type system and
>>> is well-defined.
>>
>> The compiler knows everything that is going on inside a function. It
>> can see the cast and knows that it should execute it, and also that
>> the original variable is mutable and could be the one being mutated.
>> This isn't any different.
>
> You're assuming that no separate compilation is going on, which in the
> case of a template like you get with RedBlackTree, is true, because the
> source has to be there, but in the general case, separate compilation
> could make it so that the compiler can't see what's going on inside the
> function.

No, I'm not. Using an inout function is like inserting a wrapper around the real function that casts the result back to the right type. The inout rules inside the function make the casting sane without having to examine the code, but the code itself inside does not do any casting.
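For anyone following along, here is a minimal sketch of what the caller sees with inout. The function `pick` is a hypothetical name made up for illustration; it contains no cast, and the qualifier of the result simply follows the qualifier of the argument.

```d
// Hypothetical example: `pick` is not from the thread, it only illustrates inout.
// The body is checked as if the data were const; it never casts.
inout(int)* pick(inout(int)* p)
{
    return p;
}

void main()
{
    int m = 1;
    const int c = 2;
    immutable int i = 3;

    // The result type tracks the argument's qualifier at each call site.
    static assert(is(typeof(pick(&m)) == int*));
    static assert(is(typeof(pick(&c)) == const(int)*));
    static assert(is(typeof(pick(&i)) == immutable(int)*));

    // Through the mutable call the original variable can be written.
    *pick(&m) = 10;
    assert(m == 10);
}
```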

> It relies on the separate compilation step having verified the
> inout attribute appropriately when the function's body was compiled, and
> all it has to go on is the inout in the signature. If you were to try
> and implement inout yourself in non-templated code, the compiler
> wouldn't necessarily be able to see any of what's going on inside the
> function when it compiles the calling code. Without inout, in
> non-templated code, even if the compiler were being very smart about
> this, it wouldn't have a clue that when you cast away const on the
> return value that it was the same one that was passed in.

The compiler doesn't see "inout", it sees mutable, const, immutable -- three versions of the function. It calls the right one, and inside the function, the casting happens outside the implementation. The compiler doesn't have to know it's the same value, it doesn't even have to care whether the value is modified. It just has to accept that it can't optimize out the loading of the mutable variable again.

In other words, if the compiler compiles this:

int *foo(int *x);

void main()
{
   int x;
   auto y = foo(&x);
   *y = 5;
}

It doesn't have to know whether y is pointing at x. All it knows is that x may have changed inside foo, and that it's possible y is pointing at it (and therefore changed it as well).
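A complete version of that situation might look like the sketch below. The body of `foo` is an assumption (the thread only gives a prototype); it is written to return its argument so that the aliasing actually happens.

```d
// Assumed body: the thread only declares foo, so this definition is illustrative.
int* foo(int* x)
{
    *x = 1;   // x may be changed inside foo
    return x; // and the result may alias the argument
}

void main()
{
    int x;
    auto y = foo(&x);
    *y = 5;          // writes to x through the returned pointer
    assert(x == 5);  // so x has to be reloaded, not assumed unchanged
}
```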

>>> All of the lines with pureFunc* could be removed outright, because
>>> they're all pure function calls, and they can't possibly have mutated
>>> myFoo. I wouldn't expect a lot of dead code like that, and maybe
>>> something like that would never be implemented in the compiler, but it
>>> could be as long as the compiler can actually rely on const not being
>>> mutated.
>>
>> And my interpretation of the spec doesn't change this. You can still
>> elide those calls as none of them should be casting away const and
>> mutating internally.
>
> You seem to be arguing that as long as you know that a const reference
> refers to mutable data, it is defined behavior to cast away const and
> mutate it.

No. If you *create* a const reference to mutable data, you can cast away that const back to mutable, because everything is there for the compiler to see.

> And if that were true, then if you knew that myFoo referred
> to mutable data, it would be valid to cast away const and mutate it
> inside of one of the pureFunc* functions, because you know that it's
> mutable and not immutable.

No, because those pure functions don't know whether the data is mutable, and the compiler is allowed to infer from their signatures that they don't mutate it.

Basically, it's the difference between these 2 calls:

pure void foo(int *x) { *x = 5;}
pure void bar(const(int) *x) { *(cast(int *)x) = 10;}

void main()
{
   int x;
   const int *y = &x;
   foo(cast(int *)y); // should be OK, can't be elided, and the compiler can see what is going on here
   bar(y); // BAD, compiler is free to remove
}

> And this shows why that isn't enough. And
> that is the major objection I have with what you're arguing here. In the
> general case, even if immutable is not used in the program even once,
> casting away const and mutating is not and cannot be defined behavior,
> or const guarantees nothing - just like in C++.
>
> The exact use case that you're looking for - essentially inout - works
> only because when you cast it back, no const reference that was
> generated by calling the function with a mutable reference escaped that
> function except via the return value, so there's no way for the compiler
> to optimize based on the const reference, because there isn't one
> anymore. And that's _way_ more restricted than saying that it's defined
> behavior to cast away const and mutate as long as you know that the
> underlying data is actually mutable.

I'm not saying that general statement. I'm saying in restricted situations, casting away const is not undefined behavior.

> Your case works, because the const reference is gone after the cast, and
> there are no others that were created from the point that it temporarily
> became const. So, it's a very special case. And maybe the rule can be
> worded in a way that incorporates that nicely, whereas simply saying
> that it's undefined behavior to cast away const and mutate would not
> allow it. But we cannot say that it's defined behavior to cast away
> const and mutate simply because you know that the data is mutable, or we
> do not have physical const, and const provides no real guarantees.

I agree, we can't just make the general case that you can cast away const if you know the data is mutable, given some configuration of function calls. There has to be complete visibility to the compiler within the same function to allow the possibility that some mutable data changed.

We can start with "casting away const and mutating, even if you know the underlying data is mutable, is UB, except for these situations:..."

And relax from there.

-Steve
July 24, 2015
On 07/24/2015 10:08 PM, Steven Schveighoffer wrote:
>
> We can start with "casting away const and mutating, even if you know the
> underlying data is mutable, is UB, except for these situations:..."
>
> And relax from there.

But what is the point?
July 24, 2015
On 7/24/15 4:20 PM, Timon Gehr wrote:
> On 07/24/2015 10:08 PM, Steven Schveighoffer wrote:
>>
>> We can start with "casting away const and mutating, even if you know the
>> underlying data is mutable, is UB, except for these situations:..."
>>
>> And relax from there.
>
> But what is the point?

The original PR is to add a const and immutable version of upperBound and lowerBound to RedBlackTree.

Both of these functions are effectively const. But we must return a range that matches the constancy of the tree itself.

So for example:

RedBlackTree!int m;
const RedBlackTree!int c = m;

auto x = m.upperBound(5); // should return range over mutable ints
auto y = c.upperBound(5); // should return range over const ints.

The chosen implementation was to cast away const inside the const upperBound function and run the mutable one, knowing that the actual algorithm doesn't modify any data. But I objected, saying that it's better to run the code as const and cast away const at the end in the mutable version, since the compiler will then be mechanically enforcing the const promise in the case of a const RedBlackTree.
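A rough sketch of the second approach, under stated assumptions: `SortedInts` and `upperImpl` are hypothetical names, and a sorted slice stands in for both the tree and its range type to keep this short. This is not the RedBlackTree code from the PR, and whether the final cast counts as defined behavior is exactly what this thread is debating.

```d
// Hypothetical stand-in for a container; a sorted slice plays the role of the tree.
struct SortedInts
{
    private int[] data;

    // The search is written once, as const, so the compiler checks
    // that the algorithm itself never writes to the container.
    private const(int)[] upperImpl(int v) const
    {
        foreach (i, e; data)
            if (e > v)
                return data[i .. $];
        return null;
    }

    // const callers get a view over const ints.
    const(int)[] upperBound(int v) const
    {
        return upperImpl(v);
    }

    // Mutable callers get a mutable view; the only cast is at the very end.
    int[] upperBound(int v)
    {
        return cast(int[]) upperImpl(v);
    }
}

void main()
{
    auto m = SortedInts([1, 3, 5, 7]);
    const SortedInts c = m;

    auto x = m.upperBound(4); // view over mutable ints
    auto y = c.upperBound(4); // view over const ints
    static assert(is(typeof(x) == int[]));
    static assert(is(typeof(y) == const(int)[]));
    assert(x == [5, 7]);
    assert(y == [5, 7]);
}
```

With a plain slice, inout would do the same job without a cast; the sketch only shows where the cast sits when a custom range type rules that out.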

The resulting discussion was that this is undefined behavior. But upperBound itself isn't modifying any data, it's just restoring the constancy of the range. But the range itself could potentially be used to modify the data. It didn't seem to me like this should be undefined behavior, since the compiler would have to make a very long connection through the various calls in order to see that everything would be const.

inout would work perfectly here, except you can't create a custom struct with an inout member that implicitly casts back to mutable/const/immutable.

So I don't know the answer. It seems very bad to cast away const to run a complex algorithm without mechanical checking. But ironically, that may be the only defined way to do it (aside from copy-paste implementation, or using a templated implementation).

The advantage of simply clarifying the spec is that the current compiler behavior (which should work) doesn't need to change, we just change the spec.

Ideally, we should just fix the situation with tail-const and we could have the best answer.

I think I'll give up on this argument. There isn't much use in putting in a rule for the spec that covers over a missing feature that we will likely add later.

Also, I just thought of a better way to do this that doesn't require any casting.

Forget this thread ever happened :)

-Steve
July 24, 2015
On Friday, 24 July 2015 at 20:08:11 UTC, Steven Schveighoffer wrote:
> On 7/24/15 3:35 PM, Jonathan M Davis wrote:
>> Your case works, because the const reference is gone after the cast, and
>> there are no others that were created from the point that it temporarily
>> became const. So, it's a very special case. And maybe the rule can be
>> worded in a way that incorporates that nicely, whereas simply saying
>> that it's undefined behavior to cast away const and mutate would not
>> allow it. But we cannot say that it's defined behavior to cast away
>> const and mutate simply because you know that the data is mutable, or we
>> do not have physical const, and const provides no real guarantees.
>
> I agree, we can't just make the general case that you can cast away const if you know the data is mutable, given some configuration of function calls. There has to be complete visibility to the compiler within the same function to allow the possibility that some mutable data changed.
>
> We can start with "casting away const and mutating, even if you know the underlying data is mutable, is UB, except for these situations:..."

The only exception that makes any sense to me is when you're casting away const from the last const reference, so there are no const references left for the compiler to make any assumptions about - so the case where you're trying to mimic inout. Something like

----
int x;
const int *y = &x;
*(cast(int *)y) = 5;
----

should be completely invalid IMHO. I don't see any reason to make it valid to cast away const and mutate just because the compiler can see that that's what you're doing, especially when it doesn't buy you anything, since you have access to the mutable reference anyway. Allowing it would just complicate things.

It might be possible to word the spec in a way to essentially allow you to do your own inout when inout doesn't cut it, since you're not really violating what const is supposed to guarantee, but for the rest, I say leave it undefined, because in that case you are violating it.

- Jonathan M Davis
July 24, 2015
On Friday, 24 July 2015 at 20:44:44 UTC, Steven Schveighoffer wrote:
> The advantage of simply clarifying the spec is that the current compiler behavior (which should work) doesn't need to change, we just change the spec.
>
> Ideally, we should just fix the situation with tail-const and we could have the best answer.

Yeah. That needs to be fixed. As I understand it, it's feasible without any language improvements, but it's horrific. Jonathan Crapuchettes talked at one point about doing it at EMSI (and how hard it was). The last time I tried it, I ran into problems with recursive template definitions, though static if can probably solve those.

Regardless, the situation with it is ugly and not well understood, even if there is a solution, and ideally, we'd find a way to implement it that was a lot easier and cleaner. Without that, almost no one is going to be doing it - probably even if there's an article on dlang.org explaining how - simply because of how annoying it is to do.

> I think I'll give up on this argument. There isn't much use in putting in a rule for the spec that covers over a missing feature that we will likely add later.
>
> Also, I just thought of a better way to do this that doesn't require any casting.
>
> Forget this thread ever happened :)

Well, regardless of whether mimicking inout like we're talking about with RedBlackTree should be considered defined behavior or not, I think that the spec should be updated so that the situation is clearer. It needs to be clear to the community at large that you _cannot_ be casting away const and mutating simply because you know that the data is mutable underneath rather than immutable.

- Jonathan M Davis
July 26, 2015
On Friday, 24 July 2015 at 21:12:57 UTC, Jonathan M Davis wrote:
> Well, regardless of whether mimicking inout like we're talking about with RedBlackTree should be considered defined behavior or not, I think that the spec should be updated so that the situation is clearer. It needs to be clear to the community at large that you _cannot_ be casting away const and mutating simply because you know that the data is mutable underneath rather than immutable.

Pull request for that:
https://github.com/D-Programming-Language/dlang.org/pull/1047

August 06, 2015
On Friday, 24 July 2015 at 21:12:57 UTC, Jonathan M Davis wrote:
> Yeah. That needs to be fixed. As I understand it, it's feasible without any language improvements, but it's horrific. Jonathan Crapuchettes talked at one point about doing it at EMSI (and how hard it was). The last time I tried it, I ran into problems with recursive template definitions, though static if can probably solve those.
>
> Regardless, the situation with it is ugly and not well understood, even if there is a solution, and ideally, we'd find a way to implement it that was a lot easier and cleaner. Without that, almost no one is going to be doing it - probably even if there's an article on dlang.org explaining how - simply because of how annoying it is to do.

Please open a Bugzilla issue to keep track of this and raise awareness. If we're going to need a language feature we need to start collecting arguments, and maybe someone can still come up with a clean solution.
It's an important issue b/c it affects every container range.