-preview=in might break code (page 15)

On 10/5/2020 4:32 AM, Iain Buclaw wrote:
> I don't consider there to be any difference between the two as far as parameter passing is concerned. As I understood from the review, the point of ref passing is to elide copies.

I see a major difference, as relying on the number of copies is not the same as memory corruption. Eliding copies is the bread and butter of optimizers, btw.

> Because this is allowed as an optimization only, none of what it does should spill out into user code. If people notice then something has gone wrong in the implementation.

The examples posted here show it DOES.

If an `in` passes by `const ref`, and another mutable reference to the same memory object decides to free the memory, the `in` reference now is a live dangling pointer.

On Monday, 5 October 2020 at 17:18:57 UTC, Walter Bright wrote:
> On 10/5/2020 4:32 AM, Iain Buclaw wrote:
>> I don't consider there to be any difference between the two as far as parameter passing is concerned. As I understood from the review, the point of ref passing is to elide copies.
>
> I see a major difference, as relying on the number of copies is not the same as memory corruption. Eliding copies is the bread and butter of optimizers, btw.
>
>> Because this is allowed as an optimization only, none of what it does should spill out into user code. If people notice then something has gone wrong in the implementation.
>
> The examples posted here shows it DOES.
>
> If an `in` passes by `const ref`, and another mutable reference to the same memory object decides to free the memory, the `in` reference now is a live dangling pointer.

None of the posted examples I've seen here affect the implementation being trialed in GDC. Though I've already said that I'm likely being more conservative than DMD.

On 05.10.20 19:11, Walter Bright wrote:
> On 10/5/2020 12:56 AM, Iain Buclaw wrote:
>> Actually, I think there is zero mention of aliasing in the language spec, so the following can only be interpreted as being valid and precisely defined to work in D.
>> ---
>> float f = 1.0;
>> bool *bptr = cast(bool*)&f;
>> bptr[2] = false;
>> assert(f == 0.5);
>> ---
>
> @safe code won't allow such a cast.

@safe allows the cast just fine. It doesn't allow the pointer arithmetic, but that can easily be worked around:

----
void main() @safe
{
    float f = 1.0;
    bool[4]* bptr = cast(bool[4]*) &f;
    (*bptr)[2] = false;
    assert(f == 0.5);
}
----

(Compile with `-preview=dip1000`, because `f` is on the stack.)

On Monday, 5 October 2020 at 15:25:15 UTC, Iain Buclaw wrote:
> On Monday, 5 October 2020 at 13:27:00 UTC, kinke wrote:
>>
>> Wrt. the concerns about differing ref/value decisions for PODs across compilers/platforms and thus implementation-dependent potential aliasing issues for lvalue args: a possible approach could be leaving everything as-is ABI-wise, but have the compiler create and pass a temporary in @safe callers if the callee takes a ref, unless it can prove there's no way the arg can be aliased. E.g., assuming x87 `real` for Win64:
>>
>> void callee(in real x); // e.g., by-ref for Win64, by-value for Posix x86_64
>>
>> void safeCaller1(ref x) @safe
>> {
>>     callee(x); // x might be aliased by global state
>>     // for Win64: auto tmp = x, callee(tmp);
>>     // Posix x86_64: by-value, so simply `callee(x)`
>> }
>
> So then `in` would come with its own semantic, that requires new code to handle, rather than piggy-backing off of `ref`?

It already has its own semantic with -preview=in, so this would be a concession for all those raising concerns about implementation-dependent aliasing issues. It would just reduce new `in` copy elisions for PODs in @safe code and prevent all related aliasing trouble (again, PODs only - aliasing could still be an issue for non-PODs, but that's not implementation-dependent). @safe is already slower due to enabled bounds checks even with `-release`, so I could live with it.

On 10/5/2020 10:46 AM, Iain Buclaw wrote:
> None of the posted examples I've seen here affect the implementation being trialed in GDC. Though I've already said that I'm likely being more conservative than DMD.

I could implement it in DMD by passing by value whenever the spec allows it. Then the problem would never happen. (This is how I implemented __restrict__ in Digital Mars C.) But that would defeat the purpose of the feature.

And since the specification allows it to be ref, but does not require it, I would be forced to disallow `in` in any code under my purview.

October 06, 2020

Re: -preview=in might break code

Posted by Mathias LANG
in reply to Andrei Alexandrescu

On Saturday, 3 October 2020 at 22:55:36 UTC, Andrei Alexandrescu wrote:
> On 10/3/20 5:36 PM, Mathias LANG wrote:
>> [...]
>> 
>>  From the caller's point of view, it's also simpler with `in`. The same function will always be called
>
> Not across long distance changes and platform particulars. This is a very important detail.

This merely sidesteps the question, not actually answering the point raised.
Very little, if anything, can resist long distance changes and platform particulars.
We don't ban `size_t` from code because adding two values might have a different result depending on the platform. We don't ban `extern(C)` because someone might use a name that happens to be a D symbol and break completely unrelated things.

A templated function with an `auto ref` parameter can lead to a different function being called based on your architecture, as I demonstrated in my previous message. Regarding long distance changes, we would have to define what a degree is, and how many degrees count as long distance. But unless "change" is bound to a very specific meaning made to overfit what happens with `in` and not `auto ref`, rest assured that both (and probably many other language features) will be affected just the same.
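
As a minimal illustration (not taken from the earlier message, and not reproducing its architecture-dependent case), the sketch below shows how one `auto ref` template yields two different instantiations depending on the argument. The name `receivedByRef` is hypothetical.

```d
import std.stdio : writeln;

// `auto ref` is only valid on templates: each call site instantiates
// either a by-ref or a by-value version, depending on the argument.
bool receivedByRef(T)(auto ref T value)
{
    // Evaluated per instantiation at compile time.
    return __traits(isRef, value);
}

void main()
{
    int lvalue = 42;
    writeln(receivedByRef(lvalue)); // lvalue argument: prints true
    writeln(receivedByRef(42));     // rvalue argument: prints false
}
```

The caller writes the same call expression in both cases; which function body runs is decided by the compiler.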

I did a bit of digging on `auto ref`, to supplement this conversation. There was one post from Jonathan M Davis that phrased it well:
> With auto ref, you're specifically saying that you don't care whether the function is given an lvalue or rvalue. You just want it to avoid unnecessary copies. That's very different. And auto ref then not only then protects you from cases of passing an rvalue to a function when it needs an lvalue, but it makes it clear in the function signature which is expected.

https://forum.dlang.org/post/mailman.3031.1356562349.5162.digitalmars-d@puremagic.com

I found a few other discussions, including this: https://forum.dlang.org/thread/rehsmhmeexpusjwkfnoy@forum.dlang.org (page 6 was quite relevant), and of course https://github.com/dlang/dmd/pull/4717

Much to my surprise, the challenges of parameter aliasing were never brought up in any of those topics, because, as quoted before, "you don't care whether the function is given an lvalue or a rvalue", which conversely means "you don't care if your function receives an lvalue or a rvalue". For this to be possible without affecting the observable behavior of the function, one has to rule out mutation via aliasing.
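
To make the aliasing concern concrete, here is a small sketch in which explicit by-value and by-ref functions stand in for the two ways a parameter might be passed. The names `global`, `byValue` and `byRef` are made up for this illustration.

```d
import std.stdio : writeln;

int global = 1;

// By value: the parameter is a private copy of the argument.
int byValue(const int x)
{
    global = 2; // does not affect the copy
    return x;
}

// By reference: the parameter aliases the caller's variable.
// `const` only forbids writes through `x`, not through other names.
int byRef(ref const int x)
{
    global = 2; // changes what `x` refers to
    return x;
}

void main()
{
    global = 1;
    writeln(byValue(global)); // prints 1
    global = 1;
    writeln(byRef(global));   // prints 2
}
```

The two functions only behave identically when the argument is not mutated through another name during the call, which is the condition described above.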

On Monday, 5 October 2020 at 17:11:52 UTC, Walter Bright wrote:
>
> The problems come from:
>
> 1. the user not knowing if `in` is passing by ref or not
>
> 2. being "implementation defined" meaning that the user simply cannot know (1) because it can change from version to version, or with changes in compiler switch settings. Not just in switching from one compiler to another (although that's bad enough)
>
> 3. the user would have to look at the disassembly to determine (1) or not, and this is unreasonable
>
> 4. if the function is a template with `in T t` as a parameter, the user cannot know if `t` is passed by ref or not

1. Just like `auto ref`, `in` should be used when the user doesn't care whether he gets an lvalue or a rvalue. This means that the user either expects no aliasing, or that aliasing does not affect the observed behavior.
2. Answered in (1)
3. `__traits(isRef)` works perfectly fine, no need to look at the assembly. It's not currently tested though, I'll add it to the test suite.
4. The user can know using `__traits(isRef)`. In general, the user shouldn't care (see 1), but the ability is there.
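
A sketch of how points 3 and 4 could be checked, assuming `__traits(isRef)` behaves on `in` parameters as described above. The names `passedByRef` and `Big` are hypothetical, and no output is shown because the result depends on the compiler, the platform, and whether `-preview=in` is enabled.

```d
import std.stdio : writeln;

// A deliberately large struct, so a compiler may prefer passing by reference.
struct Big
{
    ubyte[256] payload;
}

// Reports how this instantiation receives its `in` parameter.
bool passedByRef(T)(in T value)
{
    return __traits(isRef, value);
}

void main()
{
    int small;
    Big big;
    writeln("int: ", passedByRef(small));
    writeln("Big: ", passedByRef(big));
}
```

Compiling once with and once without `-preview=in` is one way to see whether the answer changes for a given compiler.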

On Monday, 5 October 2020 at 17:18:57 UTC, Walter Bright wrote:
> On 10/5/2020 4:32 AM, Iain Buclaw wrote:
>> I don't consider there to be any difference between the two as far as parameter passing is concerned. As I understood from the review, the point of ref passing is to elide copies.
>
> I see a major difference, as relying on the number of copies is not the same as memory corruption. Eliding copies is the bread and butter of optimizers, btw.
>
>> Because this is allowed as an optimization only, none of what it does should spill out into user code. If people notice then something has gone wrong in the implementation.
>
> The examples posted here shows it DOES.
>
> If an `in` passes by `const ref`, and another mutable reference to the same memory object decides to free the memory, the `in` reference now is a live dangling pointer.

The complaints seem to be about observable differences, not memory corruption. It was suggested a few times that `in` should just be `ref`. If the current status is really unworkable (but again, I recommend anyone to give it a try first), that would be my preferred course of action.

The issue of freeing live data is not specific to `in`; it shows up with `ref` and pointers as well.
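
For illustration, the same hazard can be written with a plain `ref` parameter in @system code. This is a sketch with made-up names (`owner`, `releaseWhileReferenced`); the dangerous read is left as a comment so the program itself stays well defined.

```d
import core.stdc.stdlib : free, malloc;

int* owner; // mutable global handle to manually managed memory

void releaseWhileReferenced(ref const int x)
{
    // Another mutable reference to the same object frees it...
    free(owner);
    owner = null;
    // ...so `x` now refers to released memory. Reading `x` from here on
    // would be a use-after-free, with or without `in`.
}

void main()
{
    owner = cast(int*) malloc(int.sizeof);
    *owner = 7;
    releaseWhileReferenced(*owner);
}
```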

On Friday, 2 October 2020 at 22:11:01 UTC, Walter Bright wrote:
> On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
>> And this might not be true on a different compiler.
>
> This is looking like a serious problem.

I agree. I did mention at the time that how parameters are actually passed should be left to the backend.
