October 04, 2020
On Sunday, 4 October 2020 at 21:45:03 UTC, Walter Bright wrote:
> `in` is a nice, friendly looking construct. It looks like pass-by-value, which we're all familiar with

Not to a C# programmer fyi:

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/in-parameter-modifier
October 05, 2020
On Sunday, 4 October 2020 at 23:40:29 UTC, Ola Fosheim Grøstad wrote:
> It might help some if compilers would run unit tests 3 times with different 'in' implementations.
>
> 1 mixed value/ref
> 2 value
> 3 ref

I was about to propose something like this (restricted to POD types), possibly augmented by some indeterministic fuzzing. Code ported to the new `in` semantics could then even show some previously unintended aliasing issues; getting to the root of the problem would probably still be non-trivial though.

A compiler mode enforcing by-ref, coupled with some sort of runtime sanitizer detecting invalid writes to live in-params, would probably be a very valuable tool for validation & troubleshooting.
October 04, 2020
On 10/4/2020 3:27 PM, kinke wrote:
> And yet all major C++ compilers support it one way or another (AFAIK, just for more aggressive optimizations though, no need to check for overlaps).

Ask a random C++ programmer what __restrict__ means. At least it has an ugly syntax, which people who don't know what it means will avoid. `in` is a gingerbread house.


> Anyway, I probably shouldn't have brought up __restrict__ - simply because every D dev should already know about aliasing pitfalls wrt. regular refs:
> 
> int foo(const ref int a, ref int b)
> {
>      b = 123;
>      return a;
> }
> 
> void main()
> {
>      int a = 0;
>      assert(foo(a, a) == 0); // oops, aliasing
> }

That's right, but they *do not* know about `in` introducing such, and there's no way to reliably detect it.


>> `in` is a nice, friendly looking construct. It looks like pass-by-value, which we're all familiar with, and in fact we've trained users to regard it as pass-by-value because that's the existing behavior for 20 years. Changing its behavior so it *may* introduce memory corruption in very un-obvious and un-checkable ways is a very serious problem.
> 
> I've been unhappy about the wasted potential for `in` ever since working on the ABI details of LDC, and new `-preview=in` is something I've wanted for years. Again, the aliasing issue should be quite easily avoided by imagining `in` params as `const scope ref` at the call sites.

I'd be fine calling it `const scope ref`. At least it says what it is doing, and its behavior is not implementation defined.

> For those who find that too complicated, well, just don't opt into the preview feature. We'll see how the adoption goes.

Few if any users will notice a problem caused by it. But they're going to hate us when they do have a problem in the field with it.

The thing is, I've been working for years on making D as memory safe as we can. This feature is a big step backwards.
October 04, 2020
On 10/3/2020 7:10 AM, Steven Schveighoffer wrote:
> But the problem *IS* that it is *very uncommon* that you will run into issues. An issue that is common is a problem in it's own right. An issue that is subtle and very uncommon (so uncommon that you haven't found a case of it yet) is a different kind of problem. I don't believe that less frequent means we have a better situation, I think it's worse.

This. The rarity (and unpredictability) of it causing a problem is the problem.

When I have a bug in my code, I want it to fail hard as soon, as obviously, and as often as possible, so it won't escape into a released product.
October 04, 2020
On 10/3/2020 7:48 AM, Adam D. Ruppe wrote:
> But maybe like you said later, the spec should say any `in` is treated at the language level as a `ref` (except for rvalue issues of course) just optimized to value in defined ABI places. That'd probably be good enough.

But if the user *expected* to change that const ref view of the data, and relied on it, then that behavior randomly breaks.

October 04, 2020
On 10/3/2020 7:49 AM, Steven Schveighoffer wrote:
> `in ref` is a reference, and it's OK if we make this not a reference in practice, because it's const.

No, it is not. Because a `const ref` can be changed by another mutable reference to the same memory object. This is defined behavior.

This suggestion turns defined behavior into implementation-defined behavior, meaning it will break existing code written in good faith in unpredictable, unreliable ways.

October 04, 2020
On 10/4/20 10:19 AM, Iain Buclaw wrote:
> 3. There is no danger of ref/non-ref mismatches in ABI, because `in` parameters that are inferred `ref` are going to be passed in memory anyway.
> 
> In the following:
> ```
> alias A = void delegate(in long);
> alias B = void delegate(const long);
> ```
> Either `long` always gets passed in memory, or always gets passed in registers, in both cases, they are always going to be covariant.  The same remains true whatever type you replace `long` with.

As the change says, it's up to the back end to decide but is currently types over 2 machine word size. So this means on 32-bit systems, 80-bit reals would be passed by reference for example.

In practice, I don't know what this means for future or other platform compilers.

But as a trivial counter-example, "the same remains true whatever type you replace `long` with" isn't correct:

struct S
{
  size_t[3] data;
}

alias A = void delegate(in S);
alias B = void delegate(const S);

static assert(is(A : B)); // fails with -preview=in

One can imagine this being a sticking point where code that builds fine on 64-bit systems because of the "lucky chance" that a struct can be passed by value is not true on a 32-bit system.

I just found it odd that this was touted as a benefit.

-Steve
October 04, 2020
On 10/4/20 10:08 PM, Walter Bright wrote:
> On 10/3/2020 7:49 AM, Steven Schveighoffer wrote:
>> `in ref` is a reference, and it's OK if we make this not a reference in practice, because it's const.
> 
> No, it is not. Because a `const ref` can be changed by another mutable reference to the same memory object. This is defined behavior.

My logic in thinking about it is that the function can't know at all that a ref-to-value optimization happened, simply because it cannot infer or prove that another argument or a global, or whatever, is aliased to that ref (and it can't modify the data via the ref, unlike C++).

But actually, that's not the whole story, and I'm thinking that really we can't do that.

A *caller* can know that two parameters are aliased (trivially -- pass the same parameter twice), and in that case, expect a certain outcome.

So yeah, we can't do this. Just make `in` always ref, and bind to rvalues.

In a response a few posts up, Mathias says "Perhaps you don't know this, but the very first implementation, the one I had when I opened the PR on April 3rd actually always used `ref`. It had quite a few issues."

What were those issues? Why can't in mean ref always? Or maybe just bind `in ref` to rvalues?

-Steve
October 05, 2020
On Monday, 5 October 2020 at 01:59:58 UTC, Walter Bright wrote:
> On 10/4/2020 3:27 PM, kinke wrote:
>> Again, the aliasing issue should be quite easily avoided by imagining `in` params as `const scope ref` at the call sites.
>
> I'd be fine calling it `const scope ref`. At least it says what it is doing, and its behavior is not implementation defined.

But that's not the whole picture. It's about optimizing small-POD cases such as `in float4` - no, I don't want (extremely verbose) `const scope ref float4` to dump the argument to stack, pass a pointer to it in a GP register, and then have the callee load the 4 floats from the address in that GP register into an XMM register again - I want to pass the argument directly in an XMM register. Similar thing for an int - why waste a GP register for an address to something that fits into that register directly?
With `-preview=in`, this works beautifully for generic templated code too - non-PODs and large PODs are passed by ref, small PODs by value, something C++ can only dream of.

> The thing is, I've been working for years on making D as memory safe as we can. This feature is a big step backwards.

But you're solely focusing on static analysis. Static analysis is great but quite obviously limited. Runtime sanitizers are a necessary supplement, and easy to integrate with LLVM for example - adding support for existing language-independent address and thread sanitizers for LDC was pretty simple, and excluding tests, probably amounted to a few hundred lines, mostly wrt. copying and correctly linking against prebuilt libs.

Wrt. aliasing, I think it's more of a general problem, and I guess a `const ref` param being mutated while executing that function is almost always an unintended bug. -preview=in would make that definitely a bug for each `in` param.
October 04, 2020
On 10/4/20 10:19 AM, Iain Buclaw wrote:
> On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei Alexandrescu wrote:
>> On 10/2/20 6:11 PM, Walter Bright wrote:
>>> On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
>>>> And this might not be true on a different compiler.
>>>
>>> This is looking like a serious problem.
>>
>> They say one should choose one's fights wisely, so I spent some time
>> pondering. I could just let this thread scroll by and not think twice
>> about it. By next week, I may as well forget.
>>
>> But then I realized this is /exactly/ the kind of crap that we'll all
>> scratch our heads six months from now, "How did this ever pass review?
> 
> I take offense to that.  I'd prefer if you'd moderate your tone please.

Of course. Please accept my apologies.

>> Who approved this? How in the world did a group of competent,
>> well-intended people, looked at this and said - yep, good idea. Let's."
>>
>> ???
>>
> 
> *You* approved it.
> 
> https://github.com/dlang/dmd/pull/11000#issuecomment-675605193

Oi. Touché.

>> Please, we really need to put back the toothpaste in the tube here. I could on everybody's clear head here to reconsider this.
> 
> Frankly, I think you are making a mountain out of a molehill here.  You are imagining a problem that doesn't exist; and if one does find an issue, the fault lies with the DMD compiler and not the D language specification.  Though evidently having clearer wording in the spec benefits all.

I think my STL examples have put the narrative that confusing aliasing is rare to rest.

> If you read nothing more of this reply, at least finish up until the end of this paragraph.  Please hold fire until GDC and LDC have implemented this feature, then we can discuss the pitfalls that we've encountered with it.  Basing decisions on behaviors observed with DMD is not the right approach, and if you are currently finding the situation to be a mess, it is a mess of DMD's own doing.

Implementation details of dmd are not of concern here, and in fact the more different ldc/gdc/dmd are from one another, the more problematic the entire matter is.

> 2. Aliasing was raised multiple times throughout the review, I even gave this example at time to demonstrate my concerns:
> ```
> void bar(in int a, out int b) { }
> int a = 42;
> bar(a, a);
> ```
> But ultimately, worrying about this is missing the point, as the problem is already present in the compiler even without `-preview=in`, and Walter is working on fixing it.  In the meantime, these sorts of cases are relatively trivial to pick up and can be added as warnings in GDC and LDC until the front-end implements the semantic guarantees.

That's nice. If the "in" feature is paired appropriately with some sort of verification, my concerns are entirely allayed. "Entirely" of course if the problem is resolved entirely as well. Thanks.