July 20, 2018
On Fri, 20 Jul 2018 at 12:51, Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Friday, July 20, 2018 12:21:42 Manu via Digitalmars-d wrote:
> > I think disabling it is going to be the overwhelmingly niche case, but it's good that you can still do it and be satisfied if that's what you want to do.
> >
> > Can you link me to a single line of your own code where you've used this pattern?
>
> What, having ref mutate its argument and that being its entire purpose for being there? How about almost every time I have ever used ref ever. e.g. this is exactly what some functions in dxml do, and it would be just plain wrong to pass an rvalue - e.g. the overload of parseDOM that operates on an EntityRange:
>
> http://jmdavisprog.com/docs/dxml/0.3.2/dxml_dom.html#.parseDOM
>
> or a function like writeTaggedText
>
> http://jmdavisprog.com/docs/dxml/0.3.2/dxml_writer.html#.writeTaggedText
>
> which writes to the XMLWriter that it's given. Having either of those accept rvalues would just cause bugs.
>
> In the few cases where it seems appropriate to accept both rvalues and lvalues without copying either, I can almost always use auto ref, because I rarely use classes. I agree that being able to accept lvalues without copying them would be useful for non-templated functions as well, but I do not at all agree that that is how ref should work in general. IMHO, as with auto ref, it really should be marked with an attribute to indicate that it is purposefully accepting rvalues. I would not consider it a niche case at all for ref to not accept rvalues.

Okay, this is such the kind of niche use case I'm describing, that the documentation goes out of it's way to explain to the user the reason why 'ref' may be used:

"Note that default initialization, copying, and assignment are disabled for XMLWriter. This is because XMLWriter is essentially a reference type, but in many cases, it doesn't need to be passed around, and forcing it to be allocated on the heap in order to be a reference type seemed like an unnecessary heap allocation. So, it's a struct with default initialization, copying, and assignment disabled so that like a reference type, it will not be copied or overwritten. Code that needs to pass it around can pass it by ref or use the constructor to explicitly allocate it on the heap and then pass around the resulting pointer."

This demonstrates to me that the explicit application of @disable in this case (and cases like it) is absolutely acceptable. It's more than acceptable; it's entirely preferable to make the API just as explicit as the documentation.

Secondly, given this API, I can't imagine a valid construction where
the user accidentally provided an rvalue, and the program still built
and runs (with bugs).
What conceivable bug is the rule preventing here? If the user
accidentally provided an rvalue and the function did nothing... what
would the program even do? What's the 'bug'? How would the programmer
call 'save' to write the xml to disk or whatever when the variable was
never declared anywhere. The bug prevents itself in other ways.

This link is self-defeating. Can you find more?

> > > Either way, what this DIP proposes means that existing uses of ref will need to be changed - and in an extremely verbose way - in order to get back their current behavior.
> >
> > Their currently behaviour is mostly wrong, and the exact thing I'm trying to fix though.
>
> Why would the current behviour be mostly wrong? If the purpose of using ref is to accept the current value of a variable and mutate it such that the argument is mutated,

That's not "the purpose of using ref".
That is _one use_ of ref, and in my world, it's comparatively rare.

> then the current behaviour is _exactly_ what's
> desirable, and it prevents bugs.

The currently behaviour is absolutely not desired by every other practical use of 'ref'.

I'd still like to know an example of the kinds of bugs it's preventing that wouldn't be prevented implicitly in other ways?
July 20, 2018
On Friday, 20 July 2018 at 16:39:46 UTC, Dukc wrote:
> On Friday, 20 July 2018 at 09:39:47 UTC, Nicholas Wilson wrote:
>>> appending something (like .byRef or byRef!long, the latter making an implicit type conversion)
>>
>> That can't work: either it returns an expired stack temporary (*very* bad), or allocates with no way to deallocate (bad).
>
> How so? It could be made it act exactly as if the temporary was made just before the function call, meaning the lifetime would end at the end of current scope.
>

... which is exactly what this DIP proposes ...

> Of course, this required compiler magic. A library solution would have exactly the limits you said.

... and why its a DIP, and not a phobos PR.
July 20, 2018
On Friday, July 20, 2018 14:35:57 Manu via Digitalmars-d wrote:
> > > Their currently behaviour is mostly wrong, and the exact thing I'm trying to fix though.
> >
> > Why would the current behviour be mostly wrong? If the purpose of using ref is to accept the current value of a variable and mutate it such that the argument is mutated,
>
> That's not "the purpose of using ref".
> That is _one use_ of ref, and in my world, it's comparatively rare.

Comparatively rare? It's exactly what most functions using output ranges need to do. For many output ranges, if the function copies the range instead of copying, then you have a serious bug, because then the data that gets put to the output range inside the function does not affect the function argument - or worse, if it's a pseudo-reference type, it can partially mutate the argument, putting it in an invalid state. So, unless a function that accepts an output range is supposed to be taking ownership of it, the function needs to mark the output range parameter with ref, and it would be a bug for it to accept an rvalue.

If anyone screws up and does something like return an output range from a function without that function returning ref and then passes it to a function that is intended to operate on an output range, they would currently get an error, whereas with this DIP, they would instead get subtle bugs with their severity depending on what type of ouput range it was and what the code was doing with it.

And yes, in most cases, programmers would not pass something other than a variable to a function that is clearly designed to take its argument by ref and mutate it, but with the heavy use of properties in a lot of D code, it can be trivial to operate on something that looks like an lvalue but isn't. Right now, such code gets caught, because it's an error to pass an rvalue to a ref parameter, but with this DIP, that would no longer be true. The situation gets even worse when you consider code maintenance issues like when someone initially has something be a variable and then later changes it to a property function and doesn't make it return by ref (and property functions usually don't return by ref). With this DIP, there could easily be silent code breakage as a result, whereas such breakage would not be silent with the current semantics for ref.

Honestly, I don't see how accepting rvalues by ref is anything but error-prone for any case where the ref is there because the argument is supposed to be mutated and then be used after the function call. And while you may very well want to use ref simply to avoid copying - and may do that frequently right now in spite of that being a pain with rvalues - I have exactly the opposite experience that you seem to with regards to how ref is used. In my experience, using ref to avoid copies is rare, and using it to have the argument be mutated is what's common.

So, while I sympathize with you about making it more user-friendly to pass rvalues by ref in the case where you just want to avoid copies, I really think that it merits its own attribute rather than messing up ref for what I would have considered the common case.

- Jonathan M Davis

July 21, 2018
On Friday, 20 July 2018 at 23:35:46 UTC, Jonathan M Davis wrote:
> On Friday, July 20, 2018 14:35:57 Manu via Digitalmars-d wrote:
> Comparatively rare? It's exactly what most functions using output ranges need to do. For many output ranges, if the function copies the range instead of copying, then you have a serious bug, because then the data that gets put to the output range inside the function does not affect the function argument - or worse, if it's a pseudo-reference type, it can partially mutate the argument, putting it in an invalid state. So, unless a function that accepts an output range is supposed to be taking ownership of it, the function needs to mark the output range parameter with ref, and it would be a bug for it to accept an rvalue.
>
> If anyone screws up and does something like return an output range from a function without that function returning ref and then passes it to a function that is intended to operate on an output range, they would currently get an error, whereas with this DIP, they would instead get subtle bugs with their severity depending on what type of ouput range it was and what the code was doing with it.

This is an interesting case not completely covered by Manu's prior reasoning (that you can't muck it up because you need to use it after e.g. appender). But for the case that you would not need it later either you already have it as a lvalue (e.g. from a function parameter) or its some kind of singleton (e.g. nullSink or put's to a global variable).

> And yes, in most cases, programmers would not pass something other than a variable to a function that is clearly designed to take its argument by ref and mutate it, but with the heavy use of properties in a lot of D code, it can be trivial to operate on something that looks like an lvalue but isn't. Right now, such code gets caught, because it's an error to pass an rvalue to a ref parameter, but with this DIP, that would no longer be true. The situation gets even worse when you consider code maintenance issues like when someone initially has something be a variable and then later changes it to a property function and doesn't make it return by ref (and property functions usually don't return by ref). With this DIP, there could easily be silent code breakage as a result, whereas such breakage would not be silent with the current semantics for ref.
>
> Honestly, I don't see how accepting rvalues by ref is anything but error-prone for any case where the ref is there because the argument is supposed to be mutated and then be used after the function call. And while you may very well want to use ref simply to avoid copying - and may do that frequently right now in spite of that being a pain with rvalues - I have exactly the opposite experience that you seem to with regards to how ref is used. In my experience, using ref to avoid copies is rare, and using it to have the argument be mutated is what's common.

So this problem is restricted output range @properties that somehow don't return by ref, aren't intended to be used after having data written to them _and_ aren't singleton like?

July 20, 2018
On Saturday, July 21, 2018 00:10:09 Nicholas Wilson via Digitalmars-d wrote:
> So this problem is restricted output range @properties that somehow don't return by ref, aren't intended to be used after having data written to them _and_ aren't singleton like?

It's a problem for any function that accepts by ref with the intention of mutating the argument rather than to avoid copying, and something other than an lvalue is passed. Output ranges are just a particularly common use case. And property functions are just an easy way to think that you're passing an lvalue when you're not. For that matter, it's an easy mistake to make with lots of functions in D, because parens are unnecessary when there are no function arguments or when the only function argument is passed via UFCS. And of course, if someone doesn't read the docs carefully enough (which happens far too often), there are plenty of problems that someone could run into where they pass an rvalue to a function designed to operate on an lvalue. Right now, ref catches any and all mistakes where someone tries to pass an rvalue by ref, but if ref accepts rvalues, it won't catch any of them.

- Jonathan M Davis

July 20, 2018
On Fri, 20 Jul 2018 at 17:33, Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Saturday, July 21, 2018 00:10:09 Nicholas Wilson via Digitalmars-d wrote:
> > So this problem is restricted output range @properties that somehow don't return by ref, aren't intended to be used after having data written to them _and_ aren't singleton like?
>
> It's a problem for any function that accepts by ref with the intention of mutating the argument rather than to avoid copying, and something other than an lvalue is passed. Output ranges are just a particularly common use case.

They are. But it's particularly hard to imagine a non-contrived
situation where the compile error would alert the user to a bug that
they wouldn't naturally discover in some other way (likely, the very
next line).
Such cases are so few, that I can't find that they balance the broad
value in this DIP.
Further, if you are writing code where this is at unusually high-risk
for some reason, use the @disable technique.

> And property functions are just an easy way to think that you're passing an lvalue when you're not. For that matter, it's an easy mistake to make with lots of functions in D, because parens are unnecessary when there are no function arguments or when the only function argument is passed via UFCS. And of course, if someone doesn't read the docs carefully enough (which happens far too often), there are plenty of problems that someone could run into where they pass an rvalue to a function designed to operate on an lvalue.

This is just saying stuff.

I think you're describing now a bug where a function returns an
lvalue, but it was meant to return an rvalue.
The user then uses that buggy function in conjunction with ref, and
the output is lost (I still can't quite imagine the scenario, given
the restrictive criteria that Nicholas highlighted earlier).

1. The function that's buggy is not the function that involves ref.
2. You're expecting the function that involves ref to catch a bug in
an unrelated function via the existing ref semantics. I think you're
exceeding reason.
3. While it _might_ catch your issue, the buggy function is still
buggy, and may manifest the bug in any number of other far more likely
ways:
   - it may not be involved in a call to ref (and therefore bug not
caught that way)
   - the accidentally-by-val return value may be accessed directly, or
via a method

You're describing an unrelated bug, and it's not ref's job to catch it. At best, it's just a matter of luck that ref's existing semantic was able to prevent a bug like you describe, but I think such a responsibility it well outside of the design's charter.

> Right now, ref catches any and all mistakes where someone tries to
> pass an rvalue by ref, but if ref accepts rvalues, it won't catch any of
> them.

I'm still yet to identify a single realistic scenario where such a
mistake is likely.
I'm sure it's possible that some situations exist, but they're
evidently rare, and if you do identify such a particularly high risk
situation, use the @disable technique. It makes the at-risk construct
beautifully explicit in its intended design.

Adding an @rvalue attribute eliminates the primary value of this DIP.
Situation with meta would become substantially worse.
Meta case-handling which falls back to text-mixins (necessary when
attributes outside the type system appear) is the very worst kind of
'metacrobatics'.
July 20, 2018
On Fri, 20 Jul 2018 at 18:02, Manu <turkeyman@gmail.com> wrote:
>
> [...]
>
> I think you're describing now a bug where a function returns an lvalue, but it was meant to return an rvalue.

Sorry, wrong way around! I meant to say:
I think you're describing now a bug where a function returns an
rvalue, but it was meant to return an lvalue (ie, a ref).
July 20, 2018
On Friday, July 20, 2018 18:04:26 Manu via Digitalmars-d wrote:
> On Fri, 20 Jul 2018 at 18:02, Manu <turkeyman@gmail.com> wrote:
> > [...]
> >
> > I think you're describing now a bug where a function returns an lvalue, but it was meant to return an rvalue.
>
> Sorry, wrong way around! I meant to say:
> I think you're describing now a bug where a function returns an
> rvalue, but it was meant to return an lvalue (ie, a ref).

The function returning an rvalue isn't necessarily a bug. It's the fact that it was then used in conjunction with a function that accepts its argument by ref in order to mutate it. If a function accepts its argument by ref in order to mutate it, then it's a but for it to be given an rvalue regardless of whether the rvalue comes from. It's just that some cases are more obviously wrong than others (e.g. passing foo + bad might be obviously wrong, whereas passing foo.bar may be wrong but look correct).

- Jonathan M Davis

July 20, 2018
On Fri, 20 Jul 2018 at 18:17, Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Friday, July 20, 2018 18:04:26 Manu via Digitalmars-d wrote:
> > On Fri, 20 Jul 2018 at 18:02, Manu <turkeyman@gmail.com> wrote:
> > > [...]
> > >
> > > I think you're describing now a bug where a function returns an lvalue, but it was meant to return an rvalue.
> >
> > Sorry, wrong way around! I meant to say:
> > I think you're describing now a bug where a function returns an
> > rvalue, but it was meant to return an lvalue (ie, a ref).
>
> The function returning an rvalue isn't necessarily a bug. It's the fact that it was then used in conjunction with a function that accepts its argument by ref in order to mutate it.

It sounded like you were specifically describing something like a
property that was used to access an output range that was a member of
some other object.
Returning by-val was the 'bug' that made a copy of the output range
instead of returning a reference to it as was intended.

Was that not the scenario you were presenting?

> If a function accepts its argument by ref in
> order to mutate it, then it's a bug for it to be given an rvalue regardless
> of whether the rvalue comes from.

I don't think that's true though. It's just as much a 'bug' to ignore
such output as it is to ignore the return value of a function. (I
shamelessly ignore return values all the time!)
If the function output is not used, then it's not used.

In most cases of course, the result is used, and the programmer would notice that they have no variable to refer to when they attempt to refer to the result on the very next line.

> It's just that some cases are more
> obviously wrong than others (e.g. passing foo + bad might be obviously
> wrong, whereas passing foo.bar may be wrong but look correct).

Right, and I think in this example, foo.bar is to be interpreted such
that bar is a property, and it might accidentally return by-val
instead of by-ref?
In that case, 'bar' is the bug. What could it possibly mean for a
property to return an output range by-value?

It's not ref's charter to catch such a bug, and it's much more likely
to manifest in any number of other scenarios rather than an
interaction with ref.
I can't see how this is a compelling reason to dismiss all the
advantages of this DIP in favour of keeping the current semantic.
July 20, 2018
On Friday, July 20, 2018 19:13:00 Manu via Digitalmars-d wrote:
> I can't see how this is a compelling reason to dismiss all the advantages of this DIP in favour of keeping the current semantic.

Honestly, I think we're just coming from points of view that are too different. IMHO, the core use case for ref is for a function to mutate an argument and have that result progagate to the argument, and having ref accept rvalues is not only counter to that, but it risks bugs that are currently impossible. I think that having a way to accept rvalues by ref for functions where you want to avoid copying is potentially useful but not particularly critical. On the other hand, you seem to see little or no value in having parameters that are intended to only accept lvalues and see great value in having functions that accept rvalues by ref in order to avoid copying.

- Jonathan M Davis