The liabilities of binding rvalues to ref
May 05, 2013
Here are the issues that need to be addressed by any solution for reconciling rvalues with passing by ref into functions. This post does not argue for any particular solution or approach; it lays out the issues against which any approach must be judged.

1. The LRL (Lvalue-Rvalue-Lvalue) problem.

This has long been cited as an argument in defining C++'s references. The crux of the issue is that the caller passes an lvalue and the callee receives an lvalue, but in the middle there's an rvalue created by an implicit conversion. Consider:

void fix(ref double x) { if (isnan(x)) x = 0; }
...
float a;
...
fix(a);

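If rvalues bind indiscriminately to ref, then the call is legal because of the implicit conversion float->double. However, fix then modifies only the temporary double produced by the conversion, and a is left unchanged.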

A possible solution is to disallow binding if the initial value bound is an lvalue.
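
To make the LRL problem concrete, here is a minimal sketch of how the call behaves under the rules as they stand. It assumes default compiler flags (in particular, no -preview=rvaluerefparam) and uses isNaN from std.math in place of isnan. The helper fixViaTemporary is hypothetical; it spells out by hand what an implicit rvalue binding would do.

```d
import std.math : isNaN;

void fix(ref double x) { if (isNaN(x)) x = 0; }

// With default flags the converted float is an rvalue and cannot bind to ref,
// so the direct call is rejected at compile time.
static assert(!__traits(compiles, { float a; fix(a); }));

// Hypothetical helper: does by hand what indiscriminate binding would do
// implicitly. Returns true if the caller's float was actually repaired.
bool fixViaTemporary(ref float a)
{
    double tmp = a;   // the implicit float->double conversion, made explicit
    fix(tmp);         // fix repairs the temporary...
    return !isNaN(a); // ...but the original float is never written to
}

void main()
{
    float a;          // floats default-initialize to NaN in D
    assert(!fixViaTemporary(a));
    assert(isNaN(a));
}
```

The caller's variable stays NaN, which is exactly the silent no-op that an indiscriminate binding rule would allow.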

2. Code evolution.

Jonathan mentioned this too. The problem here is that as code evolves, meaningful code doing real work becomes silently useless code that patently does nothing. Consider:

class Collection(T) {
  ref T opIndex(size_t i) { ... }
  ...
}

void fix(ref double x) { if (isnan(x)) x = 0; }

void fixAll(Collection!double c) {
  foreach (i; 0 .. c.length) {
    fix(c[i]);
  }
}

As the design evolves, Collection's opIndex may change to return a T instead of ref T (e.g. certain implementations of sparse vectors). When that happens, the caller code will continue to compile and run. However, it won't do anything interesting: fix will always be called against a temporary plucked from the collection.
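
For contrast, here is a sketch of what happens today, when rvalues do not bind to ref. It assumes default compiler flags (no -preview=rvaluerefparam). RefCollection and ValueCollection are hypothetical stand-ins for the collection before and after the change, and isNaN from std.math replaces isnan.

```d
import std.math : isNaN;

void fix(ref double x) { if (isNaN(x)) x = 0; }

// Hypothetical stand-ins for the collection before and after the change.
struct RefCollection
{
    double[] data;
    ref double opIndex(size_t i) { return data[i]; }
}

struct ValueCollection
{
    double[] data;
    double opIndex(size_t i) { return data[i]; }
}

// Before: c[0] is an lvalue, so fix really repairs the stored element.
bool fixesElement()
{
    auto c = RefCollection([double.nan]);
    fix(c[0]);
    return !isNaN(c.data[0]);
}

// After: c[0] is an rvalue. With default flags the call no longer compiles,
// which is how the change in return style gets noticed.
static assert(!__traits(compiles, { ValueCollection c; fix(c[0]); }));

void main()
{
    assert(fixesElement());
}
```

Under a rule that binds rvalues to ref, the second case would compile and leave the collection untouched.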

Changing return types from ref T to T or back and expecting no ill effects (aside from fixing compile-time errors) is a frequent operation in C++ projects I'm involved in. Doing worse than that would arguably be a language design regression.

Note that in function call chains fun(gun(hun())), which are common (written as hun.gun.fun etc.) in the increasingly popular pipeline style of defining processing, one function changing return style poisons the well for everybody down the pipeline. That may lead to pipelines that have only partial effect.

=======

There may be other important patterns to address at the core, please chime in. I consider (1) above easy to tackle, which leaves us with at least (2). My opinion is that any proposal for binding rvalues to ref must offer a compelling story about these patterns.


Andrei
May 05, 2013
One solution proffered was making rvalues bindable only to const ref. Andrei's position was that this prevented many important use cases.

The trouble was that const is transitive, and yet what was desired here was so-called "head const". D doesn't have a concept of head const.

(C++ const is always "head const".)
May 05, 2013
On Sunday, May 05, 2013 01:49:42 Andrei Alexandrescu wrote:
> There may be other important patterns to address at the core, please chime in. I consider (1) above easy to tackle, which leaves us with at least (2). My opinion is that any proposal for binding rvalues to ref must offer a compelling story about these patterns.

Another case is when you want to distinguish between lvalues and rvalues.  In fact, IIRC this came up in Ali's dconf talk. He had an opAssign which swapped guts with its argument when it was an rvalue (since it was known to be a temporary) and did a more normal assignment when it was an lvalue. You might still be able to do that if ref accepts rvalues (because the non-ref overload would be preferred in the rvalue case, and the ref overload would be preferred in the lvalue case), but I suspect that that would be incredibly error-prone - especially when there are multiple arguments to the function.

So, whatever solution we go with needs to allow us to reasonably overload on refness when we want to while still being able to have functions which accept both lvalues and rvalues by reference.
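
A rough sketch of the kind of overloading on refness described above, as it works under the current rules. The Buffer type, the makeBuffer helper, and the lastAssignStole flag are hypothetical and exist only to show which overload runs; this is not the code from the talk.

```d
import std.algorithm.mutation : swap;

struct Buffer
{
    int[] data;
    bool lastAssignStole; // hypothetical flag: records which overload ran

    // Chosen for lvalues: the source must stay intact, so copy.
    void opAssign(ref Buffer rhs)
    {
        data = rhs.data.dup;
        lastAssignStole = false;
    }

    // Chosen for rvalues: the source is a temporary, so swap guts with it.
    void opAssign(Buffer rhs)
    {
        swap(data, rhs.data);
        lastAssignStole = true;
    }
}

Buffer makeBuffer() { return Buffer([1, 2, 3]); }

void main()
{
    Buffer dst;
    auto src = Buffer([4, 5]);

    dst = src;          // lvalue: ref overload, deep copy
    assert(!dst.lastAssignStole);
    assert(dst.data == [4, 5] && dst.data.ptr !is src.data.ptr);

    dst = makeBuffer(); // rvalue: by-value overload, swap
    assert(dst.lastAssignStole);
    assert(dst.data == [1, 2, 3]);
}
```

With one parameter the selection is easy to follow. With several, the number of ref/non-ref combinations doubles per parameter, which is where the error-proneness comes from.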

- Jonathan M Davis
May 05, 2013
Is there any intention to address the issue of the lvalue-ness of "this"? In C++, *this is always an lvalue, even if the member function was called with an rvalue, leading to situations like this:

struct Number
{
    void fix() { if (isnan(x)) x = 0; }
    double x;
}

class Collection(T) {
  ref T opIndex(size_t i) { ... }
  ...
}

void fixAll(Collection!Number c) {
  foreach (i; 0 .. c.length) {
    c[i].fix();
  }
}


Here, if Collection changes to return by non-ref then the fix() call is still valid, silently doing nothing of value. Analogous code in C++ is allowed as well.
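
A minimal sketch of the by-value case in D, assuming a hypothetical ValueCollection whose opIndex has already been changed to return by value, and isNaN from std.math in place of isnan:

```d
import std.math : isNaN;

struct Number
{
    double x;
    void fix() { if (isNaN(x)) x = 0; }
}

// Hypothetical collection that has already been changed to return by value.
struct ValueCollection
{
    Number[] data;
    Number opIndex(size_t i) { return data[i]; }
}

// Returns true if the stored element was repaired.
bool fixReachedElement()
{
    auto c = ValueCollection([Number(double.nan)]);
    c[0].fix();                 // compiles: fix runs on a temporary copy
    return !isNaN(c.data[0].x); // the stored element is still NaN
}

void main()
{
    assert(!fixReachedElement());
}
```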

Do we intend to fix this as well? I suspect there are use cases where such calls are useful, but I can't think of any right now.
May 05, 2013
On 05/05/2013 07:49 AM, Andrei Alexandrescu wrote:
> ...
>
> There may be other important patterns to address at the core, please
> chime in. I consider (1) above easy to tackle, which leaves us with at
> least (2). My opinion is that any proposal for binding rvalues to ref
> must offer a compelling story about these patterns.
> ...

Any solution that allows both lvalues and rvalues to bind to the same reference will have problem (2), and fixing up problem (1) introduces somewhat arbitrary rules. Probably we do not want to have to deal with them by default.

A possibility would be to fix (2) in a similar way to the proposed solution for (1): Only allow rvalues to bind to 'ref' if they cannot be turned into lvalues by changing the signature of a different function. I don't like this solution very much.

I still think auto ref should be extended to work for non-templated functions. The semantics should be the same (i.e. behave as if there were two copies that are eagerly semantically analyzed, but only generate code for one) where possible. It would be an error to have different behaviour of the two "copies". This would mean that in non-templated functions, it should be an error to rely on whether or not the passed argument was an lvalue; i.e., (auto) ref return of an auto ref argument (at least in @safe code) and __traits(isRef, autoRefArgument) should be disallowed. This would have the effect of making it illegal to return references to temporaries from functions (at least in @safe code), and hence the lifetime of rvalues would not have to be changed.

Both (1) and (2) would be close to non-issues with this solution.
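
For reference, a small sketch of the existing template-only auto ref, showing the kind of lvalue-dependent behaviour that would have to be rejected in the non-templated form. The function name boundByRef is made up for illustration.

```d
// Today's template-only form: one instantiation per ref-ness of the argument.
bool boundByRef(T)(auto ref T x)
{
    // Exactly the kind of query the non-templated form would disallow.
    return __traits(isRef, x);
}

void main()
{
    int a = 1;
    assert(boundByRef(a));   // lvalue: instantiated with a ref parameter
    assert(!boundByRef(42)); // rvalue: instantiated with a by-value parameter
}
```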
May 05, 2013
On 05/05/2013 01:55 PM, Timon Gehr wrote:
> On 05/05/2013 07:49 AM, Andrei Alexandrescu wrote:
>> ...
>>
>> There may be other important patterns to address at the core, please
>> chime in. I consider (1) above easy to tackle, which leaves us with at
>> least (2). My opinion is that any proposal for binding rvalues to ref
>> must offer a compelling story about these patterns.
>> ...
>
> Any solution that allows both lvalues and rvalues to bind to the same
> reference will have problem (2), and fixing up problem (1) introduces
> somewhat arbitrary rules. Probably we do not want to have to deal with
> them by default.
>
> A possibility would be to fix (2) in a similar way to the proposed
> solution for (1): Only allow rvalues to bind to 'ref' if they cannot be
> turned into lvalues by changing the signature of a different function. I
> don't like this solution very much.
>
> I still think auto ref should be extended to work for non-templated
> functions. The semantics should be the same (i.e. behave as if there
> were two copies that are eagerly semantically analyzed, but only
> generate code for one) where possible. It would be an error to have
> different behaviour of the two "copies". This would mean that in
> non-templated functions, it should be an error to rely on whether or not
> the passed argument was an lvalue. i.e. (auto) ref return of an auto ref
> argument (at least in @safe code) and __traits(isRef,autoRefArgument)
> should be disallowed. This would have the effect of making it illegal to
> return references to temporaries from functions (at least in @safe
> code), and hence the life time of rvalues would not have to be changed.
>
> Both (1) and (2) would be close to non-issues with this solution.

It seems to make sense that template function instantiations should be split into two copies per auto ref argument only if it is necessary to support differing semantics for lvalues and rvalues.

Peter Alexander brings up a good point regarding the implicit this reference for structs. Probably it should be passed by auto ref.


May 05, 2013
On Sunday, 5 May 2013 at 11:55:28 UTC, Timon Gehr wrote:
> I still think auto ref should be extended to work for non-templated functions. The semantics should be the same (i.e. behave as if there were two copies that are eagerly semantically analyzed, but only generate code for one) where possible.

This is not the case currently for template auto ref functions. For example, the two "versions" of the auto ref function do not share local static variables, nor do they have the same function address. They are literally two (or four, or eight, ...) separate functions. I really dislike how they have been implemented.
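
A small sketch of the local static variable point, using a made-up callCount template. The lvalue and rvalue instantiations each get their own counter:

```d
int callCount(T)(auto ref T x)
{
    static int calls; // one copy per instantiation, not one per source function
    return ++calls;
}

void main()
{
    int a;
    assert(callCount(a) == 1); // ref instantiation, first call
    assert(callCount(1) == 1); // by-value instantiation, separate counter
    assert(callCount(a) == 2); // ref instantiation again
    assert(callCount(2) == 2); // by-value instantiation again
}
```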

May 05, 2013
On 05/05/2013 01:55 PM, Timon Gehr wrote:
> ...It would be an error to have
> different behaviour of the two "copies". This would mean that in
> non-templated functions, it should be an error to rely on whether or not
> the passed argument was an lvalue. i.e. (auto) ref return of an auto ref
> argument (at least in @safe code) and __traits(isRef,autoRefArgument)
> should be disallowed. ...

It also would need to be an error to have local static variables.
May 05, 2013
On 05/05/2013 02:11 PM, Peter Alexander wrote:
> On Sunday, 5 May 2013 at 11:55:28 UTC, Timon Gehr wrote:
>> I still think auto ref should be extended to work for non-templated
>> functions. The semantics should be the same (i.e. behave as if there
>> were two copies that are eagerly semantically analyzed, but only
>> generate code for one) where possible.
>
> This is not the case currently for template auto ref functions. For
> example, the two "versions" of the auto ref function do not share local
> static variables, nor do they have the same function address.

Very good point about the local static variables. This would be a wart of my proposed solution, as they'd need to be banned for auto ref functions.

> They are literally two (or four, or eight, ...) separate functions. I really
> dislike how they have been implemented.
>

Me too, but it makes sense to have two functions in some cases.
May 05, 2013
On Sunday, 5 May 2013 at 05:49:42 UTC, Andrei Alexandrescu wrote:
> There may be other important patterns to address at the core, please chime in. I consider (1) above easy to tackle, which leaves us with at least (2). My opinion is that any proposal for binding rvalues to ref must offer a compelling story about these patterns.
>

Yes! With optional parentheses, it is very unclear whether we are accessing an lvalue or an rvalue, and so whether updating its content has any effect.

A good example of that is the captures property from std.regex. The property is an rvalue, and would look like a field that resets itself all the time if bound to an lvalue.