November 08, 2012
On Thursday, 8 November 2012 at 03:07:00 UTC, Jonathan M Davis wrote:
> Okay. Here are more links to Andrei discussing the problem:
>
> http://forum.dlang.org/post/4F83DBE5.20800@erdani.com
> http://www.mail-archive.com/digitalmars-d@puremagic.com/msg44070.html
> http://www.mail-archive.com/digitalmars-d@puremagic.com/msg43769.html
> http://forum.dlang.org/post/hg62rq$2c2n$1@digitalmars.com

Thank you so much for these links, Jonathan.

So fortunately the special role of _const_ ref parameters has been acknowledged.

From the 2nd link:
> The problem with binding rvalues to const ref is that
> once that is in place you have no way to distinguish an
> rvalue from a const ref on the callee site. If you do want
> to distinguish, you must rely on complicated conversion
> priorities. For example, consider:
>
> void foo(ref const Widget);
> void foo(Widget);
>
> You'd sometimes want to do that because you want to exploit
> an rvalue by e.g. moving its state instead of copying it.
> However, if rvalues become convertible to ref const, then
> they are motivated to go either way. A rule could be put in
> place that gives priority to the second declaration. However,
> things quickly get complicated in the presence of other
> applicable rules, multiple parameters etc. Essentially it
> was impossible for C++ to go this way and that's how rvalue
> references were born.
>
> For D I want to avoid all that aggravation and have a simple
> rule: rvalues don't bind to references to const. If you don't
> care, use auto ref. This is a simple rule that works
> promisingly well in various forwarding scenarios.

This is exactly what we propose (to be able to avoid pointer/reference indirection for rvalues in some absolutely performance-critical cases). Unlike Andrei though, I don't find the required overloading rules complicated at all, quite the contrary in fact.
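For comparison, here is a minimal C++ sketch of the overload pair that C++ ended up with instead of the 'const ref plus by-value' pair from the quote. It is only an illustration: Widget and its single member are hypothetical, and the returned strings exist purely to make the chosen overload visible.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the Widget of the quoted example.
struct Widget {
    std::string state;
};

// Selected for lvalues: the caller keeps its object, so we can only copy.
std::string foo(const Widget& w) {
    std::string copy = w.state;
    return "copied:" + copy;
}

// Selected for mutable rvalues: nobody else can observe the argument,
// so its state may be moved out instead of copied.
std::string foo(Widget&& w) {
    std::string stolen = std::move(w.state);
    return "moved:" + stolen;
}

std::string call_with_lvalue() {
    Widget w{"abc"};
    return foo(w);              // binds to foo(const Widget&)
}

std::string call_with_rvalue() {
    return foo(Widget{"abc"});  // binds to foo(Widget&&)
}
```

Overload resolution picks the second function for rvalues without any extra priority rule, which is roughly the behaviour the proposed D rule (prefer pass-by-value for rvalues) is meant to give.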

From the 3rd link:
> Binding rvalues to const references was probably the single
> most hurtful design decisions for C++. I don't have time to
> explain now, but in short I think all of the problems that
> were addressed by rvalue references, and most of the
> aggravation within the design of rvalue references, are owed
> by that one particular conversion.

Here's where I totally disagree with Andrei. C++ rvalue references (T&&) aren't used to distinguish between lvalues and rvalues when expecting a _const_ reference (I have yet to see a use case for 'const T&&'). They are used for _mutable_ references and primarily to enforce efficient move semantics in C++, i.e., to move _mutable_ rvalue arguments (instead of copying them) and to get the effect of 'Named Return Value Optimization' when returning lvalues (by using std::move; the goal again is to avoid a redundant copy). D fortunately seems to implement move semantics out-of-the-box (at least now in v2.060) in both cases; see Rob T's posts and my replies in this thread.
Besides implementing move semantics, C++ with its rvalue refs also implicitly provides a way to distinguish between _mutable_ lvalue and rvalue references and so allows optimized implementations - that is something we'd also need in D, but that's what we've just covered with regard to the 2nd link.

So I still don't see a valid reason to preclude binding rvalues to const ref parameters.
November 08, 2012
11/7/2012 3:54 AM, Manu wrote:
> If the compiler started generating 2 copies of all my ref functions, I'd
> be rather unimpressed... bloat is already a problem in D. Perhaps this
> may be a handy feature, but I wouldn't call this a 'solution' to this issue.
> Also, what if the function is external (likely)... auto ref can't work
> if the function is external, an implicit temporary is required in that case.
>

What's wrong with going this route:

void blah(auto ref X stuff){
...lots of code...
}

is magically expanded to:

void blah(ref X stuff){
...that code..
}

and

void blah(X stuff){
	.blah(stuff); //now here stuff is L-value so use the ref version
}

Yeah, it looks _almost_ like a template now. But unlike with a template we can assume it's 2 overloads _always_. External function issue is then solved by treating it as exactly these 2 overloads (one trampoline, one real). Basically it becomes one-line declaration of 2 functions.

Given that temporaries are moved anyway the speed should be fine and there is as much bloat as you'd do by hand.

Also hopefully inliner can be counted on to do its thing in this simple case.
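
A rough C++ analogue of this trampoline, purely as a sketch: X, its member and the call counter are made up for illustration, and the rvalue overload takes X&& because in C++ a by-value overload next to X& would be ambiguous for lvalue arguments.

```cpp
#include <cassert>

struct X { int value; };   // hypothetical payload type

int real_calls = 0;        // counts entries into the one real implementation

// The real function: works on an lvalue through a reference.
int blah(X& stuff) {
    ++real_calls;
    stuff.value += 1;
    return stuff.value;
}

// Trampoline for rvalues. Inside the body 'stuff' is a named object,
// i.e. an lvalue, so this call resolves to blah(X&) above.
int blah(X&& stuff) {
    return blah(stuff);
}
```

Both blah(x) for a named x and blah(X{10}) end up in the same body, so only the one-line forwarder is duplicated.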



-- 
Dmitry Olshansky
November 08, 2012
11/7/2012 9:04 PM, martin wrote:
> On Wednesday, 7 November 2012 at 14:07:31 UTC, martin wrote:
>> C++:
>> void f(T& a) { // for lvalues
>>     this->resource = a.resource;
>>     a.resetResource();
>> }
>> void f(T&& a) { // for rvalues (moved)
>>     this->resource = a.resource;
>>     a.resetResource();
>> }
>>
>> D:
>> void f(ref T a) { // for lvalues
>>     this.resource = a.resource;
>>     a.resetResource();
>> }
>> void f(T a) { // rvalue argument is not copied, but moved
>>     this.resource = a.resource;
>>     a.resetResource();
>> }
>
> You could probably get away with a single-line overload, both in C++ and D:
>
> C++:
> void f(T& a) { // for lvalues
>      // convert a to mutable rvalue reference and
>      // invoke the main overload f(T&&)
>      f(std::move(a));
> }
>
> D:
> void f(T a) { // rvalue argument is not copied, but moved
>      // the original argument is now named a (an lvalue)
>      // invoke the main overload f(ref T)
>      f(a);
> }

Yup, and I'd like auto ref to actually do this r-value trampoline for me.

-- 
Dmitry Olshansky
November 08, 2012
On Thursday, 8 November 2012 at 18:28:44 UTC, Dmitry Olshansky wrote:
> What's wrong with going this route:
>
> void blah(auto ref X stuff){
> ...lots of code...
> }
>
> is magically expanded to:
>
> void blah(ref X stuff){
> ...that code..
> }
>
> and
>
> void blah(X stuff){
> 	.blah(stuff); //now here stuff is L-value so use the ref version
> }
>
> Yeah, it looks _almost_ like a template now. But unlike with a template we can assume it's 2 overloads _always_. External function issue is then solved by treating it as exactly these 2 overloads (one trampoline, one real). Basically it becomes one-line declaration of 2 functions.
>
> Given that temporaries are moved anyway the speed should be fine and there is as much bloat as you'd do by hand.
>
> Also hopefully inliner can be counted on to do its thing in this simple case.

That second overload for rvalues would be a shortcut to save the lvalue declarations at each call site - and it really doesn't matter if the compiler magically added the lvalue declarations before each call or if it magically added the rvalue overload (assuming all calls are inlined). But it would create a problem if there already was an explicit 'void blah(X)' overload in addition to 'void blah(auto ref X)' (not making much sense obviously, but this would be something the compiler needed to handle somehow).
What this 'auto ref' approach (both as currently implemented for templates and proposed here for non-templated functions) lacks is the vital distinction between const and mutable parameters.

For the much more common const ref parameters, I repeatedly tried to explain why I'm absolutely convinced that we don't need another keyword and that 'in/const ref' is sufficient, safe, logical and intuitive (coupled with the overload rule that pass-by-value (moving) is preferred for rvalues). Please prove me wrong.

For the less common mutable ref parameters, I also repeatedly tried to explain why I find it dangerous/unsafe to allow rvalues to be bound to mutable ref parameters. But if there are enough people wanting that, I'd have no problem with an 'auto ref' approach for it (only for mutable parameters!). That may actually be a good compromise, what do you guys think? :)

'auto ref T' for templates expands to 'ref T' (lvalues) and 'T' (rvalues), duplicating the whole function and providing best performance - no pointer/reference indirection for rvalues in contrast to 'auto ref T' (proposed above) for non-templates, otherwise the concept would be exactly the same. But it's only for mutable parameters.
Such a templated option may also be worthwhile for const parameters though (expanding to 'const ref T' and 'const T'), so maybe something like the (ambiguous) 'in/const auto ref T' wouldn't actually be that bad (assuming there are only a few use cases, and only for templates! It'd still be 'in ref T' for non-templates).
November 08, 2012
That's cute, but it really feels like a hack.
All of a sudden the debugger doesn't work properly anymore, you need to
step-in twice to enter the function, and it's particularly inefficient in
debug builds (a point of great concern for my industry!).

Please just go with the compiler creating a temporary in the caller's space. Restrict it to const ref, or better, in ref (scope seems particularly important here).


On 8 November 2012 20:28, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

> 11/7/2012 3:54 AM, Manu wrote:
>
>  If the compiler started generating 2 copies of all my ref functions, I'd
>> be rather unimpressed... bloat is already a problem in D. Perhaps this
>> may be a handy feature, but I wouldn't call this a 'solution' to this
>> issue.
>> Also, what if the function is external (likely)... auto ref can't work
>> if the function is external, an implicit temporary is required in that
>> case.
>>
>>
> What's wrong with going this route:
>
> void blah(auto ref X stuff){
> ...lots of code...
> }
>
> is magically expanded to:
>
> void blah(ref X stuff){
> ...that code..
> }
>
> and
>
> void blah(X stuff){
>         .blah(stuff); //now here stuff is L-value so use the ref version
> }
>
> Yeah, it looks _almost_ like a template now. But unlike with a template we can assume it's 2 overloads _always_. External function issue is then solved by treating it as exactly these 2 overloads (one trampoline, one real). Basically it becomes one-line declaration of 2 functions.
>
> Given that temporaries are moved anyway the speed should be fine and there is as much bloat as you'd do by hand.
>
> Also hopefully inliner can be counted on to do its thing in this simple case.
>
>
>
> --
> Dmitry Olshansky
>


November 08, 2012
11/8/2012 11:30 PM, martin wrote:
> On Thursday, 8 November 2012 at 18:28:44 UTC, Dmitry Olshansky wrote:
>> What's wrong with going this route:
>>
>> void blah(auto ref X stuff){
>> ...lots of code...
>> }
>>
>> is magically expanded to:
>>
>> void blah(ref X stuff){
>> ...that code..
>> }
>>
>> and
>>
>> void blah(X stuff){
>>     .blah(stuff); //now here stuff is L-value so use the ref version
>> }
>>
>> Yeah, it looks _almost_ like a template now. But unlike with a
>> template we can assume it's 2 overloads _always_. External function
>> issue is then solved by treating it as exactly these 2 overloads (one
>> trampoline, one real). Basically it becomes one-line declaration of 2
>> functions.
>>
>> Given that temporaries are moved anyway the speed should be fine and
>> there is as much bloat as you'd do by hand.
>>
>> Also hopefully inliner can be counted on to do its thing in this
>> simple case.
>
> That second overload for rvalues would be a shortcut to save the lvalue
> declarations at each call site - and it really doesn't matter if the
> compiler magically added the lvalue declarations before each call or if
> it magically added the rvalue overload (assuming all calls are inlined).

The scope. It's all about getting the correct scope, destructor call and you know, the works. Preferably it can inject it inside temporary scope.

Anticipating bugs in the implementation of this feature, let me warn that rewriting this:
... code here ...
auto r = foo(SomeResource(x, y, ..)); //foo is auto ref
... code here ...

should not change semantics. E.g., imagine the resource is a lock: we'd better unlock it sooner, that is, call the destructor right after foo returns. So we need {} around the call. But this doesn't work, as it traps 'r':

{
auto someRef = SomeResource(x, y, ..);
auto r  = foo(someRef);
}

So it's rather something like this:

typeof(foo(...)) r = void;
{
auto someRef = SomeResource(x, y, ..);
r = foo(someRef); // should in fact construct in place not assign
}

I suspect this is hackable to be more clean inside of the compiler but not in terms of a re-write.
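
To make the intended ordering concrete, here is a small C++ sketch (C++ rather than D, and only an analogy: C++ already destroys temporaries at the end of the full expression). SomeResource, foo and the event log are hypothetical names used for illustration; the second function mirrors the rewrite with the extra block.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> event_log;   // records the order of events

// Hypothetical lock-like resource: logs acquisition and release.
struct SomeResource {
    SomeResource(int, int) { event_log.push_back("acquire"); }
    ~SomeResource() { event_log.push_back("release"); }
};

int foo(const SomeResource&) {
    event_log.push_back("foo");
    return 42;
}

// Passing a temporary: it is destroyed before the next statement runs.
std::vector<std::string> run_with_temporary() {
    event_log.clear();
    int r = foo(SomeResource(1, 2));
    event_log.push_back("after call");
    (void)r;
    return event_log;
}

// The hand-written rewrite: r lives outside the block, the resource inside.
std::vector<std::string> run_with_rewrite() {
    event_log.clear();
    int r = 0;
    {
        SomeResource someRef(1, 2);
        r = foo(someRef);
    }   // someRef is released here, before "after call"
    event_log.push_back("after call");
    (void)r;
    return event_log;
}
```

Both variants log acquire, foo, release, after call in that order, which is the property any compiler-side rewrite would have to preserve.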

> But it would create a problem if there already was an explicit 'void
> blah(X)' overload in addition to 'void blah(auto ref X)' (not making
> much sense obviously, but this would be something the compiler needed to
> handle somehow).

Aye. But even then there is an ambiguity if there is one version of function with ref T / T and one with auto ref T.

> What this 'auto ref' approach (both as currently implemented for
> templates and proposed here for non-templated functions) lacks is the
> vital distinction between const and mutable parameters.
>
> For the much more common const ref parameters, I repeatedly tried to
> explain why I'm absolutely convinced that we don't need another keyword
> and that 'in/const ref' is sufficient, safe, logical and intuitive
> (coupled with the overload rule that pass-by-value (moving) is preferred
> for rvalues). Please prove me wrong.

I'd rather restrict it to 'auto ref' thingie. Though 'in auto ref' sounds outright silly.
Simply put, const ref implies that the callee can save a pointer to it somewhere (it's an l-value). The same risk exists with 'auto ref', but at least there is an explicitly written 'disclaimer' by the author about accepting temporary stuff.

In the ideal world name 'auto ref' would be shorter, logical and more to the point but we have what we have.

>
> For the less common mutable ref parameters, I also repeatedly tried to
> explain why I find it dangerous/unsafe to allow rvalues to be bound to
> mutable ref parameters. But if there are enough people wanting that, I'd
> have no problem with an 'auto ref' approach for it (only for mutable
> parameters!). That may actually be a good compromise, what do you guys
> think? :)

I think that a function marked with auto ref is enough of an indication that the author is fine with passing mutable r-values to it and not seeing changes outside and related blah-blah. In most (all?) cases it means that the parameter is too big to be passed by copy, so it takes it by ref instead.
Also certain stuff can't be properly bitwise const because of C-calls and what not. Logical const is the correct term but in the D world it's simply mutable.

>
> 'auto ref T' for templates expands to 'ref T' (lvalues) and 'T'
> (rvalues), duplicating the whole function and providing best performance
> - no pointer/reference indirection for rvalues in contrast to 'auto ref
> T' (proposed above) for non-templates, otherwise the concept would be
> exactly the same. But it's only for mutable parameters.

I'd say that even for templates the speed argument is mostly defeated by the bloat argument. But that's probably only me.

> Such a templated option may also be worthwhile for const parameters though
> (expanding to 'const ref T' and 'const T'), so maybe something like the
> (ambiguous) 'in/const auto ref T' wouldn't actually be that bad
> (assuming there are only a few use cases, and only for templates! It'd
> still be 'in ref T' for non-templates).



-- 
Dmitry Olshansky
November 08, 2012
On Thursday, 8 November 2012 at 20:15:51 UTC, Dmitry Olshansky wrote:
> The scope. It's all about getting the correct scope, destructor call and you know, the works. Preferably it can inject it inside temporary scope.
>
> typeof(foo(...)) r = void;
> {
> auto someRef = SomeResource(x, y, ..);
> r = foo(someRef); // should in fact construct in place not assign
> }
>
> I suspect this is hackable to be more clean inside of the compiler but not in terms of a re-write.

Right, I forgot the scope for a moment. I'd illustrate the rvalue => (const) ref binding to a novice language user as follows:

T   const_foo(  in ref int x);
T mutable_foo(auto ref int x);

int bar() { return 5; }

T result;

result = const_foo(bar());
/* expanded to:
{
    immutable int tmp = bar(); // avoidable for literals
    result = const_foo(tmp);
} // destruction of tmp
*/

result = mutable_foo(bar());
/* expanded to:
{
    int tmp = bar();
    result = mutable_foo(tmp);
} // destruction of tmp
*/

> I'd rather restrict it to 'auto ref' thingie. Though 'in auto ref' sounds outright silly.
> Simply put, const ref implies that the callee can save a pointer to it somewhere (it's an l-value). The same risk exists with 'auto ref', but at least there is an explicitly written 'disclaimer' by the author about accepting temporary stuff.

'in ref' as opposed to 'const ref' should disallow this escaping issue we've already tackled in this thread, but I'm not sure if it is already/correctly implemented. Anyway, this issue also arises with (short-lived) local lvalues at the caller site:

foreach (i; 0 .. 10)
{
    int scopedLvalue = i + 2;
    foo(scopedLvalue); // passed by ref
} // scopedLvalue is gone

> In the ideal world name 'auto ref' would be shorter, logical and more to the point but we have what we have.

+1, but I don't have a better proposal anyway. ;)

> I think that a function marked with auto ref is enough of an indication that the author is fine with passing mutable r-values to it and not seeing changes outside and related blah-blah.

Agreed.

> Also certain stuff can't be properly bitwise const because of C-calls and what not. Logical const is the correct term but in the D world it's simply mutable.

As you know, I'd definitely allow rvalues to be bound to const ref parameters as an alternative (that would also be useful for a lot of existing code). People who generally don't use const (Timon Gehr? :)) are free to only use 'auto ref', I'm most likely only going to use 'in ref', and there will certainly be people using both. Sounds like a really good compromise to me.

> I'd say that even for templates the speed argument is mostly defeated by the bloat argument. But that's probably only me.

I haven't performed any benchmarks, but I tend to agree with you, especially since multiple 'auto ref' parameters lead to exponential bloating. I could definitely do without a special role for templates, which would further simplify things considerably. If performance is really that critical, an explicit pass-by-value (move) overload for rvalues ought to be enough flexibility imo.
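
As a rough illustration of that growth, here is a C++ sketch that uses forwarding references as a stand-in for templated 'auto ref' (an analogy only, not the D mechanism). The function combine and the bookkeeping set are hypothetical; each distinct key corresponds to a separate instantiation of the template.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <type_traits>

std::set<std::string> instantiations;   // one entry per distinct instantiation used

// A is deduced as int& for lvalue arguments and as int for rvalues,
// so every lvalue/rvalue combination yields its own instantiation.
template <class A, class B>
void combine(A&&, B&&) {
    std::string key;
    key += std::is_lvalue_reference<A>::value ? 'L' : 'R';
    key += std::is_lvalue_reference<B>::value ? 'L' : 'R';
    instantiations.insert(key);
}

std::size_t count_variants() {
    instantiations.clear();
    int x = 1, y = 2;
    combine(x, y);   // LL
    combine(x, 2);   // LR
    combine(1, y);   // RL
    combine(1, 2);   // RR
    return instantiations.size();
}
```

With two such parameters there are already four variants of the function body; each additional parameter doubles the count.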
November 08, 2012
On 11/08/2012 02:45 AM, martin wrote:
> On Wednesday, 7 November 2012 at 21:39:52 UTC, Timon Gehr wrote:
>> You can pass him an object that does not support operations you want
>> to preclude. He does not have to _know_, that your book is not changed
>> when he reads it. This is an implementation detail. In fact, you could
>> make the book save away his reading schedule without him noticing.
>
> I don't see where you want to go with this. Do you suggest creating
> tailored objects (book variants) for each function you're gonna pass it
> to just to satisfy perfect theoretical encapsulation?

No. The point is that the language should _support_ what you call 'perfect theoretical encapsulation'.

> So foo() shouldn't
> be able to change the author => change from inout author reference to
> const reference? bar() should only be allowed to read the book title,
> not the actual book contents => hide that string? ;) For the sake of
> simplicity, by using const we have the ability to at least control if
> the object can be modified or not.

It is not _just_ the object. Anyway, this is what I stated in my last post.

> So although my colleague doesn't have
> to _know_ that he can't modify my book in any way (or know that the book
> is modifiable in the first place), using const is a primitive but
> practical way for me to prevent him from doing so.
>

It also weakens encapsulation, which was the point.


> In the context of this rvalue => (const) ref discussion, const is useful
> due to a number of reasons.
>
> 1) All possible side effects of the function invocation are required to
> be directly visible by the caller. Some people may find that annoying,
> but I find it useful, and there's a single-line workaround (lvalue
> declaration) for the (in my opinion very rare) cases where a potential
> side-effect is either known not to occur or simply uninteresting
> (requiring exact knowledge about the function implementation, always,
> i.e., during the whole life-time of that code!).
>

Wrong. Not everything is a perfect value type. (and anyway, the code that actually will observe the change may be a few frames up the call stack.)

> 2) Say we pass a literal string (rvalue) to a const ref parameter. The
> location of the string in memory can then be freely chosen by the
> compiler, possibly in a static data segment of the binary (literal
> optimization - only one location for multiple occurrences). If the
> parameter was a mutable ref, the compiler should probably allocate a
> copy on the stack before calling the function, otherwise the literal may
> not be the same when accessed later on, potentially causing funny bugs.
>

Ambiguous to me and all the interpretations are either wrong or irrelevant.

> 3) Implicit type conversion isn't a problem. Say we pass an int rvalue
> to a mutable double ref parameter. The parameter will then be a
> reference to another rvalue (the int cast to a double) and altering it
> (the hidden double rvalue) may not really be what the coder intended.
> Afaik D doesn't support implicit casting for user-defined types, so that
> may not be a problem (for now at least).

Maybe you should stop trying to show that 'const' is sufficient for resolving those issues. The point is that it is not _necessary_. It is too strong.


November 08, 2012
On Thursday, November 08, 2012 21:49:58 Manu wrote:
> That's cute, but it really feels like a hack.
> All of a sudden the debugger doesn't work properly anymore, you need to
> step-in twice to enter the function, and it's particularly inefficient in
> debug builds (a point of great concern for my industry!).
> 
> Please just go with the compiler creating a temporary in the caller's space. Restrict it to const ref, or better, in ref (scope seems particularly important here).

I honestly wish that in didn't exist in the language. The fact that it's an alias for two different attributes is confusing, and people keep using it without realizing what they're getting into. If scope worked correctly, you'd only want it in specific circumstances, not in general. And since it doesn't work correctly aside from delegates, once it _does_ work correctly, it'll break code all over the place, because people keep using in, because they like how it corresponds with out or whatever.

- Jonathan M Davis
November 08, 2012
On Thursday, 8 November 2012 at 22:34:03 UTC, Timon Gehr wrote:
> Ambiguous to me and all the interpretations are either wrong or irrelevant.

My point is that it may affect performance. If there was no const, the compiler would need to allocate a dedicated copy of a literal whenever passing it to a mutable ref parameter, unless the optimizer worked so well that it could prove it's not going to be modified (which I'm sure you'd expect though :D).

> Maybe you should stop trying to show that 'const' is sufficient for resolving those issues. The point is that it is not _necessary_. It is too strong.

In that case it actually is - who cares if the read-only double rvalue the function is passed is the result of an implicit cast (and there's a reason for it being implicit) of the original argument (int rvalue)?

Anyway, I think we have moved on in this thread, so maybe you could contribute to trying to settle this rvalue => (const) ref issue once and for all by commenting on my latest proposal.