January 09, 2013
[Issue 9238] Support rvalue references
http://d.puremagic.com/issues/show_bug.cgi?id=9238


Andrei Alexandrescu <andrei@erdani.com> changed:

          What    |Removed                     |Added
----------------------------------------------------------------------------
                CC|                            |andrei@erdani.com


--- Comment #10 from Andrei Alexandrescu <andrei@erdani.com> 2013-01-09 15:03:08 PST ---
Desiderata
==========

Design choices may sometimes invalidate important use cases, so let's start
with what we'd like to have:

1. Safety

We'd like most or all uses of ref to be safe. If not all are safe, we should
have easy means to distinguish safe from unsafe cases statically. If that's not
possible, we should be able to enforce safety with simple runtime checks in
@safe code.

2. Efficient passing of values

The canonical use case of ref parameters is to allow the callee to modify a
value in the caller. However, a significant secondary use case is as an
optimization for passing arguments into a function. In such cases, the caller
is not concerned with mutation and may actually want to prevent it. The
remaining problem is that ref traditionally assumes the caller holds an actual
lvalue, whereas in such cases the caller may want to pass an rvalue.
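To make the limitation concrete, here is a minimal sketch in D as the language stood at the time of this discussion. The names Big, firstByRef, and firstByValue are made up for illustration and do not come from the issue.

```d
// Hypothetical payload type, large enough that copying it is worth avoiding.
struct Big { int[64] data; }

// Plain ref: no copy is made, but the caller must supply an lvalue.
int firstByRef(ref const Big b) { return b.data[0]; }

// By value: accepts rvalues, but an lvalue argument is copied.
int firstByValue(const Big b) { return b.data[0]; }

unittest
{
    Big b;
    b.data[0] = 7;
    assert(firstByRef(b) == 7);       // an lvalue binds to ref
    assert(firstByValue(Big()) == 0); // an rvalue is accepted by value
    // firstByRef(Big());             // rejected: an rvalue does not bind to ref
}
```

The commented-out call is the case desideratum 2 would like to permit.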

3. Transparently returning references to ref parameters

One important use case is functions that return one of their reference
parameters, the simplest being:

ref T identity(T)(ref T obj) { return obj; }

We'd like to allow identity and to make it safe by design. If we don't, we
disallow a family of use cases such as min() and max() that return by
reference, call chaining idioms etc.
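As an illustration of that family of use cases, here is a small sketch of a max-like function that returns one of its ref parameters. The function larger is a hypothetical helper, not something from the standard library; both functions are templates so that the compiler infers their attributes.

```d
// Returns its argument by reference, as in the example above.
ref T identity(T)(ref T obj) { return obj; }

// Hypothetical max-like helper: returns a reference to the larger argument.
ref T larger(T)(ref T a, ref T b)
{
    if (a < b)
        return b;
    return a;
}

unittest
{
    int x = 3, y = 7;
    larger(x, y) = 0;             // writes through the reference into y
    assert(x == 3 && y == 0);
    identity(x) += 1;             // modifies x in place
    assert(x == 4);
    identity(larger(x, y)) = 10;  // chaining: a ref result feeds a ref parameter
    assert(x == 10 && y == 0);
}
```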

4. Sealed containers

This important use case is motivated by efficient and safe allocators. We want
to support scoped and region-based allocation, and at the same time we want to
combine such allocators with containers that return references to their data.

Consider as a simple example a scoped container:

struct ScopedContainer(T)
{
   private T[] payload;
   this(size_t n) { payload = new T[n]; }
   this(this) { payload = payload.dup; }
   ~this() { delete payload; }
   void opAssign(ref ScopedContainer rhs) {
     payload = rhs.payload.dup;
   }
   ref T opIndex(size_t n) { return payload[n]; }
}

The container eagerly allocates its state and deallocates it when it leaves
scope. We'd like to allow opIndex to typecheck and guarantee safety.

5. Simplicity

We wish to get the design right with maximum economy in language design. One
thing easily forgotten when focusing on minutiae while carrying significant
context in mind is that whatever language additions we make come on top of an
already large machinery.

There have been ideas based on defining "scope ref", "in ref", or "@attribute
ref". We'd like to avoid such and instead make sure plain "ref" is useful,
safe, and easy to understand.  

------------

These desiderata and the interactions among them impose constraints on the
design space. In the following post I'll sketch some possible designs dictated
by prioritizing desiderata, and analyze the emerging tradeoffs.


--- Comment #11 from Jonathan M Davis <jmdavisProg@gmx.com> 2013-01-09 15:11:59 PST ---
> There have been ideas based on defining "scope ref", "in ref", or "@attribute
> ref". We'd like to avoid such and instead make sure plain "ref" is useful,
> safe, and easy to understand.  

I would argue that it's vital that ref which requires an lvalue and ref which
doesn't care whether it's given an lvalue or rvalue be distinguished. You're
just begging for bugs otherwise. It should be clear in a function's signature
whether it's intending to take an argument by ref and mutate it or whether it's
simply trying to avoid unnecessary copying.

--- Comment #12 from Andrei Alexandrescu <andrei@erdani.com> 2013-01-09 16:07:04 PST ---
Design #1: statically sealed ref
================================

One possible design is to give desideratum "4. Sealed containers" priority and
start from there.

Continuing the ScopedContainer example, we notice that to make it work we need
the lifetime of c[n] to be bounded by the lifetime of c. We set out to enforce
that statically. The simplest and most conservative rule would be:

----------
For functions returning ref, the lifetime of the returned object spans at least
through the scope of the caller.
----------

Impact on desiderata:

To enforce safety we'd need to disallow any ref-returning function from
returning a value with too short a scope. Examples:

ref int fun(int a) { return a; }
// Error: escapes address of by-value parameter

ref int gun() { int a; return a; }
// Error: escapes address of local

ref int hun() { return *(new int); }
// fine

ref int iun(int* p) { return *p; }
// fine

ref int identity(ref int a) { return a; }
// Should work

This last function typechecks if and only if the argument is guaranteed to have
a lifetime that extends through the end of the scope of the caller. In turn, if
we want to observe (2) and allow rvalues to bind to ref, that means any rvalue
created in the caller must exist through the end of the scope in which the
rvalue was created. This is a larger extent than what D currently allows
(destroy rvalues immediately after the call) and also larger than what C++
allows (destroy rvalues at the end of the full expression). It is unclear
whether this has bad consequences; probably not.

One interesting consequence is that ref returns are intransitive, i.e. cannot
be passed "up". Consider:

ref int identityImpl(ref int a) { return a; }
ref int identity(ref int a) { return identityImpl(a); }

Under the rule above this code won't compile although it is safe. This is
because from the viewpoint of identity(), identityImpl returns an int that can
only last through the scope of identity(). Attempting to return that is
tantamount to returning a local as far as identity() is concerned, so it won't
typecheck.

This limitation is rather severe. One obvious issue is that creating wrappers
around objects will be seriously limited. For example, a range can't forward
the front of a member:

struct Range {
 private AnotherRange _source;
 // ... inside some Range implementation ...
 ref T front() { return _source.front; } // error
}

Summary
=======

1. Design is safe
2. Rvalues can be bound to ref (subject to unrelated limitations) ONLY if the
lifetime of rvalues is prolonged through the end of the scope they're created
in. (Assessment: fine)
3. Implementing identity(): possible but intransitive, i.e. references can't be
passed up call chains. (Assessment: limitation is problematic.)
4. Sealed containers: possible and safe, but present wrapping problems due to
(3).
5. Simplicity: good

I'll next present a refinement of this design that improves on its
disadvantages without losing the advantages.

--- Comment #13 from Andrei Alexandrescu <andrei@erdani.com> 2013-01-09 18:32:23 PST ---
Design #2: ref return is sealed by arguments
============================================

So design #1 has the obvious issue that call chains can't propagate ref returns
upwards even when it's safe to do so. To improve on that, let's devise a
refined rule:

----------
For functions returning ref, the lifetime of the returned object spans at least
the lifetime of its shortest-lived argument.
----------

Impact on desiderata:

Reconsidering the troublesome example:

ref int identityImpl(ref int a) { return a; }
ref int identity(ref int a) { return identityImpl(a); }

When compiling identity(), the compiler (without seeing the body of
identityImpl) figures that the lifetime of the value returned by
identityImpl(a) is at least as long as the lifetime of a itself. Therefore
identity() typechecks because it is allowed to return a itself.

Safety is still guaranteed, however. This is because a function can never escape
a reference to an object of shorter lifetime than the lifetime of the
reference. Reconsidering the front() example:

struct Range {
 private AnotherRange _source;
 // ... inside some Range implementation ...
 ref T front() { return _source.front; } // fine
}

front() compiles because front is really a regular function taking a "ref Range
this". Then _source is scoped inside "this" so from a lifetime standpoint
"this", _source, and the result are in good order.

ref int fun() {
  Range r;
  return r.front; // error
}

fun() does not compile because the call r.front returns a value with the
lifetime of r, so returning a ref is tantamount to escaping the address of a
local.

ref int gun(Range r) {
  return r.front; // error
}

This also doesn't compile because the result of r.front has the lifetime of r,
which is passed by value into gun.

ref int gun(ref Range r) {
  return r.front; // fine
}

This does work because the result has the same lifetime as r.

The question remains on how to handle rvalues bound to ref parameters. The
previous design required that rvalues live as long as the scope, and this
design would allow that too. But this design also allows the C++-style
destruction of rvalues: in the call foo(bar()), if foo returns a ref, it must
be used immediately because the temporary returned by bar() will be destroyed
at the end of the full expression.

If we want to keep the current D rule of destroying rvalue parameters right
after the call to the function, that effectively disallows any use of the ref
result. This may actually be a meaningful choice.

The largest problem of this design is lifetime pollution. Consider the
ScopedContainer example:

ref T opIndex(size_t n) { return payload[n]; }

In the call c[42], the shortest lifetime is actually that of n, which binds to
the rvalue 42. So the compiler is forced to a shorter guarantee of the result
lifetime than the actual lifetime, because of an unrelated parameter.
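The pollution effect can be seen in a toy model of the rule. This is only a sketch: the Lifetime enum and resultLifetime function below are hypothetical names invented for illustration, lifetimes are reduced to three ordered categories, and none of this reflects how a compiler would actually implement the check.

```d
// Toy model: lifetimes ordered from shortest to longest (an assumption
// made for illustration only).
enum Lifetime { temporary, calleeLocal, callerScope }

// Design #2 in miniature: the result of a ref-returning call is assumed to
// live only as long as the shortest-lived argument.
Lifetime resultLifetime(Lifetime first, Lifetime[] rest...)
{
    auto shortest = first;
    foreach (l; rest)
        if (l < shortest)
            shortest = l;
    return shortest;
}

unittest
{
    // identity(a), with a living in the caller's scope
    assert(resultLifetime(Lifetime.callerScope) == Lifetime.callerScope);

    // c[42]: "this" lives in the caller's scope, but the index is a
    // temporary, so the unrelated argument drags the result down
    assert(resultLifetime(Lifetime.callerScope, Lifetime.temporary)
           == Lifetime.temporary);
}
```

In the second assertion the element really lives as long as the container, yet the model can only promise the lifetime of the index.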

Summary
=======

1. Design is safe
2. Design allows binding rvalues to ref parameters. For usability, temporaries
must last at least as long as the current expression (C++ style).
3. Returning ref parameters works with fewer restrictions than the previous
design.
4. Sealed containers are implementable.
5. Difficulty is moderate on the implementation side and moderate on the user
side.

The next iteration of the design will attempt to refine the lifetime of results
so as to avoid pollution.

--- Comment #14 from Andrei Alexandrescu <andrei@erdani.com> 2013-04-23 12:03:18 PDT ---
Adding an example by Steve that should work:
http://forum.dlang.org/thread/ylebrhjnrrcajnvtthtt@forum.dlang.org?page=11

struct S
{
   int x;
   ref S opOpAssign(string op : "+")(ref S other) { x += other.x; return this; }
}

ref S add5(ref S s)
{
   auto o = S(5);
   return s += o;
}

void main()
{
   auto s = S(5);
   S s2 = add5(s);
}
