Thread overview
Transitive R-Value Inference and Move Construction a la Rust
Sep 26, 2016
Nordlöw
Sep 26, 2016
Nordlöw
Sep 26, 2016
Nordlöw
Sep 26, 2016
ag0aep6g
Sep 26, 2016
Nordlöw
Sep 26, 2016
ag0aep6g
September 26, 2016
I just discovered the following interesting fact about the current state of DMD:

/// Sample struct that keeps track of the number of copy constructions (postblits) that is made in a `C`-instance.
struct C
{
    ...
    this(this) // postblit
    {
        ++copyCount; // yet another reference
    }

    struct RCStore
    {
    }

    uint copyCount = 0;          ///< copy counter
}

/// Empty
void f(T)(T a)
{
    assert(a.copyCount == 1);
}

/// Top function.
void top(T)(T a)                // TODO infer that `a` is passed as an r-value?
{
    assert(a.copyCount == 0);
    sub(a);                     // so we can do move construction here aswell?
}

/// Sub function.
void sub(T)(T a)
{
    assert(a.copyCount == 1);   // this is not neccessary
}

unittest
{
    import std.stdio : writeln;

    C x;
    f(x);
    assert(x.copyCount == 0);
    const y = x;
    assert(y.copyCount == 1);

    top(C());
}

namely that D will infer move semantics of `top`'s parameter when passing an r-value in the call to `top()` but not transitively in `top`'s call to `sub` even though this is possible.

What stops us from extending D's introspection possibilities to also provide us with knowledge about whether `C` was called with an r-value? Thereby making it possible to infer r-value referenceness, and in turn move semantics, of (templated) function parameters transitively.

A problem here is that the number of different template instantiations of a function will grow exponentially with the number of the function parameters that have a postblit. But then again aren't we all interested in maximum performance here and reduce the need for GC usage here in the case when the postblit does GC-allocated duplication of `C`-local allocations?

If template bloat will be a problem an alternative solution could be to add yet another trait, say, `__traits(isRValue, a)` the tells whether the parameter `a` was called using an r-value. That together with `static if` could branch transitive parameter passing semantics when needed. However, the code generation for such a template would not be affected if this trait is never used and code generation among templates not differing in call semantics could be reused.

This would make D even more competitive with Rust especially in the case when chaining lazy ranges with containers with allocating post-blits.

I realize that this is a somewhat competing approach to what is currently being worked on in DIP-1000.

I'm using DMD git master.

Destroy!
September 26, 2016
On Monday, 26 September 2016 at 15:24:35 UTC, Nordlöw wrote:
> `C` was called with an r-value?

Correction: Shall be:

  `top` _and_ transitively `sub` was called with an r-value

September 26, 2016
On Monday, 26 September 2016 at 15:24:35 UTC, Nordlöw wrote:
> I just discovered the following interesting fact about the current state of DMD:
> ...


Double-Doh!, I just realized my reasoning is wrong; the reason why the parameter `a` in call  `sub(a)` in `top` is not moved is instead because the compiler currently doesn't check whether `a` is used below the call to `sub(a)`. Something I believe is called data flow analysis. This optimization is however, still, very much possible and doesn't lead to the template bloat I argued about. Even better.

Sorry for the inconvenience and lack of insight.
September 26, 2016
On 09/26/2016 05:24 PM, Nordlöw wrote:
> I just discovered the following interesting fact about the current state
> of DMD:
[...]
> void top(T)(T a)                // TODO infer that `a` is passed as an
> r-value?
> {
>     assert(a.copyCount == 0);
>     sub(a);                     // so we can do move construction here
> aswell?
> }
[...]
> unittest
> {
[...]
>     top(C());
> }
>
> namely that D will infer move semantics of `top`'s parameter when
> passing an r-value in the call to `top()` but not transitively in
> `top`'s call to `sub` even though this is possible.

The move happens because `C()` is an rvalue. This happens at the call site. The code generation for `top` is not affected by this. `top` doesn't know or care if the argument it gets is the result of a copy or a move.

Inside `top`, `a` is an lvalue. It wouldn't be correct to move `a` just because it originated from an rvalue.

There may be cases where `sub(a)` could move `a` into position instead of copying it, but `a` having been moved or copied before is not the only thing to consider, if it even needs consideration.
September 26, 2016
On Monday, 26 September 2016 at 16:28:18 UTC, ag0aep6g wrote:
> Inside `top`, `a` is an lvalue. It wouldn't be correct to move `a` just because it originated from an rvalue.

AFAICT, `top` can be moved in the call to `sub` if it's not used not returned below the call, right?
September 26, 2016
On 09/26/2016 07:15 PM, Nordlöw wrote:
> On Monday, 26 September 2016 at 16:28:18 UTC, ag0aep6g wrote:
>> Inside `top`, `a` is an lvalue. It wouldn't be correct to move `a`
>> just because it originated from an rvalue.
>
> AFAICT, `top` can be moved in the call to `sub` if it's not used not
> returned below the call, right?

(I'm assuming `top` is a typo and you meant `a`.)

I don't know. To the layman that I am, it looks similar to tail call elimination. When the last use of `a` is in a call, it seems plausible that it could be possible to skip the postblit and the destructor, and let the called function take care of it. But it's probably not as simple as that.