__rvalue and Move Semantics first draft

January 16

Re: __rvalue and Move Semantics first draft

Posted by Quirin Schroll
in reply to Walter Bright

On Saturday, 9 November 2024 at 09:33:24 UTC, Walter Bright wrote:

https://github.com/WalterBright/documents/blob/5dbf6728d7d0ae46a411c720ec41e3603310172b/rvalue.md

From the DIP:

An rvalue argument is considered to be owned by the function called. Hence, if an lvalue is matched to the rvalue argument, a copy is made of the lvalue to be passed to the function. The function will then call the destructor (if any) on the parameter at the conclusion of the function. An rvalue argument is not copied, as it is assumed to already be unique, and is also destroyed at the conclusion of the function. The destruction is automatically appended to the function body by the compiler.

The function cannot know if its parameter originated as an rvalue or is a copy of an lvalue.

This means that an __rvalue(lvalue expression) argument destroys the expression upon function return. Attempts to continue to use the lvalue expression are invalid. The compiler won't always be able to detect a use after being passed to the function, which means that the destructor for the object must reset the object's contents to its initial value, or at least a benign value.

I think these sections need revising. As I understand it, a function binds an argument by reference or by value:

void f(ref T reference); // binds by reference
void g(T value); // binds by value

In my mind, function parameters are essentially local variables of the function that are initialized by the caller (by providing arguments). If argument passing does not work exactly like initializing (local) variables, I’d consider that a flaw of the language.

This means:

If a parameter is bound by value, it will be destroyed as g returns (whether that is done by the caller or the callee is an implementation detail and not part of the language). Whether the caller passes x or __rvalue(x) is completely irrelevant to the callee. It only ever sees its parameter initialized and is responsible for its destruction. It cannot care where it came from.

If an argument is bound by reference, passing __rvalue(x) is either invalid or, if the rvaluerefparam preview is active, binds a temporary that is initialized by __rvalue(x) in the stack frame of the caller. It does not bind x; that would be extremely confusing. In that case, the caller is responsible for the destruction of the temporary. (The callee knows nothing about the creation of the temporary.)
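The by-value case can be illustrated with a minimal C++ sketch (C++ because the post leans on C++ semantics; std::move stands in for __rvalue). Tracked, the global counters, byValue and byReference are hypothetical names introduced only for this illustration. It shows that a by-value parameter is a fresh object that is destroyed once the call is over, regardless of whether it was copied from an lvalue or moved from one, and that the callee cannot tell the difference.

```cpp
#include <cassert>
#include <utility>

// Global counters for the hypothetical Tracked type below.
int g_copies = 0;
int g_moves = 0;
int g_destroyed = 0;

struct Tracked {
    int payload;

    explicit Tracked(int p) : payload(p) {}
    Tracked(const Tracked& other) : payload(other.payload) { ++g_copies; }
    Tracked(Tracked&& other) noexcept : payload(other.payload) {
        other.payload = 0;  // leave the source in a benign, empty state
        ++g_moves;
    }
    ~Tracked() { ++g_destroyed; }
};

// Binds by value: the parameter is a new object owned by this call.
// Nothing in the body reveals whether it was copied or moved into place.
int byValue(Tracked value) { return value.payload; }

// Binds by reference: no new object is created and none is destroyed.
int byReference(const Tracked& reference) { return reference.payload; }
```

Calling byValue(x) adds one copy and one destruction; calling byValue(std::move(x)) adds one move and one destruction and leaves x empty but still usable. byReference changes none of the counters.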

We could introduce a parameter storage class __rvalue ref that:

Corresponds to C++ rvalue references
Allows binding rvalues only, and for __rvalue(x) arguments, no temporary is created.

That would allow a function to freely move from an argument:

void tryAdd(__rvalue ref T x)
{
    if (…) this.x = __rvalue(x);
}

In contrast to the above, void tryAdd(T x) requires one move to pass an rvalue argument and another move to assign this.x. However, if moving a T is reasonably cheap, pass-by-value can make sense if binding lvalue arguments should be supported.
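The difference in move counts can be checked with a small C++ sketch, where T&& plays the role of the proposed __rvalue ref. Item, Slot, g_moves, tryAddRef and tryAddVal are hypothetical names, and the "slot already full" test is only a stand-in for the elided condition in tryAdd.

```cpp
#include <cassert>
#include <utility>

int g_moves = 0;  // counts move constructions and move assignments of Item

struct Item {
    int value = 0;
    bool movedFrom = false;

    explicit Item(int v) : value(v) {}
    Item(Item&& o) noexcept : value(o.value) { o.reset(); ++g_moves; }
    Item& operator=(Item&& o) noexcept {
        value = o.value;
        o.reset();
        ++g_moves;
        return *this;
    }

private:
    void reset() { value = 0; movedFrom = true; }
};

struct Slot {
    Item x{0};
    bool full = false;

    // Rvalue reference: binds the caller's object; moves only on success.
    bool tryAddRef(Item&& candidate) {
        if (full) return false;    // candidate is left untouched
        x = std::move(candidate);  // the only move on this path
        full = true;
        return true;
    }

    // By value: one move to build the parameter, a second one to store it.
    bool tryAddVal(Item candidate) {
        if (full) return false;    // caller's object is moved-from regardless
        x = std::move(candidate);
        full = true;
        return true;
    }
};
```

With tryAddRef, a rejected argument stays intact in the caller; with tryAddVal, the caller's object has been moved from even when nothing was stored.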

By itself, __rvalue(x) should do nothing. It matters only when an operation on it distinguishes rvalues from lvalues, which is its use case. Such an operation usually leaves x in a moved-from state, but as shown above, there’s a use case for not moving from the variable. Thus, after tryAdd(__rvalue(x)), the variable x contains either a valid (untouched) T object or a moved-from T object.

A moved-from T object need not support all operations T allows, but in C++, it must allow for two operations:

being assigned
being destroyed

Most types can support an empty state, and moving from an object would put it in that state.
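As a minimal C++ sketch of such an empty state: Buffer below is a hypothetical type (not taken from the DIP) whose move operations leave the source empty, so that the two required operations, assignment and destruction, remain valid on a moved-from object.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

struct Buffer {
    int* data = nullptr;
    std::size_t size = 0;

    explicit Buffer(std::size_t n) : data(new int[n]()), size(n) {}

    // Move: steal the allocation and put the source into the empty state.
    Buffer(Buffer&& o) noexcept : data(o.data), size(o.size) {
        o.data = nullptr;
        o.size = 0;
    }

    Buffer& operator=(Buffer&& o) noexcept {
        if (this != &o) {
            delete[] data;  // release what this object currently owns
            data = o.data;
            size = o.size;
            o.data = nullptr;
            o.size = 0;
        }
        return *this;
    }

    // Destroying an empty Buffer is fine: delete[] on nullptr does nothing.
    ~Buffer() { delete[] data; }

    bool empty() const { return data == nullptr; }
};
```

After Buffer b(std::move(a)), a is empty; it can be given a new value with a = Buffer(2) or simply go out of scope.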

It seems your DIP Draft conflates moving and relocation (C++ lingo). A relocation is a move followed by destruction of the source. The notion of relocation is meaningful because there are types for which relocation is trivial but moving is not.

For example, a std::unique_ptr has a non-trivial move: it must put the source std::unique_ptr into a null state (such that it can be assigned again or destroyed without releasing the managed resource, which has a new owner). A std::unique_ptr has a trivial relocation, though: if we simply copy the internal pointer and do not run the destructor on the source, the managed resource has a new owner, and we don’t waste time setting the source to null and then checking whether the source is null (to skip freeing a possibly managed resource).

An example of a type that is not trivially relocatable is a type with an internal pointer (such as std::string, usually). It has to readjust that pointer upon relocation.
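A minimal C++ sketch of the internal-pointer problem follows. SmallText is a hypothetical stand-in for a small-buffer string and is deliberately kept trivially copyable, so that a plain copy behaves like a bytewise relocation; init, relocate and pointsIntoSelf are likewise made-up helper names.

```cpp
#include <cassert>
#include <cstring>

struct SmallText {
    char inlineBuf[16];
    char* text;  // internal pointer: expected to point at this object's inlineBuf
};

void init(SmallText& s, const char* value) {
    std::strncpy(s.inlineBuf, value, sizeof s.inlineBuf - 1);
    s.inlineBuf[sizeof s.inlineBuf - 1] = '\0';
    s.text = s.inlineBuf;
}

// A correct relocation has to readjust the internal pointer.
void relocate(SmallText& dst, const SmallText& src) {
    dst = src;                 // bytewise copy: dst.text still points into src
    dst.text = dst.inlineBuf;  // the fix-up that makes relocation non-trivial
}

bool pointsIntoSelf(const SmallText& s) { return s.text == s.inlineBuf; }
```

A plain copy of a SmallText ends up with text pointing into the source object, which dangles as soon as the source goes away; only relocate yields an object that points into its own buffer.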

Using a moved-from object is reasonable: C++ requires assignment to be valid, and for most types, more or even all operations are allowed. D can require a moved-from object to be fully usable.

Using a relocated-from object(!) is fundamentally invalid. It is already destroyed (that is, conceptually destroyed; an actual destructor need not have run). Using the variable only to take its address or to reuse its storage (e.g. for placement new) is valid.

For reference, the Circle C++ language extension implements relocation as a built-in operation.

Relocation and placement new make lifetimes non-lexical. Moving, on the other hand, does not disturb lexical lifetime.
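The difference between moving and relocating can be spelled out in C++ with manually managed storage. This is only a sketch: relocate and Ptr are hypothetical names, and the function performs relocation the long way (move, then destroy the source), because as far as I know standard C++ offers no built-in relocation. A trivially relocating implementation would copy the pointer and skip both the null store and the destructor’s null check.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

using Ptr = std::unique_ptr<int>;

// Relocation written as "move, then destroy the source".
// After the call, *src is no longer an object; only its storage remains.
Ptr* relocate(void* dstStorage, Ptr* src) {
    Ptr* dst = ::new (dstStorage) Ptr(std::move(*src));  // move: nulls *src
    src->~Ptr();  // destroy: finds null and frees nothing
    return dst;
}
```

A possible use with two raw buffers, showing that the source storage may be reused although the old object is gone:

```cpp
alignas(Ptr) unsigned char a[sizeof(Ptr)];
alignas(Ptr) unsigned char b[sizeof(Ptr)];

Ptr* pa = ::new (a) Ptr(new int(42));
Ptr* pb = relocate(b, pa);             // pa must not be dereferenced any more
Ptr* pa2 = ::new (a) Ptr(new int(7));  // reusing the storage is fine

pa2->~Ptr();  // lifetimes end where the code says so, not at a scope exit
pb->~Ptr();
```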

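The last paragraph of the quote again:

This means that an __rvalue(lvalue expression) argument destroys the expression upon function return. Attempts to continue to use the lvalue expression are invalid. The compiler won't always be able to detect a use after being passed to the function, which means that the destructor for the object must reset the object's contents to its initial value, or at least a benign value.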

That is probably not a good idea. It would render __rvalue a @system feature. Either the compiler can guarantee it’s safe to use or it can’t. Reliably recognizing use after destruction is probably impossible (definitely in @system code, and in purely @safe code, it at least requires difficult data-flow analysis). In C++, one is content to say it’s UB and move on. D, with its focus on @safe, can’t do that (or rather shouldn’t, as it would make __rvalue immediately @system).

My suggestion: Require all D objects to be valid after being moved from (whatever the reason for a move was).

If you really want to explore relocation in the DIP, add __relocate(x) for that:

Requires that the result is used (assigned to something, used to initialize something, or passed by value(!) as a function argument).
Removes the destructor call for x if x is a local and no placement new has been applied to it afterwards.
__relocate could maybe be @safe in very constrained circumstances: the argument must be a local, and there must not exist any references or aliases to it.
