February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 2/28/24 02:06, Walter Bright wrote:
> On 2/27/2024 4:42 PM, Timon Gehr wrote:
>> FWIW I have been pushing this a couple times at the DLF meetings, but in the end somebody will have to put in the work to implement it in the compiler and I cannot spend the time required for that atm.
>>
>> The move hole is also an issue for tuple unpacking though.
>
> Reviewing the DIP would be a big help if that can work for you.
>
> https://github.com/dlang/DIPs/blob/master/DIPs/DIP1040.md

Sure! A lot of good stuff in there. Here's my review.

Points 1 to 15 respond to the DIP contents. The main issue I see is the way move construction and assignment are declared by special-casing existing syntax that already means something else _and changing its observable behavior_. To fix this, I think there should be separate syntax for suppressing the destructor call. Furthermore, partial moving in general does not work in the way it is specified in the DIP: it bypasses the destructor of the enclosing struct without participation of that struct.

Points 16 to 18 point out things that are missing from the DIP. The main issue I see here is that destructuring is missing from the DIP. This is crucial in order to be able to transform data from one type into data from another type while using only moves and no copies or destruction.

1. Regarding last use:

> ```d
> S s;
> f(s); // copy
> f(s); // copy
> f(s); // move
> ```

It would be useful to show examples with dynamic control flow (edit: I see some examples occur later too), such as:

```d
S s;
foreach(i;0..3){
    f(s); // ?
}
```

I assume the line marked "?" will always copy? Maybe it would be better to allow implementation-defined copy elision (also see 11.).

```d
S s;
f(s); // copy
f(s); // ?
if(uniform(0,2)) return;
f(s); // move
```

I assume the line marked "?" will always copy? Maybe it would be better to allow implementation-defined copy elision (also see 11.).

2. Regarding Existing State in D:

- It would make sense to elaborate on `@disable`d copy constructors. This is similar to not implementing the `Copy` trait in Rust. The resulting values can only be moved.
- In D, you can also have a `private` destructor. As far as I can tell, this is currently useless, but with move semantics this can be used to enforce explicit destruction via move, which is a nice way to design a library interface.

3. Regarding declaration syntax of Move Constructors and Move Assignment Operators

I would highly recommend using a distinct syntax for suppressing destruction of the argument. I will argue here specifically for the case of Move Constructors, but Move Assignment Operators have exactly the same issue.

> A Move Constructor is a struct member constructor that moves, rather than copies, the argument corresponding to its first parameter into the object to be constructed. The argument is invalid after this move, and is not destructed.
>
> A Move Constructor for struct S is declared as:
>
> ```d
> this(S s) { ... }
> ```

This is a breaking language change.

Also, consider

```d
struct S{
    ...
    this(T)(T t){ ... }
    ...
}
```

This constructor will be a move constructor iff T=S. Therefore, the fact that the destructor is not called on the argument in some cases may be very surprising to programmers.

A similar example is this one

```d
struct S{
    ...
    this(T...)(S s, T args){ ... }
    ...
}
```

Here, the constructor is a move constructor iff no additional `args` are passed. Overall, the proposed syntax introduces a surprising special case.

Also, what is the syntax for a copy constructor? Would it be `this(ref S s){ ... }`?

4. Regarding `nothrow` on Move Constructors and Move Assignment Operators.

> The Move Constructor is always nothrow, even if nothrow is not explicitly specified. A Move Constructor that throws is illegal.

This special case should be motivated in the DIP. I assume the motivation is that because the argument is not destructed, throwing is particularly error-prone here.

In general, I would advise against built-in requirements on specified attributes unless absolutely necessary.

5. Regarding Default Move Constructor

> If a Move Constructor is not defined for a struct that has a Move Constructor in one or more of its fields, a default one is defined, and fields without a Move Constructor are moved using a bit copy.

This is missing a specification of what the default move constructor does. (I assume it is implemented as a move for each field, in lexical order, where fields without a Move Constructor are moved using a bit copy.)

6. Regarding Default Move Constructor and Default Move Assignment Operator.

> If a Move Constructor is not defined for a struct that has a Move Assignment Operator, a default Move Constructor is defined and implemented as a move for each of its fields, in lexical order.

This generated move constructor will often do the wrong thing. A correct way to do it would be to default-initialize a new instance and then call the Move Assignment Operator on it. It is also worth considering whether a Move Constructor should instead simply be required to be defined explicitly in any struct that has an explicit Move Assignment Operator defined.

> If a Move Assignment Operator is not defined for a struct that has a Move Assignment Operator in one or more of its fields, a default Move Assignment Operator is defined, and fields without a Move Assignment Operator are moved using a bit copy.
>
> If a Move Assignment Operator is not defined for a struct that has a Move Constructor, a default Move Assignment Operator is defined and implemented as a move for each of its fields, in lexical order.

This generated move assignment operator will usually do the wrong thing. A correct but inefficient way to do it would be to destroy the current object and reconstruct it using the Move Constructor. It is also worth considering whether a Move Assignment Operator should instead simply be required to be defined explicitly in any struct that has an explicit Move Constructor defined.

7. Regarding EMO

> An EMO is a struct that has both a Move Constructor and a Move Assignment Operator. An EMO defaults to exhibiting move behavior when passed and returned from functions rather than the copy behavior of non-EMO objects.

This definition is not self-contained and should therefore refer to the discussion further below for clarification.

8. Regarding Move Ref

> A Move Ref is a parameter that is a reference to an EMO. (The ref is not used.)

For small structs, the additional indirection from the implicit reference will introduce overhead.

9. Regarding NRVO of EMO objects

> If NRVO cannot be performed, s is copied to the return value on the caller's stack.

This is surprising to me. I would have expected `s` to be moved to the return value on the caller's stack instead.

10. Regarding Returning an EMO by Move Ref

This is too cute, because it changes the meaning of `return` in one specific special case. Consider:

```d
struct S{
    int* ptr;
    this(S s){
        this.ptr=s.ptr;
    }
    void opAssign(S s){
        this.ptr=s.ptr;
    }
}
S func(return S s){
    return S(s);
}
```

The `return` annotation is needed because the pointer again appears in the return value. Note that this is a simplified example, but we could think of similar ones with multiple involved pointers that need to be permuted (though I don't know how to implement that without destructuring or destruction).

11. Regarding Copy Elision

Maybe it would be better to specify explicitly that an implementation is allowed to optimize the pattern:

```d
auto s = t; // (copy)
... // arbitrary code not referring to `t`
destroy(t);
```

to:

```d
auto s = move(t);
```

12. Regarding lifetimes.

You make a point about nested functions and lambdas. However, this is not the only problem. Consider:

```d
struct S{
    int x;
}
int foo()@safe{
    S s;
    scope p = &s.x;
    bar(s); // last use of s, moved
    return *p; // bad memory access
}
```

13. Regarding partial move.

> Therefore, the generalized rule is that an access to an EMO field of an aggregate will be moved only if that is the last access of the containing variable.

This does not work. You cannot elide the entire destructor of `S` based on moving a single field of `S`.

14. Regarding Destruction

This is a bit inconsistent with what was presented earlier. I agree that implementation-defined copy elision is probably a good idea (see 11.).

15. Regarding C++ interop.

I do not see anything obviously wrong, except that the requirement to opt out of rvalue references seems error prone. I think Manu has more expertise here.

Also, it would be good to specify `@value` as a standalone thing in the DIP, as it may be useful beyond C++ interop (also see point 8.).

What is missing from the DIP?

16. Missing: Redeclaration after Move

```d
S s, t;
func(s); // moved, `s` no longer accessible
S s = t; // explicit construction via redeclaration
```

A nice feature of this is that the type of a variable can be changed on redeclaration. Note that Rust allows this.

17. Missing: Destructuring

This is partially attempted in the DIP via partial move (which does not work). However, there must be a way to implement the following:

```d
struct U(T...){
    T fields;
}
struct S(T...){
    T fields;
    @disable ~this();
    ... // need support from S to bypass destructor
}
// fields of resulting U must be moved from the fields of S
U fromS(S s){ ... }
```

18. Missing: Moving the receiver

```d
struct S{
    T foo()@rvalue{ ... }
    @disable ~this();
}

void main(){
    S s;
    auto t=s.foo(); // last use of s
}
```
| |||
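Point 6 of the review describes a "correct but inefficient" move assignment: destroy the target, then reconstruct it from the source. A minimal sketch of that pattern in today's D follows. It assumes `core.lifetime.moveEmplace` as a stand-in for the DIP's Move Constructor, and `Buf`, `freed`, `moveAssign` and `demo` are hypothetical names chosen for illustration; none of this is DIP1040 syntax.

```d
import core.lifetime : moveEmplace;

// Hypothetical resource type; `freed` counts destructor runs on live objects.
struct Buf {
    int[] data;
    static int freed;
    ~this() { if (data !is null) ++freed; }
}

// Stand-in for a generated move assignment: destroy, then move-construct in place.
void moveAssign(ref Buf target, ref Buf source) {
    destroy(target);             // runs ~this and resets target to Buf.init
    moveEmplace(source, target); // bitwise move; source is reset to Buf.init
}

// Returns the number of destructor runs observed right after the move.
int demo() {
    Buf.freed = 0;
    auto a = Buf([1]);
    auto b = Buf([2, 3]);
    moveAssign(a, b);
    assert(a.data == [2, 3]); // a now owns what b owned
    assert(b.data is null);   // b was reset, so its destructor will be a no-op
    return Buf.freed;         // only the old contents of a were destroyed
}

void main() {
    assert(demo() == 1);
}
```

The old payload is destroyed exactly once and the source is left in its `.init` state, so no payload is released twice. The cost is a full destroy and rebuild of the target even when part of its state could have been reused, which is the inefficiency the review mentions.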
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 2/28/24 15:17, Timon Gehr wrote:
> ...
> What is missing from the DIP?
> ...
>
19. Explicit moves.
There is no way to force that a given occurrence of a variable is the last use and is moved. We can use std.algorithm.move, but the DIP as specified would just move a copy if something is used again.
What I would like to see is:
```d
void main(){
S s;
foo(move(s));
auto t=s; // error, `s` has been moved
S s; // ok, can redeclare `s`
}
```
Maybe there needs to be a parameter annotation that forces a move, then `move` can be implemented in terms of that.
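For comparison, here is a small sketch of what an explicit `move` does in today's D, assuming `std.algorithm.mutation.move` and a hypothetical non-copyable `Handle` type. The moved-from variable stays accessible and is merely reset to `Handle.init` at run time; it does not produce the compile-time error asked for above.

```d
import std.algorithm.mutation : move;

// Hypothetical non-copyable handle; `released` counts destructor runs on live handles.
struct Handle {
    int id;
    static int released;
    @disable this(this);
    ~this() { if (id != 0) ++released; }
}

void consume(Handle h) {} // h is destroyed when consume returns

// Returns how many live handles were released by the call to consume.
int demo() {
    Handle.released = 0;
    auto h = Handle(7);
    consume(move(h));
    assert(h.id == 0); // `h` is still accessible, but reset to Handle.init
    return Handle.released;
}

void main() {
    assert(demo() == 1);
}
```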
| |||
February 29 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 29/02/2024 3:17 AM, Timon Gehr wrote:
> 12. Regarding lifetimes.
>
> You make a point about nested functions and lambdas. However, this is not the only problem. Consider:
>
> ```d
> struct S{
>     int x;
> }
> int foo()@safe{
>     S s;
>     scope p = &s.x;
>     bar(s); // last use of s, moved
>     return *p; // bad memory access
> }
> ```
I've been exploring this problem recently with isolated.
What this suggests to me is that all move operator supporting types are implicitly isolated.
Both s and p would belong to the same subgraph, when s gets moved, the subgraph that s and p are in would get invalidated making the return not possible.
Interesting, there appears to be some inter-relationship here that I had not considered.
| |||
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On Wednesday, 28 February 2024 at 14:17:44 UTC, Timon Gehr wrote:
>
> - It would make sense to elaborate on `@disable`d copy constructors. This is similar to not implementing the `Copy` trait in Rust. The resulting values can only be moved.
>
Also, no discussion of `@disable`d move constructors.
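As a point of reference for the quoted remark, this is a minimal sketch of a move-only type in today's D. It assumes a disabled postblit (`@disable this(this)`) as the way to forbid copies, and `Unique`, `make` and `demo` are hypothetical names for illustration.

```d
import std.algorithm.mutation : move;

// Hypothetical move-only wrapper: copying is disabled, moving is still allowed.
struct Unique {
    int* p;
    @disable this(this);
}

static assert(!__traits(isCopyable, Unique));

// Returning an rvalue moves it to the caller; no copy is needed.
Unique make(int* p) { return Unique(p); }

int demo() {
    int x = 5;
    auto a = make(&x);
    // Copying from an lvalue is rejected at compile time.
    static assert(!__traits(compiles, { auto c = a; }));
    auto b = move(a); // an explicit move is accepted
    return *b.p;
}

void main() {
    assert(demo() == 5);
}
```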
| |||
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Dakota | On Wednesday, 28 February 2024 at 06:06:23 UTC, Dakota wrote:
>
> For me ImportC is more important than DIP1040, it open the doors for very many possibilities.
I'm resorting to an argumentum ad industriam: move constructors have been explicitly demanded for at least a decade by the prominent industrial users.
| |||
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 2/28/2024 6:22 AM, Timon Gehr wrote:
> On 2/28/24 15:17, Timon Gehr wrote:
>> ...
>> What is missing from the DIP?
>> ...
>>
>
> 19. Explicit moves.
>
> There is no way to force that a given occurrence of a variable is the last use and is moved. We can use std.algorithm.move, but the DIP as specified would just move a copy if something is used again.

Forcing a move just sounds like trouble. If it is not the last use, and it is moved, then wouldn't the result be undefined behavior?

Determining last use should be in the purview of the compiler, not the user, so it is reliable.

The Ownership/Borrowing system does determine last use, using data flow analysis.

> What I would like to see is:
>
> ```d
> void main(){
>     S s;
>     foo(move(s));
>     auto t=s; // error, `s` has been moved
>     S s; // ok, can redeclare `s`
> }
> ```
>
> Maybe there needs to be a parameter annotation that forces a move, then `move` can be implemented in terms of that.

What is the point of declaring another variable of the same name?

We already disallow shadowing declarations, and that has prevented a number of bugs at least in my own code (they're very difficult to spot with a visual check).
| |||
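The existing shadowing rule can be checked with a short sketch. It assumes only the current behaviour that a local declared in a nested scope may not reuse the name of an enclosing local, while same-named locals in disjoint scopes are accepted; `demo` is a hypothetical helper.

```d
// A redeclaration in a nested scope is rejected today.
static assert(!__traits(compiles, {
    int s = 1;
    {
        int s = 2; // shadows the outer `s`
    }
}));

// Same-named variables whose scopes do not overlap are fine.
int demo() {
    int total;
    { int s = 1; total += s; }
    { int s = 2; total += s; }
    return total;
}

void main() {
    assert(demo() == 3);
}
```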
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 2/28/24 21:38, Walter Bright wrote:
> On 2/28/2024 6:22 AM, Timon Gehr wrote:
>> On 2/28/24 15:17, Timon Gehr wrote:
>>> ...
>>> What is missing from the DIP?
>>> ...
>>>
>>
>> 19. Explicit moves.
>>
>> There is no way to force that a given occurrence of a variable is the last use and is moved. We can use std.algorithm.move, but the DIP as specified would just move a copy if something is used again.
>
> Forcing a move just sounds like trouble. If it is not the last use, and it is moved, then wouldn't the result be undefined behavior?

No, the idea is that the compiler enforces that it is indeed the last use and produces a compile-time error message if it cannot prove that it is the case.

> Determining last use should be in the purview of the compiler, not the user, so it is reliable.
> ...

Yes. It still holds that one may want to make sure that a value is really moved at a given point. Sometimes this matters. Anyway, this is by far not the most important point.

> The Ownership/Borrowing system does determine last use, using data flow analysis.
>
>> What I would like to see is:
>>
>> ```d
>> void main(){
>>     S s;
>>     foo(move(s));
>>     auto t=s; // error, `s` has been moved
>>     S s; // ok, can redeclare `s`
>> }
>> ```
>>
>> Maybe there needs to be a parameter annotation that forces a move, then `move` can be implemented in terms of that.
>
> What is the point of declaring another variable of the same name?

From my previous post:

> 16. Missing: Redeclaration after Move
>
> ```d
> S s, t;
> func(s); // moved, `s` no longer accessible
> S s = t; // explicit construction via redeclaration
> ```
>
> A nice feature of this is that the type of a variable can be changed on redeclaration. Note that Rust allows this.

This is a relatively common idiom in languages that support moves. It is annoying if you have to invent a new name for each intermediate result. One use case would be type state:

```d
File!(FileState.flushed) file = open("file.txt");
File!(FileState.buffered) file = file.write("hello ");
File!(FileState.buffered) file = file.writeln("world!");
// file.close(); // compile time error
File!(FileState.flushed) file = file.flush();
file.close(); // moves "file"
// file.write("hello"); // compile time error
```

Assume you have some declarations like these and you want to comment out part of the statements. It would now be annoying to have to rename variables. I.e., you want to use the same name for different versions of the same thing, similarly to how you do not have to change the name of a variable when assigning to it.

> We already disallow shadowing declarations, and that has prevented a number of bugs at least in my own code (they're very difficult to spot with a visual check).

The reason why shadowing is error prone is that multiple variables with overlapping lifetimes are in scope and the compiler arbitrarily picks one of them. This case is different, as only one variable of the same name exists at any given time. This is not error prone. Requiring unique names is more error prone in this case, as you can accidentally copy an older version of a variable.

Anyway, this is not the most important thing, please do check out the points I initially included in my review. This point is just something I had forgotten to include.
| |||
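The type-state idea can be approximated in today's D, at the price of a fresh name per step. The sketch below is illustrative only: `File` here is a hypothetical in-memory stand-in that records written text in a string (it is not `std.stdio.File`), and `open`, `write`, `flush`, `close` and `demo` are assumed names modelled on the snippet above.

```d
enum FileState { flushed, buffered }

// Hypothetical in-memory stand-in for a file; `contents` records what was written.
struct File(FileState state) {
    string contents;

    // Writing always yields a buffered file.
    File!(FileState.buffered) write(string s) {
        return File!(FileState.buffered)(contents ~ s);
    }

    static if (state == FileState.buffered) {
        // Only a buffered file can be flushed.
        File!(FileState.flushed) flush() {
            return File!(FileState.flushed)(contents);
        }
    } else {
        // Only a flushed file can be closed.
        string close() { return contents; }
    }
}

// The file name is ignored in this in-memory sketch.
File!(FileState.flushed) open(string name) {
    return File!(FileState.flushed)("");
}

string demo() {
    auto file0 = open("file.txt");
    auto file1 = file0.write("hello ");
    auto file2 = file1.write("world!");
    static assert(!__traits(compiles, file2.close())); // not flushed yet
    auto file3 = file2.flush();
    return file3.close();
}

void main() {
    assert(demo() == "hello world!");
}
```

Nothing here stops a later line from using `file1` again by accident, which is the hazard of unique names described above; redeclaration after a move would rule that out.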
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | As I've said in previous posts, I've written a lot of Rust and now a fair amount of D. I prefer D as it is, now that I've solved some issues with my own lack of knowledge about the available tools.

I want to make a comment about all this chatter about move semantics. I freely admit that I have not read the proposal for doing this, if there is a fully-fleshed-out proposal, and I probably never will.

My issue is this: Rust has led the move semantics charge. Grafting move semantics onto D feels to me like me-too-ism and just cluttering the D environment with yet another wart that it does not need. Move semantics in Rust results in the programmer becoming a key part of the memory management system, whether they realize it or not. Anyone who has wrestled with the borrow checker or gotten themselves into lifetime hell with Rust will understand (unless they've drunk the Kool-Aid) what I am talking about.

Rust is a fiendishly difficult language to learn. I've been doing this for a very long time and have written code in languages most of you have never heard of. None of them were as hard to learn as Rust, and I include Haskell, which I know well.

And what's the reward for all this difficulty? No GC. The Rust community's allergy to garbage collection is totally overblown, in my opinion. Yes, garbage collected languages are not good choices for embedded and/or real-time code. But ordinary applications? On today's multi-GHz, 32 GB hardware? And the cost of avoiding garbage collection is great.

You need to hear the observations of more people like me who know both Rust and D and I think the tune will be much the same -- D is much easier to learn and easier to use. The resulting code? D code runs fast; in my experience it is as fast as Rust, within a very acceptable tolerance. And benchmarks I've seen online confirm this.

Why is an uninteresting language like Go so much more popular than Rust? It's easy to learn, gives acceptable performance, and has a rich environment.

So before cluttering the D environment with an unneeded me-too effort (my opinion!) and wasting scarce developer resources on this when there is so much else to be done that D needs more than a borrow-checker, I urge you to think very carefully about this.
| |||
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On Wednesday, February 28, 2024 7:17:44 AM MST Timon Gehr via Digitalmars-d wrote:
> > A Move Constructor is a struct member constructor that moves, rather than copies, the argument corresponding to its first parameter into the object to be constructed. The argument is invalid after this move, and is not destructed.
> >
> > A Move Constructor for struct S is declared as:
> >
> > ```d
> > this(S s) { ... }
> > ```
>
> This is a breaking language change.
>
> Also, consider
>
> ```d
> struct S{
>     ...
>     this(T)(T t){ ... }
>     ...
> }
> ```
>
> This constructor will be a move constructor iff T=S. Therefore, that the destructor is not called on the argument in some cases may be very surprising to programmers.

I've already run into issues trying to add copy constructors at work where code broke because it had declared

this(T)(T t) { ...}

and that then conflicted with a copy constructor which was automatically added when I added a copy constructor to the type of one of its member variables. Fortunately, the result was a compiler error, and I could then add a template constraint to avoid having the templated constructor take the type that it was constructing, but the risks with something like this suddenly being turned into a move constructor certainly should be considered.

In some cases, such constructors likely work just fine when given a variable of the type that they're constructing, whereas in others, they're probably never actually used that way and wouldn't compile if you tried. So, if such a constructor became a move constructor, a type that was perfectly moveable before could become an error to try to move - or do something that is very much not a move if the constructor happens to compile when given a variable of the type that it's constructing, but it does something rather different from a move.

Of course, to an extent, the same could be said of

this(S s) { ... }

though I would guess that in most cases, having that turned into a move constructor would work, whereas a templated constructor would be much more likely to not work with the type being constructed.

Another idea here would be that if we required something like an attribute on the constructor - e.g. @movecons - then the compiler could not only see that you intended to make it a move constructor, but it could give you an error if it didn't have the appropriate signature. I'm not sure that we really want to go that route, but if we did require an attribute, at least we wouldn't accidentally have existing constructors turn into move constructors - and having the compiler verify for you that it's treating it as a move constructor could be valuable.

> 4. Regarding `nothrow` on Move Constructors and Move Assignment Operators.
>
> > The Move Constructor is always nothrow, even if nothrow is not explicitly specified. A Move Constructor that throws is illegal.
>
> This special case should be motivated in the DIP. I assume the motivation is that because the argument is not destructed, throwing is particularly error-prone here.
>
> In general, I would advise against built-in requirements on specified attributes unless absolutely necessary.

Yeah. I don't know if it ever makes sense for move constructors to throw (e.g. arguably destructors shouldn't be allowed to throw, though IIRC, it is currently allowed). That being said, IMHO, we should be absolutely minimizing how much we require _any_ kind of function attribute. It causes all kinds of trouble for code that doesn't want to (or can't) use them. At least with nothrow, you can do something like catch the exception and then assert false, but IMHO, in general, core language functionality should not require specific attributes.

As it is, I'm having quite a bit of trouble with copy constructors and attribute requirements (e.g. getting compilation errors with regards to druntime code that insists on pure), and I need to put together some bug reports on it. Attributes can be great in theory, but they simply don't work in many cases with complex code.

- Jonathan M Davis
| |||
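Two of the workarounds mentioned in this post can be sketched in today's D: a template constraint that keeps a generic constructor from accepting the struct's own type, and a `nothrow` wrapper that catches an exception and asserts false. `Wrapper`, `mayThrow`, `checked` and `demo` are hypothetical names, and the constraint shown is one possible spelling, not necessarily the one used in the code described above.

```d
// Hypothetical wrapper with a generic converting constructor.
struct Wrapper {
    long payload;

    // The constraint excludes Wrapper itself (ignoring qualifiers), so this
    // template can never be picked up as a copy or move constructor.
    this(T)(T t) if (!is(immutable T == immutable Wrapper)) {
        payload = t;
    }
}

void mayThrow(int x) {
    if (x < 0) throw new Exception("negative input");
}

// Satisfies a `nothrow` requirement by treating any exception as a bug.
void checked(int x) nothrow {
    try {
        mayThrow(x);
    } catch (Exception) {
        assert(false, "mayThrow failed unexpectedly");
    }
}

long demo() {
    auto a = Wrapper(41); // uses the templated constructor with T = int
    auto b = a;           // plain copy; the template is not involved
    checked(1);           // does not throw for non-negative input
    return b.payload + 1;
}

void main() {
    assert(demo() == 42);
}
```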
February 28 Re: What ever happened to move semantics? | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Don Allen | On 2/28/24 23:01, Don Allen wrote:
> As I've said in previous posts, I've written a lot of Rust and now a fair amount of D. I prefer D as it is, now that I've solved some issues with my own lack of knowledge about the available tools.
>
> I want to make a comment about all this chatter about move semantics. I freely admit that I have not read the proposal for doing this, if there is a fully-fleshed-out proposal, and I probably never will.
> ...

Well, suit yourself.

> My issue is this: Rust has led the move semantics charge.

Linear typing is a very old idea.

> Grafting move semantics onto D feels to me like me-too-ism and just cluttering the D environment with yet another wart that it does not need.

No, the warts are here now. Move semantics is a way to get rid of existing warts.

> Move semantics in Rust results in the programmer becoming a key part of the memory management system, whether they realize it or not.

It's just another tool. Use it when it is appropriate.

> Anyone who has wrestled with the borrow checker or gotten themselves into lifetime hell with Rust will understand (unless they've drunk the Kool-Aid) what I am talking about.

Move semantics is not the same as borrowing.

> ...
>
> So before cluttering the D environment with an unneeded me-too effort (my opinion!) and wasting scarce developer resources on this when there is so much else to be done that D needs more than a borrow-checker, I urge you to think very carefully about this.

You seem to confuse DIP1000 with move constructors. Move semantics is one of the things D needs more than a borrow checker.

Anyway, move constructors _are_ among the top 10 priorities now for the language.
| |||
Copyright © 1999-2021 by the D Language Foundation