Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1 (page 11) - D Programming Language Discussion Forum

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by Walter Bright
in reply to RazvanN

On 3/8/2021 5:21 AM, RazvanN wrote:
> On Monday, 8 March 2021 at 10:38:25 UTC, Walter Bright wrote:
>> On 3/8/2021 12:23 AM, RazvanN wrote:
>>> On Friday, 5 March 2021 at 12:19:54 UTC, Mike Parker wrote:
>>>>
> 
>>> Moreover, what happens in this case if we have a struct that's not an EMO, but defines a move constructor?
>>
>> That means it does not have a Move Assignment Operator. It doesn't get EMO semantics. The Move Constructor section applies.
>>
> Even if the move assignment operator is implicitly generated? The DIP states: "If a Move Assignment Operator is not defined for a struct that has a Move constructor, a default Move Assignment Operator is defined and implemented as a move for each of its fields, in lexical order."
> 
> Is it possible to have a non-EMO struct that defines solely a move constructor or solely a move assignment operator? It seems like if you define one, you implicitly get the other one.

That's a good point. Perhaps it should be an error to define only one.


>>> 2. What are the situations where a move constructor call is implicitly inserted by compiler? This is not explicitly stated in the DIP and it is rather confusing.
>>
>> It's in the Move Constructor section.
>>
>>
>>> For example, when an instance is passed by `move ref` is there a move constructor call?
>>
>> No, because it is passed by ref.
> Ok, correct me if I am wrong, but it seems that if you define a move constructor, you implicitly get a move assignment operator and vice versa. This means that your struct becomes an EMO if you define one or the other. Once you have an EMO struct, besides the trivial `S a = b` are there any other situations where the move constructor may be called?

void g(S b) {
   S a = b; // calls copy constructor (not last use of b)
   S c = b; // calls move constructor (last use of b)
}


> It seems that EMOs are always passed by reference. If that is the case, why bother defining a move constructor when it will not get called? If I am mistaken, can you please provide a non-trivial example where the move constructor gets called?

void f(S s);
void g(S a) {
    f(a);  // copy constructor called because not last use of `a`
    f(a);  // move constructor called because last use of `a`
}


>>> 3. The DIP should explicitly state what happens when you pass an rvalue instance of an EMO by ref. How does that interact with `-preview=rvaluerefparam` ?
>>
>> With EMOs, there is no need to use the 'ref' annotation. If you do use 'ref', the special EMO semantics do not apply.
>>
> So I assume you get an error?

No, it just works the way it does now.

> Also, what happens with `auto ref` deduction when called with an EMO.

Auto ref parameters are only for template functions. "An auto ref function template parameter becomes a ref parameter if its corresponding argument is an lvalue, otherwise it becomes a value parameter"

https://dlang.org/spec/template.html#auto-ref-parameters

It will continue to do exactly what it says.
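
A minimal sketch of the quoted rule as it works in D today (added for illustration, not part of the original post): `__traits(isRef, x)` reports whether a given instantiation bound its `auto ref` parameter by reference. The names `Payload` and `boundByRef` are hypothetical, and the sketch says nothing about how an EMO would behave under the DIP.

```d
struct Payload { int value; }

// Returns true when this instantiation took `x` by ref (lvalue argument).
bool boundByRef(T)(auto ref T x)
{
    return __traits(isRef, x);
}

void main()
{
    Payload p;
    assert(boundByRef(p));           // lvalue: parameter becomes ref
    assert(!boundByRef(Payload(1))); // rvalue: parameter becomes a value
}
```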


> The DIP has this example:
> 
> ref S fwd(return ref S s) { return s; }
> 
> void f(S s);
> ...
> S s;
> f(fwd(s));
> f(fwd(S()));
> 
> Assuming S is an EMO, when we have `f(fwd(S()))` what happens here? Is a reference to the rvalue passed to `fwd` or does the move constructor get called?
> If we simply call `f(S())`, what happens here? Is `S()` passed by move ref or do we have a move constructor call?
> 
> The DIP talks about move refs, but the examples only use lvalues. Is the move constructor ever called for an rvalue instance of an EMO?

What happens is just what the spec says:

"Ownership of the argument to fwd() is retained by the caller, and so the caller will be responsible for its destruction. When the call is made to f(), a copy is made."

No move copies or move assignments are done. BUT, if the compiler can look inside the fwd() function, it can see that `s` can be moved directly to `f` in the first case, and `S()` can be moved directly to `f` in the second. Thus, here the move constructor is used as an optimization.

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by Walter Bright
in reply to deadalnix

On 3/16/2021 3:24 PM, deadalnix wrote:
> Now what happens with move? Well, the natural way to transpose the described design to move is as follows:
> 1/ move all fields one by one
> 2/ call the move constructor on the result to maintain invariants if need be.
> 
> To me, 2/ furiously sounds like a postblit, but I'm open to alternatives.

The move constructor is a postblit, but with two arguments and without the implicit initial copy.

> I however know for a fact that the proposed magic will open a can of worms because it doesn't follow the same set of design principles as the rest of the construction/destruction business.
> 
> For instance, if a field were to be added to a struct, then immediately the move assignment becomes invalid, silently. Worse, if the field itself contains something destructible, now there is something seriously wrong, potentially outside of the struct I'm working with. For instance, if that new field is a smart pointer, then the guarantees provided by the smart pointer are broken, silently.
> 
> Now we might decide, instead, that all fields are going to be destroyed at the end of the move assign in the moved struct, in case there are leftovers. But we are now back to the situation where all structs MUST have a null state, or you won't be able to have them as fields of other structs.
> 
> Or we break the guarantees provided by the ctor/dtor mechanism, but in this case, why have it at all? The whole point of ctor and dtor is to ensure that invariants are kept within the program. Let's not break this invariant.
> 

If a field with a move constructor is added to a struct S without one, a default move constructor will be created for S that calls the field's move constructor.

If one is added to a struct with an explicit move constructor, it is up to the struct programmer to fold it in explicitly, just like he does for explicit constructors and destructors.

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by Walter Bright
in reply to Arafel

On 3/11/2021 1:15 AM, Arafel wrote:
> On 11/3/21 10:01, Walter Bright wrote:
>> On 3/11/2021 12:42 AM, Walter Bright wrote:
>>> Constructing from an rvalue essentially is move construction.
>>
>> I forgot to mention that the new semantics only apply to EMO objects, which require both a move constructor and a move assignment operator. The move constructor is new syntax. Therefore, it shouldn't break existing code.
> 
> But we also read in the DIP, as it has already been mentioned:
> 
>> If a Move Constructor is not defined for a struct that has a Move Assignment Operator, a default Move Constructor is defined and implemented as a move for each of its fields, in lexical order.
> 
> It's not clear if a struct would be considered an EMO if either the Move Assignment Operator or the Move Constructor are defined by default.
> 
> If that's the case, any struct with an identity assignment operator would be silently "upgraded" to EMO, thus potentially breaking existing code: the original identity assignment might even throw, which according to the DIP will no longer be allowed.

Yes, it does appear to have a conflict with `this(S)`, which wasn't part of D when the DIP was originally worked on. That syntax is part of the "-preview=rvaluerefparam" feature. Perhaps that feature can be simply replaced with DIP1040.

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by Walter Bright
in reply to tsbockman

On 3/16/2021 11:11 AM, tsbockman wrote:
> You just have different results in mind from what you wrote in the DIP.

The desired result is clear - two objects exist before the move assignment, and one afterwards. If both objects are the same instance, then it should be a no-op.

The wording could be improved - want to make a stab at it?

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by Timon Gehr
in reply to Walter Bright

On 18.03.21 10:51, Walter Bright wrote:
> 
> If one is added to a struct with an explicit move constructor, it is up to the struct programmer to fold it in explicitly, just like he does for explicit constructors and destructors.

This is not the case, you usually don't have to do it explicitly in explicit destructors:

---
import std.stdio;

struct T{
    ~this(){ writeln("T destructor called"); }
}

struct S{
    T t;
    ~this(){
        writeln("S destructor called");
        // NOTE: does not explicitly call t.~this()
    }
}

void main(){
    S s;
}
---
S destructor called
T destructor called
---

You may be able to argue that it's not important, but let's not pretend that nothing new is going on here. If you don't explicitly _move_ a field, the destructor will _not_ be called.

Destructors do the right thing by default, even if you don't update them after adding a new field. This is not true for move constructors.

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by Timon Gehr
in reply to Walter Bright

On 18.03.21 10:07, Walter Bright wrote:
> On 3/16/2021 3:30 PM, deadalnix wrote:
>> This is self evident. This is so obvious that I don't know how to unpack it any further.
> 
> I'm sorry, I just don't understand your objection.

In condensed form, I think the main complaint is this. Let's start with a struct:

struct S{
    T field0;
    this(S r){
        field0=move(r.field0);
    }
}

Now someone adds a new field, but forgets to update the move constructor:

struct S{
    T field0, field1;
    this(S r){
        field0=move(r.field0);
    }
}

field1 is now leaked: _Its destructor will never run_. And this can happen in @safe code. (Ignoring the issue that field1 of the moved object will be the init value.)

This design is error-prone. postblit does not have this issue, because fields that are not explicitly referred to are moved correctly by default.
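
A minimal sketch of the "right by default" shape, using today's copy postblit as a stand-in (added for illustration, not from the original post; the struct `S` and its fields are hypothetical). The fields are blitted first and the hook only fixes up what needs fixing, so a field added later is still carried over even though the hook never mentions it:

```d
struct S
{
    int[] owned; // needs a deep copy, so the postblit handles it
    int plain;   // added later; the postblit was never updated for it

    this(this)
    {
        owned = owned.dup; // only the field that needs fixing up
    }
}

void main()
{
    auto a = S([1, 2], 7);
    auto b = a;              // blit all fields, then run the postblit
    b.owned[0] = 99;

    assert(a.owned == [1, 2]); // deep copy: a is unaffected
    assert(b.plain == 7);      // unmentioned field was still copied
}
```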

Hence the suggestion in my previous post to perhaps require all fields to be initialized in a move constructor and similar thoughts about opAssign.
This mitigates the risk, but unfortunately it does not eliminate it. (It is furthermore possible that such an error would be annoying in some cases.)

A possible scenario is:

1. There is a struct S, it's never moved around, so the move constructor is dead code.

2. Someone adds a new field, everything works fine, even when they forget to update the move constructor.

3. The compiler is updated to use more clever flow analysis, suddenly struct S is sometimes moved, leading to memory leaks and other bugs.

4. Spurious regression bug report, reputation damage, etc.

(PS: Sorry for emails that went to your inbox instead of the newsgroup. Thunderbird changed the interface for no reason.)

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by deadalnix
in reply to Timon Gehr

On Wednesday, 17 March 2021 at 01:06:45 UTC, Timon Gehr wrote:
> [...]

Lots of good stuff in there that I didn't quote fully to not spam.

This kind of led me to look at things in a new way: the main problem here is that we have loosely defined requirements, and when this is the case, it is very easy to fool oneself and fulfill most requirements most of the time, but actually provide zero useful guarantees, be it to the dev or the optimizer.

So here, we have an existing system: ctor/dtor. The goal of this system is to ensure that an object is available to the program once it has been put in a proper state, and that for each object constructed, there will be a corresponding dtor call that will give an opportunity to undo this state.

The guarantee provided is that ctor/dtor go by pair and the compiler ensures this. What has been constructed will be destroyed and vice versa.

This causes a problem: what to do when an object is duplicated? Then it is required to construct a new object, from the previous one, and that new object, like the previous one, will be destroyed.

Because construction and destruction might be expensive, we want to be able to group a copy and a destruction operation of an object (granted there are no further use between that copy and that destruction) into one: a move operation.

Which leads us to a primitive set of requirements:
1/ Construction and destruction of objects map 1:1.
2/ An object cannot be used prior to its construction or after its destruction.
3/ As an optimization, we want to be able to remove copy/destruction pair when the object is not used after the copy.

1/ and 2/ are already provided, but might be inefficient. 3/ is a way to make things more efficient, and in more ways than you'd think.

One notorious difference between C++ and D is that in C++, objects must have a fixed address, while in D, they do not. This means that the D compiler is free to move objects around. This has more consequences than one would expect. Consider for instance the following sample code: https://godbolt.org/z/9hWzxb

It is clear from the disassembly that *2* dereferences are happening, when the C++ code only has one. How come?

Because structs in C++ are not movable by default, and the struct preexists the function call, it must be living somewhere in the caller's stack, and is passed by reference at the ABI level. To make it look like it is passed by value, the caller will make a copy and then destroy it after the call.

In practice, this is worse than it looks in this simplified example. Not only does this mean that a vast number of dereferences are executed across the program, but this is only the first-order effect. Because numerous things now go through pointers, it forces the optimizer to prove a ton of things using alias analysis that would be self evident without that extra indirection, and in many cases, it cannot. For instance:

void foo(unique<int> a, unique<int> b) {
  a = make_unique<int>(...);
  // The compiler has to assume that b may have been modified here, because both a and b could be the same object, and therefore b be modified by this new assignment.
}

This is a major problem of the C++ object model. This is not a problem D has at the moment. This is simply the wrong default. This leaves us with one more requirement:

4/ Objects must be usable by value at the ABI level, including objects with ctor/dtor.

The obvious problem with this is interior pointers, and I'll come back to them later on. It must however be understood that the vast majority of struct usage does not involve interior pointers, and therefore throwing away 4/ for interior pointer support seems to be self defeating.

Another peculiarity of the C++ object model can be seen in this sample: https://godbolt.org/z/KGeffj

In the generated code, we can see that the smart pointer is allocated and initialized, then passed to the function by reference, which we expect. But what's interesting comes after the function call: there is a test and a branch. The generated code tests the value of the smart pointer, and only frees the memory if the pointer isn't null. For readability, I made the function nothrow, but remove it and you'll see that a vast amount of code has to be generated for exception handling too.

This is happening because the function could have moved the object away. While the sensible way to handle an object that has been moved is to not use it at all, this was not possible for C++ for backward compatibility reasons. As a result, a moved object is put in a "null" state, where the destruction is a no-op. This null state is the source of a ton of extra work by destructors and a ton of generated code for nothing.

If the function moves the object, then we want no destruction at all, and if it doesn't, we want to destroy without checking against a null state. Obviously, we want the callee to do that, as the caller doesn't have the information, but that turns out to be a complex task when the object is passed by ref to the callee due to the previous problem.

This leaves us with one more requirement:
5/ Objects must not require a null state.
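
For comparison, D's existing library `move` in `core.lifetime` already has this null-state shape for structs with a destructor: the source is reset to its `.init` value, and the destructor has to tolerate that value. A minimal sketch, added for illustration and not from the original post (the `Handle` struct and the `released` counter are hypothetical):

```d
import core.lifetime : move;

int released; // counts destructor runs that actually had something to release

struct Handle
{
    int* p;

    @disable this(this); // not copyable, only movable

    ~this()
    {
        // The check is needed because a moved-from Handle is still destroyed.
        if (p !is null)
            ++released;
    }
}

void main()
{
    {
        auto a = Handle(new int(1));
        auto b = move(a);     // a is reset to Handle.init
        assert(a.p is null);  // the "null state"
        assert(b.p !is null);
    } // both a and b are destroyed here
    assert(released == 1);    // only b had anything to release
}
```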

It is to be noted that the current DIP proposes to add 3/ to D, but at the cost of either 1/ or 5/ being broken, which IMO is self defeating. Let's see why.
In the currently proposed scheme, the move constructor or move assignment has access to the object being moved from. One of the following MUST happen:

a/ The previous object is left as is after the move. This means that 1/ is not ensured by default by the constructor anymore, as any leftover won't be destroyed. It is easy to say that devs will be careful, but it is pretty much bound to fail, because it does the wrong thing by default - a change in the object might break 1/ silently - and it can do so non-locally - a change to a member of a member of a member can break 1/ silently.

b/ The previous object is destroyed, but in this case, we ought to place it in a null state as we move. This is the approach C++ is taking and it breaks 5/.

This problem is fundamentally unavoidable, because we have 2 objects when semantically we should only have one. We have to either do something with the leftovers (which breaks 5/) or ignore the leftovers (which breaks 1/). There are no other options than do something or do nothing once that path is taken.

So, what do we want to do with move constructors anyways? Can't we just move the struct field by field recursively and be done with it? Yes, and I'd argue there is a problem if this isn't enough for 95% of the cases. Which leads to the two use cases I was able to identify:
 - Non movable struct. It is important that such a struct doesn't move. For instance, when the struct is some sort of header of a larger data segment. Another example is a struct that represents some kind of guard that needs to see its construction/destruction done in order. This can be achieved by disabling the move constructor, whatever the move constructor is defined as. It is fairly easy to realize such a use case: the move constructor simply needs to exist at all.
 - Movable struct that requires some form of bookkeeping. For these cases, a postblit would work with one exception: interior pointers.

What I refer to as interior pointers are structs containing pointers to elements which are within the struct itself. While this idiom exists, it is vanishingly rare and becoming rarer over time. The main reason for this is that memory has become slower, computation faster, and pointers larger, which in turn led people to use "relative pointers", namely pointers defined as an offset from this. Unless it is expected that the struct may be more than 4GB in size - which is essentially never the case - it's all good. The extra addition required is well worth the memory saved (and the increased cache hit rate that results from it). See https://www.youtube.com/watch?v=G3bpj-4tWVU for instance on how the swift runtime started using such techniques.
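
A minimal sketch of the relative pointer idea, added for illustration and not from the original post (the `Cursor` struct and its members are hypothetical). The struct stores a byte offset from its own address instead of an absolute pointer into itself, so a plain bitwise relocation keeps it valid with no fix-up:

```d
struct Cursor
{
    char[8] buf;
    ptrdiff_t offset; // byte distance from the start of this struct to the target

    // Record the target as an offset from `this`, not as an absolute address.
    void pointAt(size_t i)
    {
        offset = (cast(ubyte*) &buf[i]) - (cast(ubyte*) &this);
    }

    // Resolve the offset against wherever the struct lives right now.
    char target()
    {
        return *cast(char*)((cast(ubyte*) &this) + offset);
    }
}

void main()
{
    Cursor a;
    a.buf[] = "abcdefgh";
    a.pointAt(3);
    assert(a.target() == 'd');

    Cursor b = a;   // plain bitwise copy, standing in for a relocation
    a.buf[] = 'x';  // scribble over the old location
    assert(b.target() == 'd'); // still resolves inside b, not inside a
}
```

An absolute `char*` into `buf` would have kept pointing into `a` after the copy, which is exactly the interior pointer problem described above.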

I'll be blunt: once these techniques are known, I've actually never encountered a case of interior pointers that would not be solved by disabling move altogether. I'm not pretending it doesn't exist, but I've never seen it. It simply doesn't make sense to sacrifice any of the above mentioned requirements for it, even if it turns out this is really needed, because, well, this is the edge case of the edge case, and while enabling it might be an option, throwing away things which are good in the general case for it just doesn't make sense.

I suspect that even then, making the struct unmovable and then defining a custom method to move it manually would do the trick just fine. But just in case, here is what I propose: simply add an intrinsic, such as `void* __pre_move_address()`, that can be called in the postblit, returning the address of the pre-move object. Any object using it would, of course, discard 4/ and not be usable as a value and instead always be passed by reference at the ABI level. This is the least constraining requirement to break, because it impacts exclusively performance and never correctness like 1/ or 5/ would. However, considering it is possible to do it in a custom way once you disable move, I strongly suspect the bang is not worth the effort.

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by deadalnix
in reply to Q. Schroll

On Wednesday, 17 March 2021 at 17:14:04 UTC, Q. Schroll wrote:
> It might be a stupid question, but why have move assignment in the first place? In C++, there's the copy-and-swap idiom[1]. Maybe it's obvious why it does not apply in D, but if using a swap function makes implementing a copy assignment and move assignment trivial, why not requiring opSwap instead of opAssign for an elaborate move object?
>

This isn't a stupid question, this is THE question. It is easy to assume things are necessary because others went there and did it, but I find that questioning these assumptions is how the greatest design ideas come up.

D's pure attribute is the perfect example of this.

> [1] https://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap-idiom

Doing this has a major issue: it requires all movable structs to have a null state (as in C++) or to make other unsavory tradeoffs (see https://forum.dlang.org/post/bkfqchwpnonngjrtybbe@forum.dlang.org for a more thorough explanation).

Nevertheless, if the struct naturally has a null state, this is indeed a very good way to do it.
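
A minimal sketch of copy-and-swap in today's D, added for illustration and not from the original post (the `Buffer` struct is hypothetical, and this uses the existing postblit and by-value `opAssign`, not anything proposed in the DIP). The argument arrives as a fresh copy or a moved rvalue, and assignment just swaps it in:

```d
import std.algorithm.mutation : swap;

struct Buffer
{
    int[] data;

    this(this)
    {
        data = data.dup; // copies own their storage
    }

    // By-value parameter: lvalue arguments are copied, rvalues are moved in.
    void opAssign(Buffer rhs)
    {
        swap(data, rhs.data); // rhs leaves with the old contents
    }
}

void main()
{
    auto a = Buffer([1, 2, 3]);
    auto b = Buffer([9]);

    b = a;           // a is copied into rhs, then swapped into b
    b.data[0] = 42;

    assert(a.data == [1, 2, 3]); // a is untouched
    assert(b.data == [42, 2, 3]);
}
```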

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by deadalnix
in reply to Walter Bright

On Thursday, 18 March 2021 at 09:07:02 UTC, Walter Bright wrote:
> On 3/16/2021 3:30 PM, deadalnix wrote:
>> This is self evident. This is so obvious that I don't know how to unpack it any further.
>
> I'm sorry, I just don't understand your objection.

I hope this post will make it clearer: https://forum.dlang.org/post/bkfqchwpnonngjrtybbe@forum.dlang.org

It's a bit lengthy, but clearly we are not working from the same set of assumptions, so it is necessary to dig deeper.

March 18, 2021

Re: Discussion Thread: DIP 1040--Copying, Moving, and Forwarding--Community Review Round 1

Posted by deadalnix
in reply to Walter Bright

On Thursday, 18 March 2021 at 09:51:27 UTC, Walter Bright wrote:
> If a field with a move constructor is added to a struct S without one, a default move constructor will be created for S that calls the field's move constructor.
>

This is all good so far.

> If one is added to a struct with an explicit move constructor, it is up to the struct programmer to fold it in explicitly, just like he does for explicit constructors and destructors.

As explained in https://forum.dlang.org/post/bkfqchwpnonngjrtybbe@forum.dlang.org , this is where things go wrong. There are a few ways this can be designed, but they all drop some important property that would be detrimental overall as we don't gain much in return.

The error you are making is treating this like ctor and dtor, when it is fundamentally different in nature. Yes, in both cases, you expect the dev to do something sensible, but the comparison stops there. And you know it is not enough, because if it were, then why do array bounds checks? Or have ctor/dtor at all to begin with? Just call the destroy function in all code paths and be done with it!

The reality is that the set of assumptions broken here is much larger than for ctor/dtor, even going as far as breaking assumptions provided by ctor/dtor, such as guaranteed pairwise construction/destruction.

Copyright © 1999-2021 by the D Language Foundation