October 15
On Tuesday, 15 October 2024 at 12:56:35 UTC, RazvanN wrote:

>> Isn't this the exact moment that the recursion ends? If the copy ctor was an exact match (we must have been supplied an lvalue), and (therefore) while considering the move constructor it was determined that a copy is necessary, then it is not an exact match... copy ctor wins. Case closed.
>>

Note that today, from the compiler's perspective both the move ctor and
the copy ctor are exact matches. However, the compiler does a thing called
partial ordering where it selects the function that is more specialized.
This is where ref gets picked over rvalue because it is deemed more specialized.

So, all you need to do is just tweak this and simply add a check for the situation
where a copy constructor is preferred over the move constructor.
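
For readers following along, here is a minimal sketch of the by-`ref` versus by-value selection described above, using plain functions rather than constructors. The names `T`, `f`, `callWithLvalue` and `callWithRvalue` are made up for illustration, and the sketch assumes default compiler settings (no rvalue-reference preview switch).

```d
struct T { int x; }

// Each overload returns a tag so the selected one is visible.
string f(ref T t) { return "ref"; }
string f(T t)     { return "value"; }

string callWithLvalue() { T t; return f(t); }  // lvalue argument
string callWithRvalue() { return f(T(1)); }    // rvalue argument

void main()
{
    // An lvalue can bind to both overloads; the ref one is preferred.
    assert(callWithLvalue() == "ref");
    // An rvalue cannot bind to ref, so only f(T) is viable.
    assert(callWithRvalue() == "value");
}
```

The lvalue call is the interesting one: both overloads are viable, and the tie is broken in favour of `ref`.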

October 16
On Tue, 15 Oct 2024, 19:56 Arafel via Digitalmars-d, < digitalmars-d@puremagic.com> wrote:

> On 15/10/24 11:26, Manu wrote:
> > Show me a case study; what might you do with an rvalue constructor if not initialise an instance from an rvalue?
>
> I think one such case might be metaprogramming. Consider:
>
> ```d
> struct S {
>         int i;
>         this(C)(C c) if (is(C : int)) {
>                 this.i = c;
>         }
>
>         alias i this;
> }
>
> void main() {
>         S s1, s2, s3;
>         int i = 1;
>         s1 = S(1);
>         s2 = S(i);
>         s3 = S(s1); // This was most certainly not intended as a move constructor.
> }
> ```
>

Your example is a valid move constructor though... so even if the S(S) case were reinterpreted under the new move semantics, it's fine. It's almost impossible to imagine such an example where this isn't true.


> This example might seem artificial, and it is, but just imagine any
> templated constructor that uses DbI, and where the own type matches.
>
> The actual inner details (i.e. that `S` is instantiated) might be even unknown to the caller, for instance if it comes from the user side of an API through a templated function using IFTI.
>
> Also, you cannot use `ref` because you want it to accept both r- and l-values, and in any case there might be good reasons why this isn't desirable in metaprogramming.
>

That's what auto ref is for, if this specifically is your problem... and as I described before; there's a quite likely chance that instances of this pattern in the wild were actually written by D amateurs, and the code is actually broken, or doesn't actually quite do what they think they were trying to do.
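
As a rough illustration of the `auto ref` suggestion, the sketch below uses a free function template instead of a constructor to keep it small. `howPassed`, `passLvalue` and `passRvalue` are hypothetical names; `__traits(isRef, ...)` is the existing trait that reports which instantiation was chosen.

```d
struct S { int i; }

// `auto ref` yields a by-ref instantiation for lvalue arguments and a
// by-value instantiation for rvalue arguments.
string howPassed(T)(auto ref T t)
{
    static if (__traits(isRef, t))
        return "by ref";
    else
        return "by value";
}

string passLvalue() { S s; return howPassed(s); }
string passRvalue() { return howPassed(S(1)); }

void main()
{
    assert(passLvalue() == "by ref");   // lvalue: no copy is made
    assert(passRvalue() == "by value"); // rvalue: taken by value
}
```

A single template therefore accepts both r- and l-values, and the body can tell the two apart at compile time.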



October 16
On Tue, 15 Oct 2024, 23:01 RazvanN via Digitalmars-d, < digitalmars-d@puremagic.com> wrote:

> On Tuesday, 15 October 2024 at 09:33:59 UTC, Manu wrote:
> > On Tue, 15 Oct 2024 at 01:56, RazvanN via Digitalmars-d < digitalmars-d@puremagic.com> wrote:
> >
> >> On Friday, 11 October 2024 at 16:12:39 UTC, Manu wrote:
> >> > On Thu, 10 Oct 2024, 17:10 Walter Bright via Digitalmars-d, < digitalmars-d@puremagic.com> wrote:
> >> >
> >> >> On 10/8/2024 10:42 PM, Manu wrote:
> >> >> > Can you show us some cases?
> >> >>
> >> >> I'd get infinite recursion with overload resolution, because the compiler will
> >> >> try and match the argument to `S` and `ref S`, made even more complicated with
> >> >> rvalue references enabled.
> >> >>
> >> >
> >> > I don't understand; was the argument an rvalue or an lvalue?
> >> > It is not at all ambiguous or difficult to select the proper overload
> >> > here... one should have been an exact match, the other would have required
> >> > a copy or conversion; making it an obviously less preferable match.
> >> >
> >>
> >> ```d
> >> struct S
> >> {
> >>      this(ref typeof(this));
> >>      this(typeof(this));
> >> }
> >>
> >> void fun(S);
> >>
> >> void main()
> >> {
> >>      S a;
> >>      fun(a);
> >> }
> >>
> >> ```
> >>
> >> When the fun(a) is called, the compiler will have to check both
> >> constructors to see which one is a better match. It first tries
> >> the copy constructor and sees that it's an exact match, then it
> >> proceeds to the next overload - the move constructor. Now it wants
> >> to see if the move constructor is callable in this situation.
> >> The move constructor receives its argument by value so the compiler
> >> will think that it needs to call the copy constructor (if it exists).
> >
> >
> > Isn't this the exact moment that the recursion ends? If the copy ctor was an exact match (we must have been supplied an lvalue), and (therefore) while considering the move constructor it was determined that a copy is necessary, then it is not an exact match... copy ctor wins. Case closed.
> >
>
> The way overload resolution works is that you try to match
> each function in the overload set and always save (1) the best
> matching level so far, (2) the number of matches and (3) a pointer to the
> best matching function (and potentially a second pointer to a
> second function, provided that you have 2 functions that have the
> same matching level). Once overload resolution is done you inspect
> these results and either pick a single function or error depending
> on what you get. This works without any special casings (there are
> minor special casings for unique constructors, but that's fairly
> non-invasive).
>
> If we were to accept `this(typeof(this))` as a move constructor, we would need
> to special case the overload resolution mechanism. I'm not saying
> it's not possible to implement, rather that we need to add this special case to a
> battle tested algorithm.
>
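
The bookkeeping described in the quoted text can be modelled roughly as below. This is a toy sketch, not the compiler's code: `MatchLevel`, `Candidate`, `Outcome` and `resolve` are invented names, the match levels are simplified, and the count here only tracks ties at the best level.

```d
// Simplified match levels, ordered from worst to best.
enum MatchLevel { none, conversion, qualifier, exact }

// Hypothetical stand-in for a function plus its computed match level.
struct Candidate { string name; MatchLevel level; }

struct Outcome
{
    MatchLevel best;   // (1) best matching level seen so far
    size_t count;      // (2) number of matches at that level
    string first;      // (3) best matching function
    string second;     //     a second function at the same level, if any
}

Outcome resolve(const Candidate[] overloads)
{
    Outcome o;
    foreach (c; overloads)
    {
        if (c.level == MatchLevel.none)
            continue;                               // not callable at all
        if (c.level > o.best)
            o = Outcome(c.level, 1, c.name, null);  // strictly better: start over
        else if (c.level == o.best)
        {
            ++o.count;                              // tie at the current best level
            if (o.second is null)
                o.second = c.name;
        }
    }
    return o;
}

void main()
{
    auto clear = resolve([Candidate("f(ref T)", MatchLevel.exact),
                          Candidate("f(long)", MatchLevel.conversion)]);
    assert(clear.count == 1 && clear.first == "f(ref T)");

    // Two exact matches: the caller must break the tie or report an error.
    auto tie = resolve([Candidate("a", MatchLevel.exact),
                        Candidate("b", MatchLevel.exact)]);
    assert(tie.count == 2 && tie.second == "b");
}
```

In this model nothing about any single candidate needs special treatment; the loop only compares levels, which is the property the quoted text wants to preserve.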

Can you explain this further?

void f(ref T) is not compatible with T() at all, so void f(T) is the only
possible match for the rvalue case. Today's algorithm works for that case.

The lvalue case could conceivably match to T or ref T (but ref T is obviously the superior selection, not requiring a superfluous copy)... but even now, the rvalue constructor already exists... what logic allows these to coexist in today's rules?

What is the special case you're describing?
Can you show me exactly what the algorithm does now that's incompatible
with selecting the version with proper ref-ness, and what change would be
necessary?


> In contrast, we could leave the overload resolution code untouched and simply
> give the move constructor a different identifier (what I mean is,
> you type `this(S)`, but internally the compiler gives the
> __movector name to the function). When the compiler inserts calls
> to the move constructor it will then need to do overload
> resolution using the name __movector. This seems to me like a
> more desirable alternative: you get the this(S) syntax (provided
> that Walter accepts the possibility of code breakage or code
> fix-up as you call it) and the overload resolution code remains intact.
> Additionally, you get smaller overload resolution
> penalties given that the overload sets get smaller.
>

Okay, but I've asked people to stop talking about move constructors... it seems to have poisoned everyone's brains with irrelevant focus. We're talking about function arguments in general, we don't need more bizarre edge cases, especially not so deep in the foundations.

Run your thought experiment with:
  void f(T);
  void f(ref T);

The constructor isn't special... don't special-case the constructor.


> >> Now, since the copy constructor is in the same overload
> >> set as the move constructor, both need to be checked to see their matching levels. => infinite recursion.
> >>
> >> That is how overload resolution works for any kind of function
> >> (constructor, destructor, normal function). The way to fix this
> >> is to either move the copy constructor and the move
> >> constructor into different overload sets or to special case
> >> them in the function resolution algorithm.
> >>
> >
> > Yeah sorry, I just don't see it.
> > When the pair are both defined; one is an exact match, and the other is
> > not. Given an rvalue, the move ctor is an exact match, given an lvalue, the
> > copy ctor is an exact match. There is no case where this is ambiguous?
>
> Before having copy ctors it was allowed to have this(ref S) and this(S)
> and it worked. So I don't see any opportunity for ambiguity. However, when you
> add some implicit constructor calls that's when things get a bit messy, however,
> that's not an unsolvable problem. It just depends on what sort of trade-offs you
> are willing to make from an implementation standpoint.


"Trade offs"? Do you have anything in mind?
I would be surprised if there are any 'trade-offs'; any change here will
probably be fixing language holes or broken edge cases... what actually
stands to break? What good-stuff™ could we possibly be 'trading' away?


October 16
On Tue, 15 Oct 2024, 23:06 RazvanN via Digitalmars-d, < digitalmars-d@puremagic.com> wrote:

> On Tuesday, 15 October 2024 at 12:56:35 UTC, RazvanN wrote:
>
> >> Isn't this the exact moment that the recursion ends? If the copy ctor was an exact match (we must have been supplied an lvalue), and (therefore) while considering the move constructor it was determined that a copy is necessary, then it is not an exact match... copy ctor wins. Case closed.
> >>
>
> Note that today, from the compiler's perspective both the move ctor and
> the copy ctor are exact matches. However, the compiler does a thing called
> partial ordering where it selects the function that is more specialized.
> This is where ref gets picked over rvalue because it is deemed more specialized.
>
> So, all you need to do is just tweak this and simply add a check for the situation
> where a copy constructor is preferred over the move constructor.
>

Okay, but again with the constructor; does that rule generalise to any regular function argument?



October 15
On 15/10/24 18:32, Manu wrote:
>     void main() {
>              S s1, s2, s3;
>              int i = 1;
>              s1 = S(1);
>              s2 = S(i);
>              s3 = S(s1); // This was most certainly not intended as a move constructor.
>     }
>     ```
> 
> 
> Your example is a valid move constructor though... so even if the S(S) case were reinterpreted under the new move semantics, it's fine. It's almost impossible to imagine such an example where this isn't true.
> 

AIUI a move constructor invalidates the source (i.e. it has ref semantics). I would certainly expect to be able to use `s1` after `s3` has been constructed, just like `i`.

> A _Move Constructor_ is a struct member constructor that moves, rather than copies, the argument corresponding to its first parameter into the object to be constructed. The argument is invalid after this move, and is not destructed.

There is also the issue with throwing: what happens if the templated constructor throws?

> 
>     This example might seem artificial, and it is, but just imagine any
>     templated constructor that uses DbI, and where the own type matches.
> 
>     The actual inner details (i.e. that `S` is instantiated) might be even
>     unknown to the caller, for instance if it comes from the user side of an
>     API through a templated function using IFTI.
> 
>     Also, you cannot use `ref` because you want it to accept both r- and
>     l-values, and in any case there might be good reasons why this isn't
>     desirable in metaprogramming.
> 
> 
> That's what auto ref is for, if this specifically is your problem... and as I described before; there's a quite likely chance that instances of this pattern in the wild were actually written by D amateurs, and the code is actually broken, or doesn't actually quite do what they think they were trying to do.
> 

Again the issue I see is that a move constructor invalidates the source. In a generic templated expression that would be a special case:

```d
struct S {
	this(T)(T t) { } // This now has special semantics when T == S
}
```

One possible option would be to consider only non-templated constructors as move constructors, but even then what happens if you need both a non-move templated constructor and a templated one with value semantics?

In any case, I personally feel a demeaning tone when referring to "D amateurs" that doesn't exactly help. Be that as it may, even if wrong, there can be generic code like this out there that works well enough for its purpose.

Curtly ordering these "amateurs" to change it is not what I would call "user-friendly".
October 15
On Tuesday, 15 October 2024 at 17:52:43 UTC, Arafel wrote:

> a move constructor invalidates the source.

Only if the source is not used after the construction/assignment. Otherwise, a copy is made. I'd recommend to skim through DIP1040 if only for fun.




October 15
On 15/10/24 20:29, Max Samukha wrote:
> On Tuesday, 15 October 2024 at 17:52:43 UTC, Arafel wrote:
> 
>> a move constructor invalidates the source.
> 
> Only if the source is not used after the construction/assignment. Otherwise, a copy is made. I'd recommend to skim through DIP1040 if only for fun.
> 

I had a look at it before posting, and according to it [1] (my bold):

> A Move Constructor for struct S is declared as:
>
>   ```d
>      this(S s) { ... }
>   ```
>
> [...]
>
> A _Move Constructor_ is a struct member constructor that moves, rather than copies, the argument corresponding to its first parameter into the object to be constructed. **The argument is invalid after this move**, and is not destructed.

Also, the examples I could see were about _implicit_ calls to the move constructor ([2]). I found no example of explicit calls to them, nor a description of what to expect in that case, but I might have missed it.

In any case, it would be then helpful to clarify it more prominently: what happens if a constructor with the signature of a move constructor is called explicitly like this?

```d
struct S {
	this (int i) { }
	this (S s) { }
}

void main() {
	S s1, s2;
	s1 = S(1);
	s2 = S(s1);
	// Is s1 valid here?
}
```

Because this is currently valid D code, even if Walter thinks it shouldn't be (as per the bug referenced in DIP1040 itself [3]), and s1 is perfectly valid at the end of the program.

[1]: https://github.com/dlang/DIPs/blob/master/DIPs/DIP1040.md#move-constructor
[2]: https://github.com/dlang/DIPs/blob/master/DIPs/DIP1040.md#assignment-after-move
[3]: https://issues.dlang.org/show_bug.cgi?id=20424
October 15

On Tuesday, 15 October 2024 at 19:29:11 UTC, Arafel wrote:

> ```d
> struct S {
> 	this (int i) { }
> 	this (S s) { }
> }
>
> void main() {
> 	S s1, s2;
> 	s1 = S(1);
> 	s2 = S(s1);
> 	// Is s1 valid here?
> }
> ```
>
> Because this is currently valid D code, even if Walter thinks it shouldn't (as per the bug referenced in DIP1040 itself [3]), and s1 is perfectly valid at the end of the program.

The object that s1 refers to stays alive, but it is destroyed once another object (S(41)) is assigned to s1. Also, if you don't `@disable this();`, objects s1 and s2 will be created twice and count == 3. For example:

```d
import std.stdio;

struct S
{
  int i;
  this (int i) { this.i = i; }

  S* s;
  this (ref S s) { this.s = &s; } // keeps a pointer to the source

  @disable this();
  ~this() { ++count; }            // count destructor calls
}

int count;
enum value = 42;

void main()
{
  auto s1 = S(value);
  auto s2 = S(s1);

  s1.tupleof.writeln(": ", &s1); // e.g. 42null: 7FFD6592A190
  assert(s2.s.i == value);

  s1 = S(41);

  assert(s2.s.i != value); // because s1.i is now 41
  assert(count == 1);
}
```

SDB@79

October 16
On 10/15/24 18:57, Manu wrote:
> On Tue, 15 Oct 2024, 23:06 RazvanN via Digitalmars-d, <digitalmars-d@puremagic.com> wrote:
> 
>     On Tuesday, 15 October 2024 at 12:56:35 UTC, RazvanN wrote:
> 
>      >> Isn't this the exact moment that the recursion ends? If the
>      >> copy ctor was an exact match (we must have been supplied an
>      >> lvalue), and (therefore) while considering the move
>      >> constructor it was determined that a copy is necessary, then
>      >> it is not an exact match... copy ctor wins. Case closed.
>      >>
> 
>     Note that today, from the compiler's perspective both the move ctor and
>     the copy ctor are exact matches. However, the compiler does a thing called
>     partial ordering where it selects the function that is more specialized.
>     This is where ref gets picked over rvalue because it is deemed more specialized.
> 
>     So, all you need to do is just tweak this and simply add a check for the situation
>     where a copy constructor is preferred over the move constructor.
> 
> 
> Okay, but again with the constructor; does that rule generalise to any regular function argument?
> 

The recursion issue is more of an implementation detail.

How does the compiler determine which of `foo(S)` and `foo(ref S)` is more specialized? Well, I'd expect it tries to call each with the other kind of argument.

Then, `foo(ref S)` cannot be called with (rvalue) `S`.

We still have to check the converse, as if that does not work, the overloads are ambiguous and the compiler has to error out: `foo(S)` can be called with `ref S`, but requires copy construction. So here, the compiler will check whether `ref S` is copyable.

So by default, if you have `this(ref S)` and `this(S)` as matches, to pick which one to choose, it will again try to call `this(S)` with a `ref S`. To check whether this call works, it has to be determined whether `S` is copyable, so it will try to find a copy constructor again, etc. Stack overflow.

So the situation is:

Overload resolution on { this(ref S), this(S) } recurses on itself.

Overload resolution on { foo(ref S), foo(S) } recurses on { this(ref S), this(S) }, which then recurses on itself.

I think it is hard to answer a question whether the "rule generalizes". Explicit base cases are needed that avoid stack overflow, at least for constructors. Other functions will then indeed work the same way, whether they are special-cased in the implementation as base cases or not.

I think this is not an insurmountable problem, just annoying to fix.
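
The recursion and the "explicit base case" idea can be illustrated with a toy model. This is only a sketch of the reasoning above and makes no claim about how the compiler is actually structured: `ToyResolver`, `accepts`, `copyable`, `nesting` and the depth limit of 50 (a stand-in for a stack overflow) are all invented for the example.

```d
enum Arg { lvalue, rvalue }
enum Param { byRef, byValue }

struct ToyResolver
{
    bool guarded;     // enable the explicit base case
    int depth;        // current nesting of constructor-set resolutions
    int maxDepth;     // deepest nesting seen
    enum limit = 50;  // stand-in for "stack overflow"

    // Can a parameter of kind `p` accept an argument of kind `a`?
    bool accepts(Param p, Arg a)
    {
        if (p == Param.byRef)
            return a == Arg.lvalue;  // `ref` binds lvalues only
        if (a == Arg.rvalue)
            return true;             // an rvalue goes straight into a by-value parameter
        return copyable();           // an lvalue must be copied first
    }

    // Is S copyable? Resolve { this(ref S), this(S) } against an lvalue.
    bool copyable()
    {
        if (guarded && depth > 0)
            return accepts(Param.byRef, Arg.lvalue); // base case: only this(ref S) counts
        if (depth >= limit)
            return false;            // give up: the naive scheme never bottoms out
        ++depth;
        if (depth > maxDepth) maxDepth = depth;
        scope (exit) --depth;
        immutable viaRef = accepts(Param.byRef, Arg.lvalue);
        immutable viaValue = accepts(Param.byValue, Arg.lvalue); // re-enters copyable()
        return viaRef || viaValue;
    }
}

// Models calling foo(S) with an lvalue and reports how deep resolution nested.
int nesting(bool guarded)
{
    auto r = ToyResolver(guarded);
    cast(void) r.accepts(Param.byValue, Arg.lvalue);
    return r.maxDepth;
}

void main()
{
    assert(nesting(false) == 50); // naive: recurses until the artificial limit
    assert(nesting(true) == 1);   // guarded: one level, then the base case applies
}
```

The guard only changes how the constructor set is resolved; the ordinary `foo(ref S)` / `foo(S)` pair goes through the same `accepts` logic either way, which matches the point that other functions then work the same way.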
October 16
On 10/15/24 21:29, Arafel wrote:
> 
> I had a look at it before posting, and according to it [1] (my bold):
> 
>  > A Move Constructor for struct S is declared as:
>  >
>  >   ```d
>  >      this(S s) { ... }
>  >   ```
>  >
>  > [...]
>  >
>  > A _Move Constructor_ is a struct member constructor that moves, rather than copies, the argument corresponding to its first parameter into the object to be constructed. **The argument is invalid after this move**, and is not destructed.

You are right assuming move constructors work as shown in DIP1040. However, I think this design is not really workable, and I think Manu is arguing from a position of assuming that the argument needs to remain valid and will be destroyed by default.

A benefit of `=this(ref S)` syntax or similar is that it supports the design where the argument is not destructed without additional language features to elide the destructor call, and without a special case where destructor elision may be unexpected given the syntax.

However, I think it either has to leave the argument in a valid state or explicit moves have to blit `.init` over the argument after calling the move constructor.
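
For comparison, today's library-level move already follows the "blit `.init` over the argument" approach for types with a destructor. The sketch below uses `core.lifetime.move`; the `Handle` type and `sourceIdAfterMove` are hypothetical names chosen for the example, and it says nothing about how a language-level move constructor would behave.

```d
import core.lifetime : move;

struct Handle
{
    int id = -1;   // -1 marks the "empty" (.init) state
    ~this() { }    // having a destructor makes move() reset the source
}

int sourceIdAfterMove()
{
    auto a = Handle(7);
    auto b = move(a);   // a's contents go to b; a is reset to Handle.init
    assert(b.id == 7);
    return a.id;
}

void main()
{
    // The source stays valid and safely destructible after the move.
    assert(sourceIdAfterMove() == -1);
}
```

Because the source ends up in its `.init` state, its destructor can still run without affecting the moved-to object.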