November 19, 2012
On Monday, 19 November 2012 at 05:22:38 UTC, Jonathan M Davis wrote:
> On Monday, November 19, 2012 06:01:55 Rob T wrote:

> postblit constructors and opAssign aren't really related. The postblit
> constructor is used when a _new_ instance is being constructed (it plays the
> same role as a copy constructor in C++). opAssign overloads the assignment
> operator and is only used when the assignment operator is used, which does
> _not_ happen when constructing a new instance but only when replacing the
> value of an instance with that of another.

Is this correct? From an implementation point of view, it looks like opAssign is related to postblit, in that it does call postblit first.
From the spec:
    Struct assignment t=s is defined to be semantically equivalent to:
      t = S.opAssign(s);
    where opAssign is a member function of S:
    S* opAssign(S s)
    {   ... bitcopy *this into tmp ...
      ... bitcopy s into *this ...
      ... call destructor on tmp ...
      return this;
    }

The spec does not mention postblit here, but the compiler does call it.
When assigning one object to another, it will first blit, then run your custom postblit if you have written one. One benefit: if you want deep-copy semantics and your postblit does the work to provide them, you do not need an opAssign at all, because your postblit will be called on assignment too. I think this is a step up from C++.

The example below prints:
----------------------------------------------
Begin assign
postblit A
End assign
----------------------------------------------

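import std.stdio;

struct A {
  // Postblit: runs on the new copy right after the bitwise copy,
  // so the copy gets its own array instead of sharing the source's.
  this(this) { c = c.dup; writeln("postblit A"); }
  char[] c;
}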
struct B { A a; }
struct C { B b; }
struct D { C c; }

void main() {
  D d1, d2;
  d1.c.b.a.c = ['a','b','c'];
  writeln("Begin assign");
  d2 = d1;
  writeln("End assign");
}
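
To check that the assignment really yields an independent deep copy, rather than relying on the printed message alone, the same nesting can be tested with asserts. This is only a sketch of the behavior described above; `assignmentDeepCopies` is a helper name made up for illustration.

```d
import std.stdio;

struct A {
    char[] c;
    // Postblit: runs on the new copy after the bitwise copy.
    this(this) { c = c.dup; }
}
struct B { A a; }
struct C { B b; }
struct D { C c; }

// Hypothetical helper: assigns one D to another and reports whether
// the target ended up with its own, equal copy of the array.
bool assignmentDeepCopies() {
    D d1, d2;
    d1.c.b.a.c = ['a', 'b', 'c'];

    d2 = d1;  // plain assignment; no user-defined opAssign at any level

    bool equalContents = d2.c.b.a.c == d1.c.b.a.c;
    bool separateBuffers = d2.c.b.a.c.ptr !is d1.c.b.a.c.ptr;

    // Writing through the copy must not affect the original.
    d2.c.b.a.c[0] = 'X';
    bool originalIntact = d1.c.b.a.c == "abc";

    return equalContents && separateBuffers && originalIntact;
}

void main() {
    writeln("deep copy on assignment: ", assignmentDeepCopies());
}
```

If the postblit were not called on assignment, both objects would share one buffer and the helper would return false.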

November 19, 2012
On 11/19/12, Rob T <rob@ucora.com> wrote:
> perhaps best
> done using the C library's memcpy function.

I think the safest thing you can do is:

// Member-wise assignment: each field of rhs is assigned to the matching field of this.
void oldAssign(Type rhs)
{
    this.tupleof = rhs.tupleof;
}
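
A minimal sketch of how that could look in context, assuming a struct whose field has a postblit. `Inner`, `Outer`, and `tupleofAssignCopies` are names invented for this example; `oldAssign` is the function from above with a concrete type filled in. Since the `tupleof` assignment expands to one assignment per field, I would expect each field's own copy semantics (including any postblit) to still apply, which a plain memcpy would not give you:

```d
import std.stdio;

struct Inner {
    int[] data;
    this(this) { data = data.dup; }  // deep copy on every copy
}

struct Outer {
    Inner inner;
    int tag;

    // Member-wise assignment: expands to inner = rhs.inner; tag = rhs.tag;
    void oldAssign(Outer rhs) {
        this.tupleof = rhs.tupleof;
    }
}

// Hypothetical helper: reports whether oldAssign copied every field
// and left the target with its own array.
bool tupleofAssignCopies() {
    Outer a, b;
    a.inner.data = [1, 2, 3];
    a.tag = 7;

    b.oldAssign(a);

    return b.tag == 7
        && b.inner.data == [1, 2, 3]
        && b.inner.data.ptr !is a.inner.data.ptr;
}

void main() {
    writeln("tupleof assignment deep-copies: ", tupleofAssignCopies());
}
```

Taking `rhs` by value, as in the original suggestion, costs one extra copy of the argument; the field-by-field assignment then copies again.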
November 19, 2012
On Monday, 19 November 2012 at 09:37:35 UTC, Jonathan M Davis wrote:
> On Monday, November 19, 2012 10:29:21 Rob T wrote:
>> the D language specification (which is currently MIA).
>
> The online documentation _is_ the official spec, though it definitely doesn't
> have enough detail to be unambiguous, and in some cases, it's not properly up-
> to-date.
>
> - Jonathan M Davis

Well yes, there is a spec, and it's pretty good in some areas, but it's just not precise enough in others, such as what we're discussing here. I'd like to see the mechanism surrounding the copy/move semantics described in full detail as part of the language spec; otherwise it's a bit risky to rely on these behaviors if they are only considered compiler optimizations.

I know that Walter started D from the POV of a compiler developer, so he probably does consider the optimizations to be a part of the spec, but I'd like to see that in writing somewhere to make it rock-solid official. There's mention of this in TDPL, but again it's written up as an optimization; you could also get the impression that it is a language feature, but it's not exactly clear.

Note that I'm picking on this topic because it's a foundation that just about everything else is built upon, so it really needs to be thoroughly documented as part of the spec.

--rt
November 23, 2012
On Monday, 19 November 2012 at 12:10:32 UTC, Dan wrote:

Just following up to get confirmation. Hopefully Jonathan or a similar expert can follow up.

Here is a strong statement:

If, for any struct S, you implement a postblit, then there is no need to implement opAssign to get a working assignment operator for type S, because by design the postblit is already called by the default opAssign. This is the behavior I see, but I may be missing something, since the language specification does not mention postblit, only blit.

Thanks
Dan

> On Monday, 19 November 2012 at 05:22:38 UTC, Jonathan M Davis wrote:
>> On Monday, November 19, 2012 06:01:55 Rob T wrote:
>
>> postblit constructors and opAssign aren't really related. The postblit
>> constructor is used when a _new_ instance is being constructed (it plays the
>> same role as a copy constructor in C++). opAssign overloads the assignment
>> operator and is only used when the assignment operator is used, which does
>> _not_ happen when constructing a new instance but only when replacing the
>> value of an instance with that of another.
>
> Is this correct? From an implementation point of view, it looks like opAssign is related to postblit, in that it does call postblit first.
> From the spec:
>     Struct assignment t=s is defined to be semantically equivalent to:
>       t = S.opAssign(s);
>     where opAssign is a member function of S:
>     S* opAssign(S s)
>     {   ... bitcopy *this into tmp ...
>       ... bitcopy s into *this ...
>       ... call destructor on tmp ...
>       return this;
>     }
>
> It does not say postblit as well, but it does call it.
> When assigning one object into another it will first blit, then custom postblit if you have written one. A benefit of this is, if you want deep copy semantics and postblit does the work to provide it - you do not need an opAssign at all, as your postblit will be called. I think this is a step up over C++.
>
> The example below prints:
> ----------------------------------------------
> Begin assign
> postblit A
> End assign
> ----------------------------------------------
>
> import std.stdio;
> import std.traits;
>
> struct A {
>   this(this) { c = c.dup; writeln("postblit A"); }
>   char[] c;
>
> }
> struct B { A a; }
> struct C { B b; }
> struct D { C c; }
>
> void main() {
>   D d1, d2;
>   d1.c.b.a.c = ['a','b','c'];
>   writeln("Begin assign");
>   d2 = d1;
>   writeln("End assign");
> }


November 23, 2012
On Monday, 19 November 2012 at 12:10:32 UTC, Dan wrote:
> [...]
> provide it - you do not need an opAssign at all, as your postblit will be called. I think this is a step up over C++.
>
> The example below prints:
> ----------------------------------------------
> Begin assign
> postblit A
> End assign
> ----------------------------------------------
>
> import std.stdio;
> import std.traits;
>
> struct A {
>   this(this) { c = c.dup; writeln("postblit A"); }
>   char[] c;
>
> }
> struct B { A a; }
> struct C { B b; }
> struct D { C c; }
>
> void main() {
>   D d1, d2;
>   d1.c.b.a.c = ['a','b','c'];
>   writeln("Begin assign");
>   d2 = d1;
>   writeln("End assign");
> }

That's VERY interesting indeed and originally I had no idea it would do this without a custom opAssign at each level.

This kind of behavior *really* needs to be documented in precise detail, it's rather critical to know.

--rt

November 24, 2012
On Friday, 23 November 2012 at 22:31:46 UTC, Rob T wrote:
> That's VERY interesting indeed and originally I had no idea it would do this without a custom opAssign at each level.
>
> This kind of behavior *really* needs to be documented in precise detail, it's rather critical to know.

 It IS documented. TDPL - pg. 248
[quote]
  The second step (the part with 'transitive field') of the postblit copy process deserves a special mention. The rationale for that behavior is [i]encapsulation[/i]-the postblit constructor of a struct object must be called even when the struct is embedded in another struct object. Consider, for example, that we make Widget a member of another struct, which in turn is a member of yet another struct:

(included from pg. 246)
[code]
struct Widget {
  private int[] array;
  this(uint length) {
    array = new int[length];
  }
  // postblit constructor
  this(this){
    array = array.dup;
  }
  //As Before
  int get(size_t offset) { return array[offset]; }
  void set(size_t offset, int value) { array[offset] = value; }
}

struct Widget2 {
  Widget w1;
  int x;
}

struct Widget3 {
  Widget2 w2;
  string name;
  this(this) {
    name = name ~ " (copy)";
  }
}
[/code]

  Now, if you want to copy around objects that contain Widgets, it would be pretty bad if the compiler forgot to properly copy the Widget subobjects. That's why when copying objects of type Widget2, a call to this(this) is issued for the w1 subobject, even though Widget2 does not intercept copying at all. Also, when copying objects of type Widget3, again this(this) is invoked for the field w1 of field w2. To clarify:

[code]
unittest {
  Widget2 a;
  a.w1 = Widget(10);                  // Allocate some memory
  auto b = a;                         // this(this) called for b.w1
  assert(a.w1.array !is b.w1.array);  // Pass

  Widget3 c;
  c.w2.w1 = Widget(20);
  auto d = c;                         // this(this) called for d.w2.w1
  assert(c.w2.w1.array !is d.w2.w1.array);  // Pass
}
[/code]
[/quote]
November 24, 2012
On Saturday, 24 November 2012 at 20:47:17 UTC, Era Scarecrow wrote:
> On Friday, 23 November 2012 at 22:31:46 UTC, Rob T wrote:
>> That's VERY interesting indeed and originally I had no idea it would do this without a custom opAssign at each level.
>>
>> This kind of behavior *really* needs to be documented in precise detail, it's rather critical to know.
>
>  It IS documented. TDPL - pg. 248
> [quote]
>   The second step (the part with 'transitive field') of the postblit copy process deserves a special mention. The rationale for that behavior is [i]encapsulation[/i]-the postblit constructor of a struct object must be called even when the struct is embedded in another struct object. Consider, for example, that we make Widget a member of another struct, which in turn is a member of yet another struct:
>
> (included from pg. 246)
> [code]
> struct Widget {
>   private int[] array;
>   this(uint length) {
>     array = new int[length];
>   }
>   // postblit constructor
>   this(this){
>     array = array.dup;
>   }
>   //As Before
>   int get(size_t offset) { return array[offset]; }
>   void set(size_t offset, int value) { array[offset] = value; }
> }
>
> struct Widget2 {
>   Widget w1;
>   int x;
> }
>
> struct Widget3 {
>   Widget2 w2;
>   string name;
>   this(this) {
>     name = name ~ " (copy)";
>   }
> }
> [/code]
>
>   Now, if you want to copy around objects that contain Widgets, it would be pretty bad if the compiler forgot to properly copy the Widget subobjects. That's why when copying objects of type Widget2, a call to this(this) is issued for the w1 subobject, even though Widget2 does not intercept copying at all. Also, when copying objects of type Widget3, again this(this) is invoked for the field w1 of field w2. To clarify:
>
> [code]
> unittest {
>   Widget2 a;
>   a.w1 = Widget(10);                  // Allocate some memory
>   auto b = a;                         // this(this) called for b.w1
>   assert(a.w1.array !is b.w1.array);  // Pass
>
>   Widget3 c;
>   c.w2.w1 = Widget(20);
>   auto d = c;                         // this(this) called for d.w2.w1
>   assert(c.w2.w1.array !is d.w2.w1.array);  // Pass
> }
> [/code]
> [/quote]

Good catch on this(this) - it is documented well. But I think the questionable part is on assignment, not copy construction via postblit. For assignment the postblit *is* being called and the language spec (not TDPL) glosses over that. Also, not mentioned in TDPL is what happens if you do implement your own opAssign. I think some of the magic goes away if I'm not mistaken (i.e. those well-crafted postblits will not be called). I think this should be documented as well.
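
To make the opAssign concern concrete, here is a small sketch of the case I have in mind, as I understand the behavior. All names (`Payload`, `Plain`, `Custom`, `Result`, `assignOnce`) and the static counter are made up for illustration. `Plain` relies on the default assignment, while `Custom` has a user-defined opAssign that takes its argument by reference, so no copy of the right-hand side is ever made:

```d
import std.stdio;

struct Payload {
    static int postblits;  // counter added for illustration only
    int[] data;
    this(this) { data = data.dup; ++postblits; }
}

// No opAssign: assignment uses the compiler-generated one.
struct Plain { Payload p; }

// Hypothetical user-defined opAssign taking its argument by reference.
// No copy of rhs is made, so no postblit runs; the slice ends up shared.
struct Custom {
    Payload p;
    void opAssign(ref Custom rhs) { p.data = rhs.p.data; }
}

struct Result { int postblits; bool sharesBuffer; }

// Performs one assignment of type T and reports how many postblits ran
// and whether source and target point at the same array afterwards.
Result assignOnce(T)() {
    T a, b;
    a.p.data = [1, 2, 3];
    Payload.postblits = 0;
    b = a;
    return Result(Payload.postblits, a.p.data.ptr is b.p.data.ptr);
}

void main() {
    writeln("Plain:  ", assignOnce!Plain());
    writeln("Custom: ", assignOnce!Custom());
}
```

An opAssign that takes its argument by value would behave differently, since copying an lvalue argument into the parameter goes through the postblit.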

Thanks
Dan
November 25, 2012
On Saturday, 24 November 2012 at 20:47:17 UTC, Era Scarecrow wrote:
>> This kind of behavior *really* needs to be documented in precise detail, it's rather critical to know.
>
>  It IS documented. TDPL - pg. 248
> [quote]

Thanks for pointing out where the postblit stuff is documented. When I first started learning the language, I did read that part a few times over, but I found it frustratingly hard to grasp. I will re-read that section.

TDPL is a good book, but it is not the official spec, nor is it even a spec, it's a book that covers some aspects of how to use the language. How copy and assignments work in D really needs to be 100% documented in the language spec to ensure that it is officially a part of the language and not a clever compiler optimization that may or may not be implemented.

To make things much clearer, the documentation should perhaps contain a flow chart showing the order of execution, with plenty of examples that show edge cases and best practices.

--rt

November 25, 2012
On Sunday, 25 November 2012 at 00:12:04 UTC, Rob T wrote:
>
> Thanks for pointing out where the postblit stuff is documented. When I first started learning the language, I did read that part a few times over, but I found it frustratingly hard to grasp. I will re-read that section again.

This should be MUCH better documented. Most users (C++ users in particular) are surprised by this behavior, and it creates a great deal of confusion.

> TDPL is a good book, but it is not the official spec, nor is it even a spec, it's a book that covers some aspects of how to use the language. How copy and assignments work in D really needs to be 100% documented in the language spec to ensure that it is officially a part of the language and not a clever compiler optimization that may or may not be implemented.

AFAIK, there is no "official spec". And even if there were, the "de-facto" spec *is* TDPL... minus everything that could have changed since its printing.


November 25, 2012
On Sunday, 25 November 2012 at 11:05:37 UTC, monarch_dodra wrote:
>
> AFAIK, there is no "official spec". And even if there were, the "de-facto" spec *is* TDPL... minus everything that could have changed since its printing.

I think TDPL is great, but there is a doc called "D Language Specification" which is the perfect level for reference; it is just outdated and incomplete right now. Couldn't Walter just open source this document and retain final say on the updates? With C++ we had the ARM and Stroustrup's book. For D we have the spec and TDPL. Put the source of the language spec out in some markup language, let others help to bring it up to snuff, and generate PDFs and other formats. The only tricky part is documenting areas where the language designer(s) have open issues without a commitment to a solution.

Take the opAssign issue in question. In TDPL 7.1.5.1 we have "Recall that Widget holds a private int[] member that was supposed to be distinct for each Widget object. Assigning w2 to w1 field by field assigns w2.array to w1.array—a simple assignment of array bounds, without actually copying the array contents. This needs fixing because what we want is to create a duplicate of the array in the source Widget and assign that duplicate to the target Widget."

He then goes on to describe how user code can intercept opAssign to make things right. But, unless I'm missing something, that is unnecessary because the default opAssign carries the call to postblit (see previous example). Now look at the issue from the language spec (Structs & Unions - Assignment Overload):

Struct assignment t=s is defined to be semantically equivalent to:
t = S.opAssign(s);
where opAssign is a member function of S:
S* opAssign(S s)
{   ... bitcopy *this into tmp ...
    ... bitcopy s into *this ...
    ... call destructor on tmp ...
    return this;
}

This description is almost perfect; it just fails to mention the postblit, i.e. it should say "... bitcopy and postblit". Both docs are incorrect with respect to the behavior of the default opAssign. I think the language spec should be fixed first and kept accurate.
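
For what it's worth, here is how I would write the spec's pseudocode by hand with the postblit step included. This is a sketch that mirrors the behavior I observe, not the compiler's actual generated code; `S`, `Counts`, `assignAndCount`, and the static counters are made up for illustration, and `swap` is the one from std.algorithm:

```d
import std.algorithm : swap;
import std.stdio;

struct S {
    static int postblits, dtors;  // counters for illustration only
    int[] data;
    this(this) { data = data.dup; ++postblits; }
    ~this() { ++dtors; }

    // Hand-written stand-in for the default assignment.
    ref S opAssign(S s) {  // by value: an lvalue argument is blitted, then postblit runs on s
        swap(this, s);     // bitwise exchange: this gets the copy, s gets the old value
        return this;       // the old value (now in s) is destroyed on return
    }
}

struct Counts { int postblits; int dtors; bool independent; }

// Performs one assignment and reports the postblit and destructor counts
// as of right after it, plus whether the target owns a separate array.
Counts assignAndCount() {
    auto a = S([1, 2, 3]);
    auto b = S([9]);
    S.postblits = 0;
    S.dtors = 0;

    b = a;

    return Counts(S.postblits, S.dtors,
                  b.data == a.data && b.data.ptr !is a.data.ptr);
}

void main() {
    writeln(assignAndCount());
}
```

The three steps line up with the spec text: the by-value parameter is the "bitcopy and postblit", the swap covers both bitcopies, and the parameter going out of scope is the "call destructor on tmp".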

Thanks
Dan