June 21, 2011
On 2011-06-20 18:59, Michel Fortin wrote:
> On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer" <schveiguy@yahoo.com> said:
> > On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin@michelf.com> wrote:
> >> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.
> > 
> > BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.
> 
> Well, if
> 
> 	a ~= S();
> 
> does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a @disabled postblit, should it still be appendable?

I would expect that to have move semantics. There's no need to create and destroy a temporary. It's completely wasteful. A copy should only be happening when a copy _needs_ to happen. It doesn't need to happen here. Now, depending on what ~= did internally (assuming that it were an overloaded operator), then a copy may end up occurring inside of the function, but that shouldn't happen for the built-in ~= operator, and a well-written overloaded ~= should avoid the need to copy as well.

- Jonathan M Davis
June 21, 2011
On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin <michel.fortin@michelf.com> wrote:

> On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer" <schveiguy@yahoo.com> said:
>
>> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin <michel.fortin@michelf.com> wrote:
>>
>>> My feeling is that array appending and array assignment should be considered a compiler issue first and foremost. The compiler needs to be fixed, and once that's done the runtime will need to be updated anyway to match the changes in the compiler. Your proposed fix for array assignment is a good start for when the compiler will provide the necessary info to the runtime, but applying it at this time will just fix some cases by breaking a few others: net improvement zero.
>>
>> BTW, I now feel that your request to make a distinction between move and copy is not required. The compiler currently calls the destructor of temporaries, so it should also call postblit. I don't think it can make the distinction between array appending and simply calling some other function.
>
> Well, if
>
> 	a ~= S();
>
> does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a @disabled postblit, should it still be appendable?

Good question.  I don't even know how the runtime could avoid calling postblit; there is no flag in the typeinfo saying the postblit is disabled (that I know of).

But think about it this way, if you have a function foo:

void foo(S)(ref S s, S[] arr)
{
   arr[0] = s;
}

Isn't this copy semantics?  This is exactly how the D runtime gets the data.  The only difference is, the runtime function is allowed to accept a temporary as a reference (not possible in a normal function).
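
To make the copy in foo visible, here is a minimal sketch. The Counted struct and its static counter are made up for illustration and are not part of the runtime; the only assumption is that assigning an lvalue struct runs its postblit.

```d
// Hypothetical instrumented struct, for illustration only.
struct Counted
{
    static int postblits;   // counts how many copies were made
    int value;
    this(this) { ++postblits; }
}

// Same shape as foo above: the element arrives by reference and is
// assigned into storage that already exists in the array.
void foo(S)(ref S s, S[] arr)
{
    arr[0] = s;   // s is an lvalue, so the array slot gets a copy
}
```

After calling foo(s, arr) both s and arr[0] hold the value and the counter has gone up, which is what you would expect from copy semantics.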

Now, you could force move semantics, if you know the argument is an rvalue, but I don't know enough about what postblit is used for in order to say it's fine to use move semantics to move the struct into the heap.

The reason I say move semantics are an optimization is because:

{
  S tmp;
  arr ~= tmp;
}

is essentially equivalent to:

arr ~= S();

But the former is copy semantics, the latter can be considered move.  It seems like a smart compiler during optimization could rewrite the former as the latter, unless the semantics truly are different.  Which is why I'm trying to figure out how postblit can be used ;)
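
One way to compare the two forms is to instrument a struct and look at the counters. This is only a sketch: Tracked and the two helper functions are invented names, and what the rvalue case reports depends on the compiler and runtime version, so I'm not claiming a result for it.

```d
// Hypothetical instrumented struct, for illustration only.
struct Tracked
{
    static int postblits;
    static int dtors;
    this(this) { ++postblits; }
    ~this()    { ++dtors; }
}

// The "former" case: append a named variable.
void appendLvalue(ref Tracked[] arr)
{
    Tracked tmp;
    arr ~= tmp;          // tmp lives on, so the array needs its own copy
}                        // tmp is destroyed here

// The "latter" case: append a temporary.
void appendRvalue(ref Tracked[] arr)
{
    arr ~= Tracked();    // nothing else can refer to the temporary
}
```

Reading Tracked.postblits and Tracked.dtors before and after each call shows whether a given compiler treats the second form as a move or as a copy followed by a destructor call.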

>> If the issue of array assignment is fixed, do you think it's worth putting the change in, and then filing a bug against the GC?  I still think the current cases that "work" are fundamentally broken anyways.
>
> That depends. I'm not too sure currently whether the S destructor is called for this code:
>
> 	a ~= S();

It is, I tested it.  I ran this code:


import std.stdio;

struct Test
{
   this(this) { writeln("copy done"); }
   void opAssign(Test rhs) { writeln("assignment done"); }
   ~this() { writeln("destructor called"); }
}

void main()
{
   Test[] tests = new Test[1];
   {
      // Test test;
      // tests ~= test;
      tests ~= Test();
   }
   writeln("done");
}

and saw "destructor called" in the output, no matter which option was commented out.

> All in all, I don't think it's important enough to justify wasting hours debating in what order we should fix those bugs. Do what you think is right. If it becomes a problem or it introduces a bug here or there, we'll adjust; at worst that means a revert of your commit.

OK, then I'll push the change.  I already filed a bug against _d_arraycopy.

>>> As for the issue that destructors aren't called for arrays on the heap, it's a serious problem. But it's also a separate problem that concerns purely the runtime, as far as I am aware. Is there someone working on it?
>>
>> I think we need precise scanning to get a complete solution.  Another option is to increase the information the array runtime stores in the memory block (currently it only stores the "used" length) and then hook the GC to call the dtors.  This might be a quick fix that doesn't require precise scanning, but it also fixes the most common case of allocating a single struct or an array of structs on the heap.
>
> The GC calling the destructor doesn't require precise scanning. Although it's true that both problems require adding type information to memory blocks, beyond that requirement they're both independent. It'd be really nice if struct destructors were called correctly.

Yes, the more I think about it, the more this solution looks attractive.  All that is required is to flag the block as having a finalizer, store the TypeInfo pointer somewhere, and the GC should call it.
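
Roughly what I have in mind, as a sketch only. BlockInfo, its fields, and finalizeBlock are hypothetical and are not druntime's actual layout or API; the only real calls are TypeInfo.tsize and TypeInfo.destroy.

```d
import core.stdc.stdlib : calloc, free;

// Hypothetical per-block metadata, NOT what druntime actually stores.
struct BlockInfo
{
    void* base;               // start of the element storage
    size_t usedBytes;         // the "used" length already tracked today
    const(TypeInfo) elemType; // extra: element type of the block
    bool hasFinalizer;        // extra: flag telling the GC to finalize
}

// What the collector could do before freeing a flagged block.
void finalizeBlock(ref BlockInfo b)
{
    if (!b.hasFinalizer || b.elemType is null)
        return;
    immutable size = b.elemType.tsize;
    if (size == 0)
        return;
    auto p = cast(ubyte*) b.base;
    for (size_t off = 0; off + size <= b.usedBytes; off += size)
        b.elemType.destroy(p + off);   // runs the struct destructor, if any
    b.hasFinalizer = false;            // never finalize the same block twice
}

// Example element type with a destructor we can observe.
struct Res
{
    static int destroyed;
    int id;
    ~this() { ++destroyed; }
}
```

With a block holding three Res values, finalizeBlock runs the destructor three times, and no scanning of the block's contents is involved.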

I'll put in a bugzilla enhancement so it's not forgotten.

-Steve
June 21, 2011
On 2011-06-21 07:34:24 -0400, "Steven Schveighoffer" <schveiguy@yahoo.com> said:

> On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin <michel.fortin@michelf.com> wrote:
> 
>> Well, if
>> 
>> 	a ~= S();
>> 
>> does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics. If you have a struct with a @disabled postblit, should it still be appendable?
> 
> Good question.  I don't even know how the runtime could avoid calling postblit, there is no flag saying the postblit is disabled in the typeinfo (that I know of).
> 
> But think about it this way, if you have a function foo:
> 
> void foo(S)(ref S s, S[] arr)
> {
>     arr[0] = s;
> }
> 
> Isn't this copy semantics?  This is exactly how the D runtime gets the data.  The only difference is, the runtime function is allowed to accept a temporary as a reference (not possible in a normal function).

... and in the special case where the reference is an rvalue, then it should have move semantics. See below.


> Now, you could force move semantics, if you know the argument is an rvalue, but I don't know enough about what postblit is used for in order to say it's fine to use move semantics to move the struct into the heap.
> 
> The reason I say move semantics are an optimization is because:
> 
> {
>    S tmp;
>    arr ~= tmp;
> }
> 
> is essentially equivalent to:
> 
> arr ~= S();
> 
> But the former is copy semantics, the latter can be considered move.  It seems like a smart compiler during optimization could rewrite the former as the latter, unless the semantics truly are different.  Which is why I'm trying to figure out how postblit can be used ;)

Actually, this should be the equivalent:

	import std.algorithm;

	S tmp;
	arr ~= move(tmp);

While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disabled attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that solely to the choice of the optimizer.

Things might be clearer if we had a move operator, but instead we have a 'move' function. There is only one case where I think we can assume to have move semantics: when a temporary (an rvalue) is assigned to somewhere. That's also all that's needed for the 'move' function to work. And that is broken currently when it comes to array appending.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

June 21, 2011
On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin <michel.fortin@michelf.com> wrote:

> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disabled attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that solely to the choice of the optimizer.

Another issue with appending a @disabled-postblit struct: what happens when you have to reallocate a block to get more space?  This cannot possibly be a move, because the compiler has no idea at the time of appending whether anything else has a reference to the original data.  So should it just be a runtime error?

I'm starting to think that @disabled postblit structs *shouldn't* be able to be appended.
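
Here is the reallocation problem in miniature, using plain ints so nothing about postblit gets in the way. The function name is made up, and it assumes a non-empty input.

```d
// Appends to a slice that shares memory with `original` and reports
// whether the append had to relocate the data. Assumes original.length > 0.
bool appendReallocates(int[] original)
{
    int[] view = original[0 .. $ - 1]; // stops short of the block's used end
    const before = view.ptr;
    view ~= 99;  // growing in place would overwrite original[$ - 1]
    return view.ptr !is before;
}
```

The append has to duplicate the elements into a new block while original still points at the old one. With a struct element type, both blocks stay reachable afterwards, so the duplication has to be a real copy and not a move.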

-Steve
June 21, 2011
On Tue, 21 Jun 2011 04:59:49 +0300, Michel Fortin <michel.fortin@michelf.com> wrote:

> Well, if
>
> 	a ~= S();
>
> does result in a temporary which gets copied and then destroyed, why have move semantics at all? Move semantics are not just an optimization, they actually change the semantics.

There was a similar discussion about struct constructors, which ended with the conclusion that it is an optimization.
I fully agree with you that it is not; move exists for just this kind of reason.
June 21, 2011
On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin <michel.fortin@michelf.com> wrote:

> Actually, this should be the equivalent:
>
> 	import std.algorithm;
>
> 	S tmp;
> 	arr ~= move(tmp);
>
> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disabled attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that solely to the choice of the optimizer.
>
> Things might be clearer if we had a move operator, but instead we have a 'move' function. There is only one case where I think we can assume to have move semantics: when a temporary (an rvalue) is assigned to somewhere. That's also all that's needed for the 'move' function to work. And that is broken currently when it comes to array appending.

It should be something else, because move(tmp) in std.algorithm takes its argument by reference and returns by value, actually moving it. Because D has value semantics and can differentiate a value from a reference, it doesn't need any other syntax, which is much better.

I think it is pretty neat, yet I still have some trouble understanding its effect here.

S tmp;
arr ~= move(tmp); // would make an unnecessary copy.

Move should do some kind of magic there and treat its argument like a value, and return it.

Something like:

T move(T)(ref T a)
{
  return cast(T)a;
}

Maybe it makes no sense at all but I tried!
June 21, 2011
On 2011-06-21 08:38:05 -0400, "Steven Schveighoffer" <schveiguy@yahoo.com> said:

> On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin <michel.fortin@michelf.com> wrote:
> 
>> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disabled attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that solely to the choice of the optimizer.
> 
> Another issue with appending a @disabled-postblit struct: what happens when you have to reallocate a block to get more space?  This cannot possibly be a move, because the compiler has no idea at the time of appending whether anything else has a reference to the original data.  So should it just be a runtime error?

That's indeed a problem.

> I'm starting to think that @disabled postblit structs *shouldn't* be able to be appended.

That would make sense. It should be a compile-time error.

It would also turn appending using move into an optimization, because all the types you can append will be guaranteed to be copyable.
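
As a library-level sketch of what such a compile-time check could look like: appendChecked, Plain and NoCopy are hypothetical names, and this only approximates in user code what the compiler would do for the built-in ~= operator. It relies on std.traits.isCopyable.

```d
import std.traits : isCopyable;

struct Plain  { int x; }
struct NoCopy { int x; @disable this(this); }

// Hypothetical helper: only element types that can be copied are accepted,
// so a non-copyable type is rejected when the call is compiled.
void appendChecked(T)(ref T[] arr, T value)
    if (isCopyable!T)
{
    arr ~= value;
}
```

Calling appendChecked with a Plain works as usual, while calling it with a NoCopy fails to compile, because the template constraint rejects the type before any code is generated.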


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

June 21, 2011
On 2011-06-21 09:24:29 -0400, so <so@so.so> said:

> On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin <michel.fortin@michelf.com> wrote:
> 
>> Actually, this should be the equivalent:
>> 
>> 	import std.algorithm;
>> 
>> 	S tmp;
>> 	arr ~= move(tmp);
>> 
>> While there is no doubt that 'moving' a struct can often be used as an optimization without changing the semantics, if you want the @disabled attribute to be useful on the postblit constructor then the language needs to define when its semantics require 'moving' data and when they require 'copying' data; it can't leave that solely to the choice of the optimizer.
>> 
>> Things might be clearer if we had a move operator, but instead we have a 'move' function. There is only one case where I think we can assume to have move semantics: when a temporary (an rvalue) is assigned to somewhere. That's also all that's needed for the 'move' function to work. And that is broken currently when it comes to array appending.
> 
> It should be something else, because move(tmp) in std.algorithm takes its argument by reference and returns by value, actually moving it. Because D has value semantics and can differentiate a value from a reference, it doesn't need any other syntax, which is much better.
> 
> I think it is pretty neat, yet I still have some trouble understanding its effect here.
> 
> S tmp;
> arr ~= move(tmp); // would make an unnecessary copy.
> 
> Move should do some kind of magic there and treat its argument like a value, and return it.

Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.

Note 1: Currently 'move' obliterates the source only if the type has a destructor or a postblit. I think it should always do it, but without inlining that might be a performance bottleneck.

Note 2: Making move efficient in the case of appending might require a total rework of how the compiler interacts with the runtime. And I don't think you can optimize away all blitting unless the move function was treated specially by the compiler (or became a special operator).
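
For illustration, a small sketch using std.algorithm's move on a type whose postblit is disabled. Owner and take are invented names. Because copying Owner is a compile error, the fact that this compiles shows no postblit is involved, and the empty destructor is there so that move resets the source, per Note 1.

```d
import std.algorithm : move;

struct Owner
{
    int payload;
    @disable this(this);   // any attempt to copy is a compile error
    ~this() {}             // having a destructor makes move reset the source
}

// Transfers the contents of `source` to the caller without copying.
Owner take(ref Owner source)
{
    return move(source);
}
```

After Owner b = take(a), b holds the payload and a is back to Owner.init.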

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

June 21, 2011
On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin <michel.fortin@michelf.com> wrote:

> Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.

T move(T)(ref T a) {
  T b;
  move(a, b);
  return b;
}

T a;
whatever = move(a);

If T is a struct, I don't see how a copy is not needed, looking at the current state of move.
June 21, 2011
On 2011-06-21 12:13:32 -0400, so <so@so.so> said:

> On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin <michel.fortin@michelf.com> wrote:
> 
>> Actually, no copy is needed. Move takes the argument by ref so it can obliterate it. Obliteration consists of replacing its bytes with those in S.init. That way if you have a smart pointer, it gets returned without having to update the reference count (since the source's content has been destroyed). It has effectively been moved, not copied.
> 
> T move(T)(ref T a) {
>    T b;
>    move(a, b);
>    return b;
> }
> 
> T a;
> whatever = move(a);
> 
> If T is a struct, I don't see how a copy is not needed, looking at the current state of move.

Actually, that depends on how you look at this.

The essence of a move operation is that you just copy the bits and then obliterate the old ones. So yes, there's indeed a copy to do, but there's no need to call a copy constructor or a destructor because no new instance has been created, it has just been moved. If you don't call the copy constructor (postblit) then it's a move operation, not a copy operation, even though there's still a bitwise copy inside the move operation.

In the return statement above, 'b' gets copied to 'whatever', then disappears along with the stack frame belonging to the function. So it becomes a move operation. (And it's even more direct than that with the named return value optimization.)
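
To show the "copy the bits, then obliterate the old ones" idea in isolation, here is a hand-rolled sketch. Handle and rawMove are made up for illustration and this is not how Phobos implements move. It assumes the target starts out in its initial state, so there is nothing in it to destroy.

```d
import core.stdc.string : memcpy;

// Toy smart pointer: copying bumps the count, destruction drops it.
struct Handle
{
    int* refCount;
    this(this) { if (refCount) ++*refCount; }
    ~this()    { if (refCount) --*refCount; }
}

// Bitwise move: no postblit and no destructor runs during the transfer.
// Assumes `target` is in its initial state.
void rawMove(ref Handle source, ref Handle target)
{
    memcpy(&target, &source, Handle.sizeof);
    // Obliterate the source by resetting the field directly; writing
    // `source = Handle.init` would run the destructor on the old contents
    // and drop the count we just handed over.
    source.refCount = null;
}
```

The reference count is the same before and after rawMove, which is the difference from a copy: ownership changed hands, but no new instance was created.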

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/