June 21, 2011
Re: what to do with postblit on the heap?
On 2011-06-20 18:59, Michel Fortin wrote:
> On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer"
> <schveiguy@yahoo.com> said:
> > On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin
> > <michel.fortin@michelf.com> wrote:
> >> My feeling is that array appending and array assignment should be
> >> considered a compiler issue first and foremost. The compiler needs to
> >> be  fixed, and once that's done the runtime will need to be updated
> >> anyway  to match the changes in the compiler. Your proposed fix for
> >> array  assignment is a good start for when the compiler will provide
> >> the  necessary info to the runtime, but applying it at this time will
> >> just  fix some cases by breaking a few others: net improvement zero.
> > 
> > BTW, I now feel that your request to make a distinction between move
> > and  copy is not required.  The compiler currently calls the destructor
> > of  temporaries, so it should also call postblit.  I don't think it can
> > make  the distinction between array appending and simply calling some
> > other  function.
> 
> Well, if
> 
> 	a ~= S();
> 
> does result in a temporary which gets copied and then destroyed, why
> have move semantics at all? Move semantics are not just an
> optimization, they actually change the semantics. If you have a struct
> with a @disabled postblit, should it still be appendable?

I would expect that to have move semantics. There's no need to create and 
destroy a temporary. It's completely wasteful. A copy should only be happening 
when a copy _needs_ to happen. It doesn't need to happen here. Now, depending 
on what ~= did internally (assuming that it were an overloaded operator), then 
a copy may end up occurring inside of the function, but that shouldn't happen 
for the built-in ~= operator, and a well-written overloaded ~= should avoid 
the need to copy as well.

- Jonathan M Davis
June 21, 2011
Re: what to do with postblit on the heap?
On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin  
<michel.fortin@michelf.com> wrote:

> On 2011-06-20 18:12:11 -0400, "Steven Schveighoffer"  
> <schveiguy@yahoo.com> said:
>
>> On Mon, 20 Jun 2011 16:45:44 -0400, Michel Fortin   
>> <michel.fortin@michelf.com> wrote:
>>
>>> My feeling is that array appending and array assignment should be   
>>> considered a compiler issue first and foremost. The compiler needs to  
>>> be  fixed, and once that's done the runtime will need to be updated  
>>> anyway  to match the changes in the compiler. Your proposed fix for  
>>> array  assignment is a good start for when the compiler will provide  
>>> the  necessary info to the runtime, but applying it at this time will  
>>> just  fix some cases by breaking a few others: net improvement zero.
>>  BTW, I now feel that your request to make a distinction between move  
>> and  copy is not required.  The compiler currently calls the destructor  
>> of  temporaries, so it should also call postblit.  I don't think it can  
>> make  the distinction between array appending and simply calling some  
>> other  function.
>
> Well, if
>
> 	a ~= S();
>
> does result in a temporary which gets copied and then destroyed, why have  
> move semantics at all? Move semantics are not just an optimization, they  
> actually change the semantics. If you have a struct with a @disabled  
> postblit, should it still be appendable?

Good question.  I don't even know how the runtime could avoid calling  
postblit, there is no flag saying the postblit is disabled in the typeinfo  
(that I know of).

But think about it this way, if you have a function foo:

void foo(S)(ref S s, S[] arr)
{
   arr[0] = s;
}

Isn't this copy semantics?  This is exactly how the D runtime gets the  
data.  The only difference is, the runtime function is allowed to accept a  
temporary as a reference (not possible in a normal function).

Now, you could force move semantics, if you know the argument is an  
rvalue, but I don't know enough about what postblit is used for in order  
to say it's fine to use move semantics to move the struct into the heap.

The reason I say move semantics are an optimization is because:

{
  S tmp;
  arr ~= tmp;
}

is essentially equivalent to:

arr ~= S();

But the former is copy semantics, the latter can be considered move.  It  
seems like a smart compiler during optimization could rewrite the former  
as the latter, unless the semantics truly are different.  Which is why I'm  
trying to figure out how postblit can be used ;)

>> If the issue of array assignment is fixed, do you think it's worth  
>> putting  the change in, and then filing a bug against the GC?  I still  
>> think the  current cases that "work" are fundamentally broken anyways.
>
> That depends. I'm not too sure currently whether the S destructor is  
> called for this code:
>
> 	a ~= S();

It is, I tested it.  I ran this code:


import std.stdio;

struct Test
{
   this(this) { writeln("copy done"); }
   void opAssign(Test rhs) { writeln("assignment done"); }
   ~this() { writeln("destructor called"); }
}

void main()
{
   Test[] tests = new Test[1];
   {
      // Test test;
      // tests ~= test;
      tests ~= Test();
   }
   writeln("done");
}

and saw "destructor called" in the output, no matter which option was  
commented out.

> All in all, I don't think it's important enough to justify we waste  
> hours debating in what order we should fix those bugs. Do what you think  
> is right. If it becomes a problem or it introduces a bug here or there,  
> we'll adjust; at worst that means a revert of your commit.

OK, then I'll push the change.  I already filed a bug against _d_arraycopy.

>>> As for the issue that destructors aren't called for arrays on the  
>>> heap,  it's a serious problem. But it's also a separate problem that  
>>> concerns  purely the runtime, as far as I am aware of. Is there  
>>> someone working on  it?
>>  I think we need precise scanning to get a complete solution.  Another   
>> option is to increase the information the array runtime stores in the   
>> memory block (currently it only stores the "used" length) and then hook  
>>  the GC to call the dtors.  This might be a quick fix that doesn't  
>> require  precise scanning, but it also fixes the most common case of  
>> allocating a  single struct or an array of structs on the heap.
>
> The GC calling the destructor doesn't require precise scanning. Although  
> it's true that both problems require adding type information to memory  
> blocks, beyond that requirement they're both independent. It'd be really  
> nice if struct destructors were called correctly.

Yes, the more I think about it, the more this solution looks attractive.   
All that is required is to flag the block as having a finalizer, store the  
TypeInfo pointer somewhere, and the GC should call it.
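
A minimal sketch of that idea, not the actual druntime design: a toy block record carries a finalizer flag and a TypeInfo reference, and a collection step runs the destructor through TypeInfo.destroy. The names BlockInfo, allocate and finalize are hypothetical, and malloc stands in for the GC heap.

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;
import std.stdio : writeln;
import std.traits : hasElaborateDestructor;

// Hypothetical per-block metadata (an assumption for illustration only).
struct BlockInfo
{
    void* memory;       // start of the allocated block
    TypeInfo type;      // tells the collector which destructor to run
    bool hasFinalizer;  // checked when the block is collected
}

struct Resource
{
    static int released; // counts destructor calls
    int id;
    ~this() { ++released; }
}

// Hypothetical allocation hook: records the TypeInfo next to the block.
BlockInfo allocate(T)()
{
    auto p = malloc(T.sizeof);
    assert(p !is null, "out of memory");
    emplace(cast(T*) p); // put T.init into the raw memory
    return BlockInfo(p, typeid(T), hasElaborateDestructor!T);
}

// Hypothetical collection step: run the destructor, then release the block.
void finalize(ref BlockInfo block)
{
    if (block.hasFinalizer)
        block.type.destroy(block.memory); // calls T's destructor via TypeInfo
    free(block.memory);
    block.memory = null;
}

void main()
{
    auto block = allocate!Resource();
    (cast(Resource*) block.memory).id = 1;

    immutable before = Resource.released;
    finalize(block);
    assert(Resource.released == before + 1);
    writeln("finalizers run: ", Resource.released - before);
}
```

Nothing here depends on precise scanning: the collector only needs the flag and the type, not a map of where pointers live inside the block.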

I'll put in a bugzilla enhancement so it's not forgotten.

-Steve
June 21, 2011
Re: what to do with postblit on the heap?
On 2011-06-21 07:34:24 -0400, "Steven Schveighoffer" 
<schveiguy@yahoo.com> said:

> On Mon, 20 Jun 2011 21:59:49 -0400, Michel Fortin  
> <michel.fortin@michelf.com> wrote:
> 
>> Well, if
>> 
>> 	a ~= S();
>> 
>> does result in a temporary which gets copied and then destroyed, why 
>> have  move semantics at all? Move semantics are not just an 
>> optimization, they  actually change the semantics. If you have a struct 
>> with a @disabled  postblit, should it still be appendable?
> 
> Good question.  I don't even know how the runtime could avoid calling  
> postblit, there is no flag saying the postblit is disabled in the 
> typeinfo  (that I know of).
> 
> But think about it this way, if you have a function foo:
> 
> void foo(S)(ref S s, S[] arr)
> {
>     arr[0] = s;
> }
> 
> Isn't this copy semantics?  This is exactly how the D runtime gets the  
> data.  The only difference is, the runtime function is allowed to 
> accept a  temporary as a reference (not possible in a normal function).

... and in the special case where the reference is an rvalue, then it 
should have move semantics. See below.


> Now, you could force move semantics, if you know the argument is an  
> rvalue, but I don't know enough about what postblit is used for in 
> order  to say it's fine to use move semantics to move the struct into 
> the heap.
> 
> The reason I say move semantics are an optimization is because:
> 
> {
>    S tmp;
>    arr ~= tmp;
> }
> 
> is essentially equivalent to:
> 
> arr ~= S();
> 
> But the former is copy semantics, the latter can be considered move.  
> It  seems like a smart compiler during optimization could rewrite the 
> former  as the latter, unless the semantics truly are different.  Which 
> is why I'm  trying to figure out how postblit can be used ;)

Actually, this should be the equivalent:

	import std.algorithm;

	S tmp;
	arr ~= move(tmp);

While there is no doubt that 'moving' a struct can often be used as an 
optimization without changing the semantics, if you want the @disabled 
attribute to be useful on the postblit constructor then the language 
needs to define when its semantics require 'moving' data and when they 
require 'copying' data; it can't leave that only to the choice of the 
optimizer.

Things might be clearer if we had a move operator, but instead we have 
a 'move' function. There is only one case where I think we can assume 
to have move semantics: when a temporary (an rvalue) is assigned to 
somewhere. That's also all that's needed for the 'move' function to 
work. And that is broken currently when it comes to array appending.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
June 21, 2011
Re: what to do with postblit on the heap?
On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin  
<michel.fortin@michelf.com> wrote:

> While there is no doubt that 'moving' a struct can often be used as an  
> optimization without changing the semantics, if you want the @disabled  
> attribute to be useful on the postblit constructor then the language  
> needs to define when its semantics require 'moving' data and when they  
> require 'copying' data; it can't leave that only to the choice of the  
> optimizer.

Another issue with appending a @disabled-postblit struct: what happens  
when you have to reallocate a block to get more space?  This cannot  
possibly be a move, because the compiler has no idea at the time of  
appending whether anything else has a reference to the original data.  So  
should it just be a runtime error?

I'm starting to think that @disabled postblit structs *shouldn't* be able  
to be appended.

-Steve
June 21, 2011
Re: what to do with postblit on the heap?
On Tue, 21 Jun 2011 04:59:49 +0300, Michel Fortin  
<michel.fortin@michelf.com> wrote:

> Well, if
>
> 	a ~= S();
>
> does result in a temporary which gets copied and then destroyed, why have  
> move semantics at all? Move semantics are not just an optimization, they  
> actually change the semantics.

There was a similar discussion about struct constructors which ended up  
in much the same place, with the claim that it is an optimization.
I fully agree that it is not; move exists for exactly this kind of reason.
June 21, 2011
Re: what to do with postblit on the heap?
On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin  
<michel.fortin@michelf.com> wrote:

> Actually, this should be the equivalent:
>
> 	import std.algorithm;
>
> 	S tmp;
> 	arr ~= move(tmp);
>
> While there is no doubt that 'moving' a struct can often be used as an  
> optimization without changing the semantics, if you want the @disabled  
> attribute to be useful on the postblit constructor then the language  
> needs to define when its semantics require 'moving' data and when they  
> require 'copying' data; it can't leave that only to the choice of the  
> optimizer.
>
> Things might be clearer if we had a move operator, but instead we have a  
> 'move' function. There is only one case where I think we can assume to  
> have move semantics: when a temporary (an rvalue) is assigned to  
> somewhere. That's also all that's needed for the 'move' function to  
> work. And that is broken currently when it comes to array appending.

It should be something else, because move(tmp) in std.algorithm takes its  
argument by reference and returns it by value, actually moving it. Because  
D has value semantics and can tell a value from a reference, it doesn't  
need any other syntax for this, which is much better.

I think it is pretty neat, yet I still have some trouble understanding its  
effect here.

S tmp;
arr ~= move(tmp); // would make an unnecessary copy.

Move should do some kind of a magic there and treat its argument like a  
value, and return it.

Something like:

T move(T)(ref T a)
{
    return cast(T) a;
}

Maybe it makes no sense at all, but I tried!
June 21, 2011
Re: what to do with postblit on the heap?
On 2011-06-21 08:38:05 -0400, "Steven Schveighoffer" 
<schveiguy@yahoo.com> said:

> On Tue, 21 Jun 2011 08:25:40 -0400, Michel Fortin  
> <michel.fortin@michelf.com> wrote:
> 
>> While there is no doubt that 'moving' a struct can often be used as an  
>> optimization without changing the semantics, if you want the @disabled  
>> attribute to be useful on the postblit constructor then the language  
>> needs to define when its semantics require 'moving' data and when they  
>> require 'copying' data; it can't leave that only to the choice of the  
>> optimizer.
> 
> Another issue with appending a @disabled-postblit struct, what happens  
> when you have to reallocate a block to get more space?  This cannot  
> possibly be a move, because the compiler has no idea at the time of  
> appending whether anything else has a reference to the original data.  
> So  should it just be a runtime error?

That's indeed a problem.

> I'm starting to think that @disabled postblit structs *shouldn't* be 
> able  to be appended.

That would make sense. It should be a compile-time error.

It would also turn appending using move into an optimization, because all 
the types you can append will be guaranteed to be copyable.
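
A minimal sketch of what such a compile-time check could look like in user code, assuming a hypothetical wrapper appendChecked and a hypothetical trait canCopy (neither is part of the language or the runtime). It only shows that copyability can be detected statically; it says nothing about what the built-in ~= currently accepts.

```d
import std.stdio : writeln;

struct Copyable
{
    int x;
}

struct NoCopy
{
    int x;
    @disable this(this); // postblit disabled: instances cannot be copied
}

// Hypothetical trait: true when a T lvalue can initialize another T.
enum canCopy(T) = is(typeof({ T a = T.init; T b = a; }));

// Hypothetical wrapper: the constraint rejects non-copyable element types,
// so a bad append fails at compile time rather than at run time.
void appendChecked(T)(ref T[] arr, T value)
    if (canCopy!T)
{
    arr ~= value;
}

void main()
{
    static assert(canCopy!Copyable);
    static assert(!canCopy!NoCopy);

    Copyable[] xs;
    appendChecked(xs, Copyable(1));
    assert(xs.length == 1 && xs[0].x == 1);

    NoCopy[] ys;
    // The call does not compile, which is the error being asked for.
    static assert(!__traits(compiles, appendChecked(ys, NoCopy(2))));

    writeln("ok");
}
```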


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
June 21, 2011
Re: what to do with postblit on the heap?
On 2011-06-21 09:24:29 -0400, so <so@so.so> said:

> On Tue, 21 Jun 2011 15:25:40 +0300, Michel Fortin  
> <michel.fortin@michelf.com> wrote:
> 
>> Actually, this should be the equivalent:
>> 
>> 	import std.algorithm;
>> 
>> 	S tmp;
>> 	arr ~= move(tmp);
>> 
>> While there is no doubt that 'moving' a struct can often be used as an  
>> optimization without changing the semantics, if you want the @disabled  
>> attribute to be useful on the postblit constructor then the language  
>> needs to define when its semantics require 'moving' data and when they  
>> require 'copying' data; it can't leave that only to the choice of the  
>> optimizer.
>> 
>> Things might be clearer if we had a move operator, but instead we have 
>> a  'move' function. There is only one case where I think we can assume 
>> to  have move semantics: when a temporary (an rvalue) is assigned to  
>> somewhere. That's also all that's needed for the 'move' function to  
>> work. And that is broken currently when it comes to array appending.
> 
> It should be something else because move(tmp) in std.algorithm takes by 
>  reference and returns by value by actually moving it, because of the 
> value  semantics in D, that the ability to differentiate value from 
> reference it  doesn't need any other syntax because this is much better.
> 
> I think it is pretty neat, yet I still have some trouble understanding 
> its  effect here.
> 
> S tmp;
> arr ~= move(tmp); // would make an unnecessary copy.
> 
> Move should do some kind of a magic there and treat its argument like a 
>  value, and return it.

Actually, no copy is needed. Move takes the argument by ref so it can 
obliterate it. Obliteration consists of replacing its bytes with those 
in S.init. That way if you have a smart pointer, it gets returned 
without having to update the reference count (since the source's 
content has been destroyed). It has effectively been moved, not copied.

Note 1: Currently 'move' obliterates the source only if the type has a 
destructor or a postblit. I think it should always do it, but without 
inlining that might be a performance bottleneck.

Note 2: Making move efficient in the case of appending might require a 
total rework of how the compiler interacts with the runtime. And I 
don't think you can optimize away all blitting unless the move function 
was treated specially by the compiler (or became a special operator).
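
The obliteration described above can be observed with a small test, sketched here under the assumption of a hypothetical struct Handle that defines both a postblit and a destructor (so Note 1 applies to it). The postblit counter is printed rather than checked, since its value depends on how the compiler and library handle the return.

```d
import std.algorithm.mutation : move;
import std.stdio : writeln;

// Hypothetical type standing in for something like a smart pointer.
struct Handle
{
    static int postblits; // counts postblit calls
    int id;               // 0, the .init value, means "empty"

    this(this) { ++postblits; }
    ~this() { }
}

void main()
{
    Handle a;
    a.id = 42;

    Handle b = move(a); // a is taken by ref and reset to Handle.init

    assert(b.id == 42); // the value arrived in b
    assert(a.id == 0);  // the source was obliterated

    writeln("postblit calls: ", Handle.postblits);
}
```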

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/
June 21, 2011
Re: what to do with postblit on the heap?
On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin  
<michel.fortin@michelf.com> wrote:

> Actually, no copy is needed. Move takes the argument by ref so it can  
> obliterate it. Obliteration consists of replacing its bytes with those  
> in S.init. That way if you have a smart pointer, it gets returned  
> without having to update the reference count (since the source's content  
> has been destroyed). It has effectively been moved, not copied.

T move(ref T a) {
  T b;
  move(a, b);
  return b;
}

T a;
whatever = move(a);

If T is a struct, I don't see how a copy is not needed, looking at the  
current state of move.
June 21, 2011
Re: what to do with postblit on the heap?
On 2011-06-21 12:13:32 -0400, so <so@so.so> said:

> On Tue, 21 Jun 2011 18:18:26 +0300, Michel Fortin  
> <michel.fortin@michelf.com> wrote:
> 
>> Actually, no copy is needed. Move takes the argument by ref so it can  
>> obliterate it. Obliteration consists of replacing its bytes with those 
>>  in S.init. That way if you have a smart pointer, it gets returned  
>> without having to update the reference count (since the source's 
>> content  has been destroyed). It has effectively been moved, not copied.
> 
> T move(ref T a) {
>    T b;
>    move(a, b);
>    return b;
> }
> 
> T a;
> whatever = move(a);
> 
> If T is a struct, I don't see how a copy is not needed, looking at the  
> current state of move.

Actually, that depends on how you look at this.

The essence of a move operation is that you just copy the bits and then 
obliterate the old ones. So yes, there's indeed a copy to do, but 
there's no need to call a copy constructor or a destructor because no 
new instance has been created, it has just been moved. If you don't 
call the copy constructor (postblit) then it's a move operation, not a 
copy operation, even though there's still a bitwise copy inside the 
move operation.

In the return statement above, 'b' gets copied to 'whatever', then 
disappears along with the stack frame belonging to the function. So it 
becomes a move operation. (And it's even more direct than that with the 
named return value optimization.)
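
To make the distinction concrete, here is a hand-written sketch of a move as "bitwise copy, then obliterate", assuming a hypothetical helper named relocate. It is not how std.algorithm.move is implemented; it only shows that the bits travel without the postblit ever running.

```d
import core.stdc.string : memcpy, memset;
import std.stdio : writeln;

struct Counted
{
    static int postblits; // counts postblit calls
    int value;

    this(this) { ++postblits; }
    ~this() { }
}

// Hypothetical helper: moves source into target with a raw bit copy.
// The target's old contents are overwritten without being destroyed,
// so it should hold T.init (or be otherwise safe to overwrite).
void relocate(T)(ref T source, ref T target)
{
    memcpy(&target, &source, T.sizeof); // bitwise copy, no postblit

    // Obliterate the source by restoring the bytes of T.init.
    // A null pointer in the initializer means "all zero bytes".
    const blank = typeid(T).initializer();
    if (blank.ptr is null)
        memset(&source, 0, T.sizeof);
    else
        memcpy(&source, blank.ptr, T.sizeof);
}

void main()
{
    Counted a;
    a.value = 5;
    Counted b;

    relocate(a, b);

    assert(b.value == 5);           // the bits were transferred
    assert(a.value == 0);           // the source is back to Counted.init
    assert(Counted.postblits == 0); // no copy constructor was involved

    writeln("moved without postblit");
}
```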

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/