Behaviour of append (~=)
May 30, 2006
Lionello Lunesu
May 30, 2006
Why does appending to a 'null' array cause the contents to be copied?

int[] ar1; // some array, not constant

int[] ar2 = null;
ar2 ~= ar1;
// ar2 !is ar1

But it could be like this:
if (ar2)
  ar2 ~= ar1;
else
  ar2 = ar1;

Wouldn't it be a good optimization for ~= to check for null first, to prevent the copy?

L.
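
For reference, the behaviour being asked about can be checked with a small sketch. It assumes a current D2 compiler (this thread predates D2), and `appendToNull` is a name made up here for illustration, not something from the post.

```d
import std.stdio;

// Hypothetical helper: appends src to a null array and returns the result.
int[] appendToNull(int[] src)
{
    int[] dst = null;
    dst ~= src;      // allocates fresh storage and copies the elements
    return dst;
}

void main()
{
    int[] ar1 = [1, 2, 3];
    int[] ar2 = appendToNull(ar1);

    assert(ar2 == ar1);          // same contents
    assert(ar2.ptr !is ar1.ptr); // but different storage
    ar2[0] = 42;
    assert(ar1[0] == 1);         // writing through ar2 leaves ar1 alone
    writeln("ar1 = ", ar1, ", ar2 = ", ar2);
}
```

The proposed `ar2 = ar1` shortcut would make the second and third assertions fail, because both names would then refer to the same storage.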
May 30, 2006
On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu <lio@lunesu.remove.com> wrote:

> Why does the appending to a 'null' array cause the contents to be copied?
>
> int[] ar1; // some array, not constant
>
> int[] ar2 = null;
> ar2 ~= ar1;
> // ar2 !is ar1
>
> But it could be like this:
> if (ar2)
>    ar2 ~= ar1;
> else
>    ar2 = ar1;
>
> Wouldn't it be a good optimization for ~= to check for null first, to prevent the copy?

Not so sure, because for consistency's sake it's good to know that 'append' will always do a copy. Then we can rely on this to happen.

-- 
Derek Parnell
Melbourne, Australia
May 30, 2006

Lionello Lunesu wrote:
> Why does the appending to a 'null' array cause the contents to be copied?
> 
> int[] ar1; // some array, not constant
> 
> int[] ar2 = null;
> ar2 ~= ar1;
> // ar2 !is ar1
> 
> But it could be like this:
> if (ar2)
>   ar2 ~= ar1;
> else
>   ar2 = ar1;
> 
> Wouldn't it be a good optimization for ~= to check for null first, to prevent the copy?
> 
> L.

I think the problem with this is that it's an edge case.  With your suggestion, appending to a null array and appending to an *empty* array would have completely different semantics.  Now, every programmer who uses arrays has to watch out for this one special case.

Yes, it would be better performance-wise, but it would be hell on programmers since it's non-obvious behaviour.

	-- Daniel

-- 

v1sw5+8Yhw5ln4+5pr6OFma8u6+7Lw4Tm6+7l6+7D a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP    http://hackerkey.com/
May 30, 2006
Derek Parnell wrote:
> On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu <lio@lunesu.remove.com> wrote:
> 
>> Why does the appending to a 'null' array cause the contents to be copied?
>>
>> int[] ar1; // some array, not constant
>>
>> int[] ar2 = null;
>> ar2 ~= ar1;
>> // ar2 !is ar1
>>
>> But it could be like this:
>> if (ar2)
>>    ar2 ~= ar1;
>> else
>>    ar2 = ar1;
>>
>> Wouldn't it be a good optimization for ~= to check for null first, to prevent the copy?
> 
> Not so sure, because for consistency's sake it's good to know that 'append' will always do a copy. Then we can rely on this to happen.
> 
> --Derek Parnell
> Melbourne, Australia

But, take std.string.replace for example. It only copies when needed. Isn't this "exactly what you expect it to do"? I thought this was kind-of understood by COW.

Also, I don't think it's safe to rely on the copy to happen. If everything follows COW rules, then that's all you need to know, and AFAIC ~= copying only iff lhs !is null is a normal COW rule.

L.
May 30, 2006
Daniel Keep wrote:
> 
> Lionello Lunesu wrote:
>> Why does the appending to a 'null' array cause the contents to be copied?
>>
>> int[] ar1; // some array, not constant
>>
>> int[] ar2 = null;
>> ar2 ~= ar1;
>> // ar2 !is ar1
>>
>> But it could be like this:
>> if (ar2)
>>   ar2 ~= ar1;
>> else
>>   ar2 = ar1;
>>
>> Wouldn't it be a good optimization for ~= to check for null first, to
>> prevent the copy?
>>
>> L.
> 
> I think the problem with this is that it's an edge case.  With your
> suggestion, appending to a null array and appending to an *empty* array
> would have completely different semantics.  Now, every programmer who
> uses arrays has to watch out for this one special case.

It doesn't have to be different. It only depends on what the compiler is testing. If it tests "ar2.length" then the two cases will be treated the same way (no copy).

> Yes, it would be better performance-wise, but it would be hell on
> programmers since it's non-obvious behaviour.

I don't think those programmers should write code that depends on the compiler copying the data in those cases. You got the array you asked for, so? If you want to make sure you have a copy, you probably needed to .dup yourself anyway.

L.
May 30, 2006
On Tue, 30 May 2006 07:47:22 -0400, Derek Parnell <derek@psych.ward> wrote:

> On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu <lio@lunesu.remove.com> wrote:
>
>> Why does the appending to a 'null' array cause the contents to be copied?
>>
>> int[] ar1; // some array, not constant
>>
>> int[] ar2 = null;
>> ar2 ~= ar1;
>> // ar2 !is ar1
>>
>> But it could be like this:
>> if (ar2)
>>    ar2 ~= ar1;
>> else
>>    ar2 = ar1;
>>
>> Wouldn't it be a good optimization for ~= to check for null first, to prevent the copy?
>
> Not so sure, because for consistency's sake it's good to know that 'append' will always do a copy. Then we can rely on this to happen.
>

Exactly; many, many times I rely on this behavior. If it is changed, a lot of memory will be overwritten unintentionally and broken code will result. Consider the following code if ~= is changed:
   foo ~= bar[0 .. n];
   foo ~= baz; // likely corruption after bar[n].
It's kind of like saying, why require ~ to copy when you can just use and overwrite the memory after the first operand. Too many people rely on ~ copying.
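
A rough sketch of the aliasing this post is worried about. `appendReusingRhs` is a hypothetical stand-in for a `~=` with the proposed null check; it is not a real runtime function, and the code assumes a current D2 compiler.

```d
// Hypothetical variant of ~= with the proposed null check:
// when lhs is null it hands back rhs itself instead of copying it.
int[] appendReusingRhs(int[] lhs, int[] rhs)
{
    if (lhs is null)
        return rhs;      // no copy: the result aliases rhs
    return lhs ~ rhs;    // ordinary concatenation, always a fresh array
}

void main()
{
    int[] bar = [1, 2, 3, 4];

    // With the shortcut, foo and bar share storage.
    int[] foo = appendReusingRhs(null, bar[0 .. 2]);
    foo[0] = 99;
    assert(bar[0] == 99);   // bar changed behind the caller's back

    // With the real ~=, the appended data is a copy.
    int[] safe;
    safe ~= bar[0 .. 2];
    safe[1] = 77;
    assert(bar[1] == 2);    // bar is untouched
}
```

Whether a further append to the aliased `foo` would also overwrite the elements after `bar[n]` depends on the runtime; the sketch only relies on the element write, which shows the shared storage directly.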
May 30, 2006
(I posted another thing to this thread that I later canceled. It was just me misunderstanding what Derek meant. I'm just too tired right now...)

Lionello Lunesu skrev:

> I don't think those programmers should write code that depends on the compiler copying the data in those cases. You got the array you asked for, so? If you want to make sure you have a copy, you probably needed to .dup yourself anyway.

I would put it the other way around. IMHO, the more guarantees the language can give you the better. Defensive .dup-ing is never good.

D doesn't have any notion of ownership; you have to remember what you own and what you don't. Currently, if you create an array and add (append) data to it, you are guaranteed that you still own the data. With your suggestion, that would no longer be true.

It is very common to append data that you don't own:
a ~= "static read-only string constant";
or
a ~= itoa(7); // May refer to static string data.

/Oskar
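
The ownership guarantee described above can be sketched as follows, assuming a current D2 compiler. `buildLabel` is a made-up name, and `std.conv.to` is swapped in for the `itoa` mentioned in the post.

```d
import std.conv : to;

// Hypothetical helper: builds a mutable buffer from data it does not own.
char[] buildLabel(int n)
{
    char[] a;
    a ~= "value: ";      // string literal: read-only data, copied into a
    a ~= to!string(n);   // result of a conversion, also copied
    a[0] = 'V';          // safe: a owns its storage after the appends
    return a;
}

void main()
{
    assert(buildLabel(7) == "Value: 7");
}
```

If the first append merely pointed `a` at the literal, the write to `a[0]` would be a write into read-only data.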
May 30, 2006
On Tue, 30 May 2006 23:20:45 +1000, Oskar Linde <oskar.lindeREM@OVEgmail.com> wrote:

>>> Wouldn't it be a good optimization for ~= to check for null first, to prevent the copy?
>>  Not so sure, because for consistency's sake it's good to know that 'append' will always do a copy. Then we can rely on this to happen.
>
> But does it? Isn't n repeated appends guaranteed to give at most log(n) allocations? How can that be if all appends force a copy? A simple test:

I didn't mention allocations. I was talking about copying.

> char[] a = "abcdefgh";
> char[] b = a[0..3];
> b ~= "xx";
> writefln("a = %s",a);
>
> prints:
> abcxxfgh

Yes, but the "xx" was copied, wasn't it? That is, 'a' does not contain a slice of the literal "xx".


-- 
Derek Parnell
Melbourne, Australia
May 30, 2006
On Tue, 30 May 2006 23:30:21 +1000, Lionello Lunesu <lio@lunesu.remove.com> wrote:

> Derek Parnell wrote:
>> On Tue, 30 May 2006 21:37:02 +1000, Lionello Lunesu <lio@lunesu.remove.com> wrote:
>>
>>> Why does the appending to a 'null' array cause the contents to be copied?
>>>
>>> int[] ar1; // some array, not constant
>>>
>>> int[] ar2 = null;
>>> ar2 ~= ar1;
>>> // ar2 !is ar1
>>>
>>> But it could be like this:
>>> if (ar2)
>>>    ar2 ~= ar1;
>>> else
>>>    ar2 = ar1;
>>>
>>> Wouldn't it be a good optimization for ~= to check for null first, to prevent the copy?
>>  Not so sure, because for consistency's sake it's good to know that 'append' will always do a copy. Then we can rely on this to happen.
>>  --Derek Parnell
>> Melbourne, Australia
>
> But, take std.string.replace for example. It only copies when needed. Isn't this "exactly what you expect it to do"? I thought this was kind-of understood by COW.

But what do the 'replace' function and CoW have to do with this discussion? All I'm saying is that the append operation is always going to copy the right-hand data. If you don't want that behaviour (such as with CoW), then only use the append operator when you need to copy the data.

> Also, I don't think it's safe to rely on the copy to happen. If everything follows COW rules, then that's all you need to know, and AFAIC ~= copying only iff lhs !is null is a normal COW rule.

But not everything uses or needs CoW. In fact, the append operator is a useful method to copy something during a CoW function.

-- 
Derek Parnell
Melbourne, Australia
May 30, 2006
> It is very common to append data that you don't own:
> a ~= "static read-only string constant";
> or
> a ~= itoa(7); // May refer to static string data.

I can see how that would cause problems if it didn't copy :)

Thanks for the example.

L.

