Thread overview
confused with some_var.dup
Jan 16, 2009
Qian Xu
Jan 16, 2009
Tim M
Jan 16, 2009
Denis Koroskin
Jan 16, 2009
Qian Xu
Jan 16, 2009
Denis Koroskin
Jan 16, 2009
Christopher Wright
Jan 17, 2009
Rainer Deyke
Jan 17, 2009
Christopher Wright
January 16, 2009
When shall I use some_var.dup and when not?
Are there any guidelines?

--Qian
January 16, 2009
On Fri, 16 Jan 2009 20:30:53 +1300, Qian Xu <quian.xu@stud.tu-ilmenau.de> wrote:

> When shall I use some_var.dup and when not?
> Are there any guidelines?
>
> --Qian


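Yes: use .dup when you want the copy to be independent of the original, so that changing one does not change the other. If the left-hand side of the assignment is a slice (as in a[] = b[]), the elements are copied anyway, so there is no need for .dup.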
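
One reading of the slice remark above, assuming it refers to element-wise slice assignment, can be sketched as follows. The helper name copiedViaSlice is made up for this illustration.

```d
// Hypothetical helper: copies src into a new array using slice assignment.
int[] copiedViaSlice(const int[] src)
{
    int[] dst = new int[src.length];   // destination must already have the right length
    dst[] = src[];                     // element-wise copy into existing storage
    return dst;
}

void main()
{
    int[] src = [1, 2, 3];
    int[] dst = copiedViaSlice(src);
    dst[0] = 42;                       // does not touch src
    assert(src == [1, 2, 3]);
    assert(dst == [42, 2, 3]);
}
```

Here the destination owns its storage before the assignment, which is why no .dup is involved.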
January 16, 2009
On Fri, 16 Jan 2009 10:30:53 +0300, Qian Xu <quian.xu@stud.tu-ilmenau.de> wrote:

> When shall I use some_var.dup and when not?
> Are there any guidelines?
>
> --Qian

.dup is the typical way to make a copy of a variable:

char[] greetings = "Hello, World!".dup;

You usually do this when you want to modify a variable that you are not allowed to modify in place.
Consider the following example:

auto greetings = "Hello, World!";

The "Hello, World!" string is not allowed to be modified, because it could be shared throughout the project and will most probably be put in read-only memory, causing a segfault on modification.

But if you need a modified version of this string, you create its copy (duplication, or 'dup' for short) and make whatever changes you want to it:


char[] copy = greetings.dup;
copy[0] = 'T'; // copy -> "Tello, World!"

.dup may be applied to arrays (including strings and maps aka associative arrays).

You should write your own .dup method for your classes (deciding whether you want a deep copy or not). It is not needed for structs, because assignment does everything for you (unless you want a deep copy).
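
A minimal sketch of that distinction, assuming a made-up class Playlist and a made-up struct Tagged (neither comes from any library): the class gets a hand-written dup, while struct assignment copies the fields but still shares any array field.

```d
// Hypothetical class used only for illustration.
class Playlist
{
    string[] tracks;

    this(string[] tracks) { this.tracks = tracks; }

    // Hand-written copy: the new object gets its own array of tracks.
    Playlist dup()
    {
        return new Playlist(tracks.dup);
    }
}

// Hypothetical struct: assignment copies the fields, but the array field
// still refers to the same elements (a shallow copy).
struct Tagged
{
    int id;
    int[] tags;
}

void main()
{
    auto p = new Playlist(["intro", "outro"]);
    auto q = p;        // same object, not a copy
    auto r = p.dup;    // separate object with its own tracks

    q.tracks[0] = "changed";
    assert(p.tracks[0] == "changed");   // p and q are one object
    assert(r.tracks[0] == "intro");     // the duplicate is unaffected

    auto s = Tagged(1, [10, 20]);
    auto t = s;        // fields are copied
    t.id = 2;
    t.tags[0] = 99;
    assert(s.id == 1);          // value field is independent
    assert(s.tags[0] == 99);    // array field is shared
}
```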

Hope that helps.

January 16, 2009
Denis Koroskin wrote:
> 
> The "Hello, World!" string is not allowed to be modified, because it could be shared throughout the project and will most probably be put in read-only memory, causing a segfault on modification.
> 
> But if you need a modified version of this string, you create its copy (duplication, or 'dup' for short) and make whatever changes you want to it...

This confuses me a lot.
Do you mean there are no copy-on-write semantics in D?
IMO, the D compiler should decide whether to allocate a new memory
block, not the programmer.

--Qian
January 16, 2009
On Fri, 16 Jan 2009 19:15:03 +0300, Qian Xu <quian.xu@stud.tu-ilmenau.de> wrote:

> Denis Koroskin wrote:
>>
>> The "Hello, World!" string is not allowed to be modified, because it could
>> be shared throughout the project and will most probably be put in
>> read-only memory, causing a segfault on modification.
>>
>> But if you need a modified version of this string, you create its
>> copy (duplication, or 'dup' for short) and make whatever changes you want
>> to it...
>
> This confuses me a lot.
> Do you mean there are no copy-on-write semantics in D?
> IMO, the D compiler should decide whether to allocate a new memory
> block, not the programmer.
>
> --Qian

No, arrays have reference semantics (unlike std::string in C++), and thus changing the data an array points to will affect all other arrays that share the same data.
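
The sharing described above can be sketched like this. The helper sharesData is invented for the example; it only compares the addresses of the first elements.

```d
import std.stdio : writeln;

// Hypothetical helper: returns true if the two slices start at the same memory.
bool sharesData(const int[] x, const int[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;       // plain assignment: b refers to the same elements as a
    int[] c = a.dup;   // .dup: c gets its own freshly allocated elements

    b[0] = 99;         // write through b ...

    assert(a[0] == 99);        // ... and the change is visible through a
    assert(c[0] == 1);         // the duplicate is unaffected
    assert(sharesData(a, b));
    assert(!sharesData(a, c));

    writeln("a = ", a, ", b = ", b, ", c = ", c);
}
```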

January 16, 2009
Qian Xu wrote:
> Denis Koroskin wrote:
>> The "Hello, World!" string is not allowed to be modified, because it could
>> be shared throughout the project and will most probably be put in
>> read-only memory, causing a segfault on modification.
>>
>> But if you need a modified version of this string, you create its
>> copy (duplication, or 'dup' for short) and make whatever changes you want
>> to it...
> 
> This confuses me a lot. Do you mean there are no copy-on-write semantics in D?
> IMO, the D compiler should decide whether to allocate a new memory
> block, not the programmer.

You can create a COW array struct pretty easily. However, this will be pretty slow in a lot of cases.

If you have an array, you're probably going to build it, hold onto it for a while, and then discard it. You might mutate it in the middle, but if you mutate it at all, you're likely to do a lot of mutations.

When you're building the array, you really don't want COW semantics. This will overallocate -- O(n**2) memory required rather than O(n). When you're mutating large portions, you still don't want COW semantics for the same reason.

COW is safer, but it can waste resources like nobody's business and only helps in a few cases. It's better to leave that for a library type.
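
A rough sketch of such a library type, assuming the simplest possible strategy (copy before every write, with no reference counting or ownership tracking). The names CowArray, detach, elementsCopied and elementsCopiedWhileBuilding are invented here; the counter exists only to make the copying cost visible.

```d
// Hypothetical naive copy-on-write wrapper, for illustration only.
struct CowArray
{
    private int[] data;
    size_t elementsCopied;   // bookkeeping for the demo, not part of the idea

    int opIndex(size_t i) const { return data[i]; }

    void opIndexAssign(int value, size_t i)
    {
        detach();            // never write into possibly shared storage
        data[i] = value;
    }

    void append(int value)
    {
        detach();
        data ~= value;
    }

    // Copies the elements so that this instance no longer shares them.
    private void detach()
    {
        elementsCopied += data.length;
        data = data.dup;
    }
}

// Hypothetical helper: builds an n-element array and reports how many
// elements were copied along the way.
size_t elementsCopiedWhileBuilding(int n)
{
    CowArray a;
    foreach (i; 0 .. n)
        a.append(i);
    return a.elementsCopied;
}

void main()
{
    CowArray a;
    foreach (i; 0 .. 3)
        a.append(i);

    CowArray b = a;      // cheap: both refer to the same elements
    b[0] = -1;           // the write copies first, so a is untouched
    assert(a[0] == 0);
    assert(b[0] == -1);

    // Building 1000 elements copies 0 + 1 + ... + 999 elements in total.
    assert(elementsCopiedWhileBuilding(1000) == 1000 * 999 / 2);
}
```

With this naive strategy the total number of elements copied while building grows quadratically, which is the overallocation described above.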
January 17, 2009
Christopher Wright wrote:
> You can create a COW array struct pretty easily. However, this will be pretty slow in a lot of cases.

A built-in COW type does not need to be slow!  The compiler can use static analysis to eliminate unnecessary copies, and reference counting can be used to further reduce the number of copies.

> When you're building the array, you really don't want COW semantics. This will overallocate -- O(n**2) memory required rather than O(n). When you're mutating large portions, you still don't want COW semantics for the same reason.

This would not be a problem with a built-in COW type.  The compiler can see that the array is being modified, but not copied, in a block, so it places a single copy operation at the beginning of the block.
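
Written out by hand, the single up-front copy described above might look like the following sketch. D does not do this automatically; the function doubled is a made-up example of what such a transformation would produce.

```d
// Hypothetical helper: returns a modified copy, leaving the input untouched.
int[] doubled(const int[] input)
{
    int[] result = input.dup;   // the single copy, made once up front
    foreach (ref x; result)
        x *= 2;                 // in-place writes, no further allocation
    return result;
}

void main()
{
    int[] original = [1, 2, 3];
    int[] changed = doubled(original);

    assert(original == [1, 2, 3]);   // the source is untouched
    assert(changed == [2, 4, 6]);
}
```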

The messy sometimes-a-reference-and-sometimes-a-value semantics of D arrays are one of the reasons why I still prefer C++ over D.


-- 
Rainer Deyke - rainerd@eldwood.com
January 17, 2009
Rainer Deyke wrote:
> Christopher Wright wrote:
>> You can create a COW array struct pretty easily. However, this will be
>> pretty slow in a lot of cases.
> 
> A built-in COW type does not need to be slow!  The compiler can use
> static analysis to eliminate unnecessary copies, and reference counting
> can be used to further reduce the number of copies.

True. Though you'd need syntax for COW arrays and non-COW arrays. Same for structs. It makes the language more complicated, and it makes the compiler even more complicated.

>> When you're building the array, you really don't want COW semantics.
>> This will overallocate -- O(n**2) memory required rather than O(n). When
>> you're mutating large portions, you still don't want COW semantics for
>> the same reason.
> 
> This would not be a problem with a built-in COW type.  The compiler can
> see that the array is being modified, but not copied, in a block, so it
> places a single copy operation at the beginning of the block.
> 
> The messy sometimes-a-reference-and-sometimes-a-value semantics of D
> arrays are one of the reasons why I still prefer C++ over D.

I agree, but most of the time, I don't modify arrays once they've been created. If I do modify them a lot, I usually want a set rather than an array for efficient removals and uniqueness.