Thread overview
deepCopy
Oct 18, 2010
Denis Koroskin
Oct 19, 2010
Jacob Carlborg
Oct 19, 2010
Denis Koroskin
Oct 19, 2010
Jacob Carlborg
October 18, 2010
Okay, we've finished what we started with Aleksey today, so I decided to share it with you. This is a rough cut, it lacks comments, but it is already usable.

deepCopy is a function that makes a deep copy of your object, and everything it points to. As simple as that.

Essentially it is a binary serializer except that it doesn't store pointers as offsets (although it is capable of doing that, too - change a single line - so serialize and deepCopy share 99% of the code).

deepCopy and serialize are similar in design yet different in usage: serialized data are usually transmitted to another application, and one must worry about different endianness, pointer size etc. In contrast, deepCopy is to be used within the same address space and is free from such issues.

deepCopy is useful for making sure there are no aliases to your data left. This can be used to create a safe immutable copy of your objects, or to avoid memory leaks.

We've written deepCopy to solve memory leaks in ddmd the following way:
a) hook all the memory allocations
b) run code, produce result
c) make a deep copy of the result
d) release all the allocated memory

Since you are deallocating all the memory at once, you can use faster allocation methods, e.g. preallocate a block of memory and simply advance a pointer, or use dsimcha's tempAlloc (http://dsource.org/projects/scrapple/browser/trunk/tempAlloc). BTW, I remember a discussion about integrating it into druntime, did it go anywhere since then?
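
A minimal sketch of steps (b) to (d) with a preallocated block and an advancing offset. The Region type and compute() are hypothetical names made up for illustration and are not part of the posted code; since the result here is a flat int[], the built-in .dup stands in for deepCopy.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical bump-pointer region, for illustration only.
struct Region
{
    ubyte* base;
    size_t capacity;
    size_t used;

    static Region create(size_t bytes)
    {
        auto p = cast(ubyte*) malloc(bytes);
        assert(p !is null, "out of memory");
        return Region(p, bytes, 0);
    }

    // Allocation only advances an offset; there is no per-object free.
    void* allocate(size_t size)
    {
        enum size_t alignment = 16;
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        if (start + size > capacity)
            return null; // region exhausted
        used = start + size;
        return base + start;
    }

    // Step (d): everything allocated from the region goes away at once.
    void release()
    {
        free(base);
        base = null;
        capacity = 0;
        used = 0;
    }
}

int[] compute()
{
    auto region = Region.create(4096);
    scope (exit) region.release(); // runs after the return value is built

    // Step (b): scratch work that lives inside the region.
    auto raw = cast(int*) region.allocate(4 * int.sizeof);
    assert(raw !is null);
    auto tmp = raw[0 .. 4];
    foreach (i, ref x; tmp)
        x = cast(int) (i * i);

    // Step (c): copy the result out of the region before it is released.
    return tmp.dup;
}

void main()
{
    assert(compute() == [0, 1, 4, 9]);
}
```

The copy in step (c) is what makes releasing the region safe: nothing returned to the caller points into the freed block.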

deepCopy stores all the data sequentially, so it should reduce memory fragmentation and should be more cache-friendly. As a downside, the whole block will only be released once the last reference to it expires.
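
To illustrate the sequential layout and its downside, here is a small sketch that copies a linked list into one contiguous GC block. Node and copySequential are hypothetical names for this example only; the posted code handles arbitrary object graphs and may be organized quite differently.

```d
struct Node
{
    int value;
    Node* next;
}

// Copies a singly linked list into a single contiguous block.
Node* copySequential(const(Node)* head)
{
    size_t count;
    for (auto p = head; p !is null; p = p.next)
        ++count;
    if (count == 0)
        return null;

    auto block = new Node[count]; // one allocation for the whole list
    size_t i;
    for (auto p = head; p !is null; p = p.next, ++i)
    {
        block[i].value = p.value;
        // Each copied node points at its neighbour inside the same block.
        block[i].next = (i + 1 < count) ? &block[i + 1] : null;
    }
    return &block[0];
}

void main()
{
    auto c = new Node(3, null);
    auto b = new Node(2, c);
    auto a = new Node(1, b);

    auto copy = copySequential(a);
    assert(copy !is a);              // no aliasing with the source
    assert(copy.next is copy + 1);   // nodes are adjacent in memory
    assert(copy.next.next.value == 3);
}
```

Because every copied node lives in the same allocation, a pointer to any one of them keeps the whole block alive, which is the trade-off described above.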

If your struct has pointers you can manually specify if that pointer is a pointer to one element (default), many or none (excluding it from being copied). Exclusion works with references, too:

class Foo : ISerializeable
{
    mixin Serializeable;

    // optional, only needed for precise serialization control
    void describe(SerializeInfo* info)
    {
        info.setLength(buffer, length);
        info.exclude(cachedValue);
    }

    ubyte* buffer;
    size_t length;
    Object cachedValue;
}

I hope someone will find it useful, the code with tests is located here:
http://bitbucket.org/korDen/serialize/src/tip/

Suggestions are welcome!
October 19, 2010
On 2010-10-19 00:01, Denis Koroskin wrote:
> Okay, we've finished what we started with Aleksey today, so I decided to
> share it with you. This is a rough cut, it lacks comments, but it is
> already usable.

What types does this support, all types? Does it support array slices?

-- 
/Jacob Carlborg
October 19, 2010
On Tue, 19 Oct 2010 12:37:35 +0400, Jacob Carlborg <doob@me.com> wrote:

> What types does this support, all types? Does it support array slices?
>

Classes, structs, built-in types, arrays, slices - just about anything. For classes to work you need to implement the ISerializable interface using mixin Serializable; so that objects can be serialized through a base class pointer.

Now that I think about it, it doesn't support built-in associative arrays yet. I forgot about that one (it might be tricky to serialize it into a sequential memory block). Take a look at the tests (there are many of them, some of them are very tricky), and give it a try.
October 19, 2010
On 2010-10-19 17:52, Denis Koroskin wrote:
> On Tue, 19 Oct 2010 12:37:35 +0400, Jacob Carlborg <doob@me.com> wrote:
>>
>> What types does this support, all types? Does it support array slices?
>>
>
> Classes, structs, built-in types, arrays, slices - just about anything.
> For classes to work you need to implement ISerializable interface using
> mixin Serializable; so that it could serialize through base class pointer.
>
> Now that I think about it, it doesn't support built-in associative
> arrays yet. I forgot about that one (it might be tricky to serialize it
> into a sequential memory block). Take a look at the tests (there are
> many of them, some of them are very tricky), and give it a try.

I'm kind of amazed how short the code is and that you managed to support delegates. A suggestion: support classes without the mixin if they're not serialized through a base class reference.

I might have found a bug:

struct Foo
{
    int[] arr;
}

public void test0() {
    Foo src;
    src.arr = [0, 1, 2, 3, 4, 5];
    auto dst = deepCopy(src);

    assert(src.arr is dst.arr); // passes, but should fail?
}

The assert passes but should fail, otherwise it's not a deep copy.
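
For reference, this is the behaviour one would expect from a deep copy of that struct, sketched with a hand-written copy function. copyFoo is a hypothetical stand-in built on the built-in .dup, not the library's deepCopy.

```d
struct Foo
{
    int[] arr;
}

// Hypothetical stand-in for deepCopy, written by hand for this one type.
Foo copyFoo(Foo src)
{
    return Foo(src.arr.dup); // .dup allocates a new array with the same elements
}

void main()
{
    Foo src;
    src.arr = [0, 1, 2, 3, 4, 5];
    auto dst = copyFoo(src);

    assert(dst.arr == src.arr);  // same contents
    assert(dst.arr !is src.arr); // but a different memory block

    dst.arr[0] = 42;
    assert(src.arr[0] == 0);     // the source is unaffected by writes to the copy
}
```

With slices, `is` compares pointer and length while `==` compares elements, so a deep copy should satisfy `==` and fail `is`.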

-- 
/Jacob Carlborg