Thread overview
making COW and ownership
Nov 21, 2012
Era Scarecrow
Nov 21, 2012
Timon Gehr
Nov 21, 2012
Era Scarecrow
Nov 21, 2012
Dan
November 21, 2012
 I was thinking briefly and glancing over documentation regarding cluts and non-cluts (Reference vs value types), and considering my BitArray implementation. I was having a problem trying to solve how to avoid duplicating it unless it actually was needed; And I've come to a possible solution.

 There are data types that must have new copies, but what if we delay the duplicating unless it actually tries to make a change. Mind you this won't be usable in all cases; But if we can put an ownership tag on it, perhaps it might do a good portion of the job.

 Here's a quick thrown together example of what I'm talking about.

[code]
import std.stdio;
import std.conv;

struct COWArray(T) {
  COWArray *owner;
  T[] data;

  this(int i) {
    owner = &this;
    data.length = i;
  }

  inout(T) opIndex(int i) inout pure nothrow {
    return data[i];
  }

  ref COWArray opIndexAssign(T value, int i) pure {
    if (owner != &this) {
      owner = &this;
      data = data.dup;
    }

    data[i] = value;
    return this;
  }
}

unittest {
  auto ca = COWArray!int(4);

  writeln("\nOriginal:");
  foreach(i; 0 .. 4) {
    ca[i] = 10 + i;
    writeln(ca);
  }

  auto new_ca = ca;

  writeln("\nNew section: ");
  foreach(i; 0 .. 4) {
    new_ca[i] = 200 + i;
    writeln(new_ca);
  }

  writeln("\nPost processing:");
  writeln(ca);
  writeln(new_ca);
}
[/code]

 This produces:

Original:
COWArray!(int)(18FDCC, [10, 0, 0, 0])
COWArray!(int)(18FDCC, [10, 11, 0, 0])
COWArray!(int)(18FDCC, [10, 11, 12, 0])
COWArray!(int)(18FDCC, [10, 11, 12, 13])

New section:
COWArray!(int)(18FDE4, [200, 11, 12, 13])
COWArray!(int)(18FDE4, [200, 201, 12, 13])
COWArray!(int)(18FDE4, [200, 201, 202, 13])
COWArray!(int)(18FDE4, [200, 201, 202, 203])

Post processing:
COWArray!(int)(18FDCC, [10, 11, 12, 13])
COWArray!(int)(18FDE4, [200, 201, 202, 203])
November 21, 2012
On 11/21/2012 05:23 AM, Era Scarecrow wrote:
>   I was thinking briefly and glancing over documentation regarding cluts
> and non-cluts (Reference vs value types), and considering my BitArray
> implementation. I was having a problem trying to solve how to avoid
> duplicating it unless it actually was needed; And I've come to a
> possible solution.
>
>   There are data types that must have new copies, but what if we delay
> the duplicating unless it actually tries to make a change. Mind you this
> won't be usable in all cases; But if we can put an ownership tag on it,
> perhaps it might do a good portion of the job.
>
>   Here's a quick thrown together example of what I'm talking about.
>
> [code]
> import std.stdio;
> import std.conv;
>
> struct COWArray(T) {
>    COWArray *owner;
>    T[] data;
>
>    this(int i) {
>      owner = &this;
>      data.length = i;
>    }
>
>    inout(T) opIndex(int i) inout pure nothrow {
>      return data[i];
>    }
>
>    ref COWArray opIndexAssign(T value, int i) pure {
>      if (owner != &this) {
>        owner = &this;
>        data = data.dup;
>      }
>
>      data[i] = value;
>      return this;
>    }
> }
>
> unittest {
>    auto ca = COWArray!int(4);
>
>    writeln("\nOriginal:");
>    foreach(i; 0 .. 4) {
>      ca[i] = 10 + i;
>      writeln(ca);
>    }
>
>    auto new_ca = ca;
>
>    writeln("\nNew section: ");
>    foreach(i; 0 .. 4) {
>      new_ca[i] = 200 + i;
>      writeln(new_ca);
>    }
>
>    writeln("\nPost processing:");
>    writeln(ca);
>    writeln(new_ca);
> }
> [/code]
>
>   This produces:
>
> Original:
> COWArray!(int)(18FDCC, [10, 0, 0, 0])
> COWArray!(int)(18FDCC, [10, 11, 0, 0])
> COWArray!(int)(18FDCC, [10, 11, 12, 0])
> COWArray!(int)(18FDCC, [10, 11, 12, 13])
>
> New section:
> COWArray!(int)(18FDE4, [200, 11, 12, 13])
> COWArray!(int)(18FDE4, [200, 201, 12, 13])
> COWArray!(int)(18FDE4, [200, 201, 202, 13])
> COWArray!(int)(18FDE4, [200, 201, 202, 203])
>
> Post processing:
> COWArray!(int)(18FDCC, [10, 11, 12, 13])
> COWArray!(int)(18FDE4, [200, 201, 202, 203])

This also duplicates the data if you move the struct in memory and then mutate. Probably you just need to have a boolean owner flag and set it to false on postblit.



November 21, 2012
On Wednesday, 21 November 2012 at 04:23:56 UTC, Era Scarecrow wrote:
>  Here's a quick thrown together example of what I'm talking about.

I think I see where you are going. After the copy creation of new_ca you have two objects referencing exactly the same data. It doesn't show it, but the benefit is that a whole bunch of reading can be going on against two handles to the same data without conflict (i.e. before the first assignment 'new_ca[i] = 200+i;' and you have delayed work. I think for this to be effective or sane you have to lock all the doors of change. So, for example, you would need 'data' to be private, otherwise data could be changed without successfully triggering the copy via opAssignIndex. Also, if ca itself were changed after the assignment to new_ca, but before mutation on new_ca got its copy then both would see that change, which is probably not desired.

An alternative might be to use the const system. Assume that the whole lot of reading is going on in multiple pieces of code - different functions. Just give them the original bit array as const. Then they can not write to it directly, but can read from it. Eventually when they need to mutate they would need to copy just before beginning mutation. The const system helps, but what is missing is a way to copy a const object. No way exists yet - but I'm proposing and have a 'gdup' that does for this case. Any comments on it appreciated.

https://github.com/patefacio/d-help/blob/master/doc/canonical.pdf
https://github.com/patefacio/d-help/blob/master/d-help/opmix/mix.d

The idea is that the transitive nature of const in D is perfect for providing those copy on write semantics without any extra work if you can copy the const object when needed.

Thanks
Dan

--------------------------------------------------
import std.stdio;
import std.conv;
import opmix.mix;

struct COWArray(T) {
  private T[] data;
  this(int i) {
    data.length = i;
  }
  inout(T) opIndex(int i) inout pure nothrow {
    return data[i];
  }

  ref COWArray opIndexAssign(T value, int i) pure {
    data[i] = value;
    return this;
  }
}

void give_away_ref(T)(ref const(COWArray!T) ca) {
  // Do lots of reading

  auto new_ca = ca.gdup;

  writeln("\nNew section: ");
  foreach(i; 0 .. 4) {
    new_ca[i] = 200 + i;
    writeln(new_ca);
  }
}

unittest {
  auto ca = COWArray!int(4);

  writeln("\nOriginal:");
  foreach(i; 0 .. 4) {
    ca[i] = 10 + i;
    writeln(ca);
  }

  give_away_ref(ca);

  writeln("\nPost processing:");
  writeln(ca);

}
November 21, 2012
On Wednesday, 21 November 2012 at 10:15:52 UTC, Timon Gehr wrote:
> This also duplicates the data if you move the struct in memory and then mutate. Probably you just need to have a boolean owner flag and set it to false on postblit.

 Hadn't thought of a move being involved.. I was trying to use a way to identify it at the off chance it copied without a postblit, then it could still identify itself... (casting could easily do that). I have code that forcibly converts a struct into a raw array, since reversing that is needed at times. In those cases the owner address would still identify it where with a flag it would overwrite the referenced data believing it was the owner.

 Let's see... Here's the convert struct/type to array, modified to simplify.

  void[] getArrayOfType(TYPE, V ...)() {
    TYPE[] t;
    t.length = 1;
    static if (V.length)
      t[0] = TYPE(V);

    return cast(void[]) t;
  }