February 24, 2015
Does DIP25 means we can turn resources into RC types and be done with non-deterministic destructor calls by the GC, without using RefCounted or Unique?

On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
> This is pretty straightforward. More could be done:
>
> 1. small array optimization
> 2. support for ranges as constructor args
> 3. present a range interface
> 4. support for malloc/free instead of GC
> 5. bounds checking
> 6. the array[] and the count could be allocated together
> 7. array[] could be just a pointer
>
> but the basic idea is there, I didn't want to hide it behind all the other flesh a professional type would have.
>
> Note the return in opIndex(). This is DIP25 at work!
>
> Compile:
>     dmd rcarray -unittest -main -dip25
>
> ===========================================
>
> struct RCArray(E) {
>
>     this(E[] a)
>     {
>         array = a.dup;
>         start = 0;
>         end = a.length;
>         count = new int;
>         *count = 1;
>     }
>
>     ~this()
>     {
>         if (count && --*count == 0)
>             delete array;
>     }
>
>     this(this)
>     {
>         if (count)
>             ++*count;
>     }
>
>     size_t length()
>     {
>         return end - start;
>     }
>
>     ref E opIndex(size_t i) return // here's the magic
>     {
>         return array[start + i];
>     }
>
>     RCArray opSlice(size_t lwr, size_t upr)
>     {
>         RCArray result = this;
>         result.start = start + lwr;
>         result.end = start + upr;
>         return result;
>     }
>
>   private:
>     E[] array;
>     size_t start, end;
>     int* count;
> }
>
> unittest
> {
>     static int[3] s = [7, 6, 4];
>     auto r = RCArray!int(s);
>     assert(r.length == 3);
>     assert(r[0] == 7);
>     assert(r[1] == 6);
>     assert(r[2] == 4);
>     assert(*r.count == 1);
>
>     {
>         auto r2 = r;
>         assert(r2[0] == 7);
>         assert(r2[1] == 6);
>         assert(r2[2] == 4);
>         assert(*r.count == 2);
>
>         r[1] = 3;
>         assert(r2[0] == 7);
>         assert(r2[1] == 3);
>         assert(r2[2] == 4);
>     }
>     assert(*r.count == 1);
>
>     auto r3 = r[1 .. 3];
>     r[2] = 9;
>     assert(r3[0] == 3);
>     assert(r3[1] == 9);
>
>   /+
>     ref int test(ref RCArray!int rr)
>     {
>         return rr[1]; // this gives error
>     }
>    +/
> }

February 24, 2015
On 2015-02-23 22:15:46 +0000, Walter Bright said:

> int* count;
> 
> [...] if (count && --*count == 0) [...]

Careful!

This isn't memory safe and you have to thank the GC for it. If you ever use RCArray as a member variable in a class, the RCArray destructor is going to be called from a random thread when the class destructor is run. If some thread has a stack reference to the array you have a race.

You have to use an atomic counter unless you can prove the RCArray struct will never be put in a GC-managed context. It is rather sad that the language has no way to enforce such a restriction, and also that @safe cannot detect that this is a problem here.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

February 24, 2015
On 2/24/15 8:17 AM, Michel Fortin wrote:
> On 2015-02-23 22:15:46 +0000, Walter Bright said:
>
>> int* count;
>>
>> [...] if (count && --*count == 0) [...]
>
> Careful!
>
> This isn't memory safe and you have to thank the GC for it. If you ever
> use RCArray as a member variable in a class, the RCArray destructor is
> going to be called from a random thread when the class destructor is
> run. If some thread has a stack reference to the array you have a race.
>
> You have to use an atomic counter unless you can prove the RCArray
> struct will never be put in a GC-managed context. It is rather sad that
> the language has no way to enforce such a restriction, and also that
> @safe cannot detect that this is a problem here.
>

Actually, RCArray can never be allocated on GC, or you may corrupt memory. count may be non-null, and point at invalid memory when the dtor is called.

Only safe way to do this is to C malloc/free the count. And yes, at that point, you need atomics.

-Steve
February 24, 2015
On 2/23/15 6:05 PM, Walter Bright wrote:
> On 2/23/2015 5:41 PM, weaselcat wrote:
>> On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
>>>     ref E opIndex(size_t i) return // here's the magic
>>
>> exactly what is this doing? I don't see this explained in DIP25 at all.
>
> The 'this' is passed by reference. So the 'return' is really 'return ref
> this', the rest is explained by DIP25.

We should amend DIP25 to explain that "this" is handled like a parameter. -- Andrei
February 24, 2015
On 2/24/15 3:14 AM, ponce wrote:
> Does DIP25 means we can turn resources into RC types and be done with
> non-deterministic destructor calls by the GC, without using RefCounted
> or Unique?

That's what we're aiming for. -- Andrei

February 24, 2015
On 2/24/15 5:17 AM, Michel Fortin wrote:
> On 2015-02-23 22:15:46 +0000, Walter Bright said:
>
>> int* count;
>>
>> [...] if (count && --*count == 0) [...]
>
> Careful!
>
> This isn't memory safe and you have to thank the GC for it. If you ever
> use RCArray as a member variable in a class, the RCArray destructor is
> going to be called from a random thread when the class destructor is
> run. If some thread has a stack reference to the array you have a race.
>
> You have to use an atomic counter unless you can prove the RCArray
> struct will never be put in a GC-managed context. It is rather sad that
> the language has no way to enforce such a restriction, and also that
> @safe cannot detect that this is a problem here.

Good point. This is already a danger in existing code, e.g. File uses reference counting. I just submitted https://issues.dlang.org/show_bug.cgi?id=14221.

Andrei


February 24, 2015
On 2/24/2015 6:56 AM, Steven Schveighoffer wrote:
> Actually, RCArray can never be allocated on GC, or you may corrupt memory. count
> may be non-null, and point at invalid memory when the dtor is called.

Just set count to null after the delete.

> Only safe way to do this is to C malloc/free the count. And yes, at that point,
> you need atomics.

No, RCArray is not intended for shared access between threads.

Shared containers and local containers are different enough that they merit being different types with different implementations altogether. Trying to just slap 'shared' on a container isn't going to work.

February 24, 2015
On Tuesday, 24 February 2015 at 15:51:42 UTC, Andrei Alexandrescu wrote:
> On 2/23/15 6:05 PM, Walter Bright wrote:
>> On 2/23/2015 5:41 PM, weaselcat wrote:
>>> On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
>>>>    ref E opIndex(size_t i) return // here's the magic
>>>
>>> exactly what is this doing? I don't see this explained in DIP25 at all.
>>
>> The 'this' is passed by reference. So the 'return' is really 'return ref
>> this', the rest is explained by DIP25.
>
> We should amend DIP25 to explain that "this" is handled like a parameter. -- Andrei

Note that I'm not surprised by that behavior (this is indeed ref for struct). But this is not ref for classes, which will defeat DIP25.
February 24, 2015
On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
> struct RCArray(E) {
>
>     this(E[] a)
>     {
>         array = a.dup;
>         start = 0;
>         end = a.length;
>         count = new int;
>         *count = 1;
>     }
>

This may not be @safe depending on the type of E. Also, this do not really solve the garbage problem

>     ~this()
>     {
>         if (count && --*count == 0)
>             delete array;
>     }
>
>     this(this)
>     {
>         if (count)
>             ++*count;
>     }
>

Here, we are going to run a bunch of null check because we can't enforce proper construction of struct. This is not per se a limitation of this piece of code, simply a long standing issue that should up.

Also, there are no way to ensure that the array is going to be bound to one thread, even without shared (exception, delegate, destructor, pure return). You need at least atomic increment and/or decrement.

But worse, as this can be assigned to static variables, you need to lock the whole damn struct unless you can update it atomically. The struct is 320 bits large and it is not updatable atomically in any arch I know of.

You need a mutex.

>     size_t length()
>     {
>         return end - start;
>     }
>
>     ref E opIndex(size_t i) return // here's the magic
>     {
>         return array[start + i];
>     }
>

I have to admit, that is pretty cool. But it is only gonna work with value types.

>     RCArray opSlice(size_t lwr, size_t upr)
>     {
>         RCArray result = this;
>         result.start = start + lwr;
>         result.end = start + upr;
>         return result;
>     }
>

You need bound check: end <= start <= array.length .

>   private:
>     E[] array;
>     size_t start, end;
>     int* count;
> }
>

February 24, 2015
On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
> struct RCArray(E) {
>
>     this(E[] a)
>     {
>         array = a.dup;
>         start = 0;
>         end = a.length;
>         count = new int;
>         *count = 1;
>     }
>

This may not be @safe depending on the type of E. Also, this do not really solve the garbage problem as you need a GCed slice to build the RCed slice. That mean that without a nicer way to construct this, this is not solving any problem.

Also, I guess that even if you use the GC in there, you want to make sure that the function is @nogc .

>     ~this()
>     {
>         if (count && --*count == 0)
>             delete array;
>     }
>
>     this(this)
>     {
>         if (count)
>             ++*count;
>     }
>

Here, we are going to run a bunch of null check because we can't enforce proper construction of struct. This is not per se a limitation of this piece of code, simply a long standing issue that should up.

Also, there are no way to ensure that the array is going to be bound to one thread, even without shared (exception, delegate, destructor, pure return). You need at least atomic increment and/or decrement.

But worse, as this can be assigned to static variables, you need to lock the whole damn struct unless you can update it atomically. The struct is 320 bits large and it is not updatable atomically in any arch I know of.

You need a mutex.

>     size_t length()
>     {
>         return end - start;
>     }
>
>     ref E opIndex(size_t i) return // here's the magic
>     {
>         return array[start + i];
>     }
>

I have to admit, that is pretty cool. But it is only gonna work with value types.

>     RCArray opSlice(size_t lwr, size_t upr)
>     {
>         RCArray result = this;
>         result.start = start + lwr;
>         result.end = start + upr;
>         return result;
>     }
>

You need bound check: end <= start <= array.length .

>   private:
>     E[] array;
>     size_t start, end;
>     int* count;
> }
>

Generally, this need support for other array operations, is not usable in @nogc, has threading issues, but this is a cool thing you can do with "ref return".