A Refcounted Array Type (page 3) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » A Refcounted Array Type (page 3)

February 24, 2015

Re: A Refcounted Array Type

Posted by ponce
in reply to Walter Bright

ponce

Posted in reply to Walter Bright

Does DIP25 means we can turn resources into RC types and be done with non-deterministic destructor calls by the GC, without using RefCounted or Unique?

On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
> This is pretty straightforward. More could be done:
>
> 1. small array optimization
> 2. support for ranges as constructor args
> 3. present a range interface
> 4. support for malloc/free instead of GC
> 5. bounds checking
> 6. the array[] and the count could be allocated together
> 7. array[] could be just a pointer
>
> but the basic idea is there, I didn't want to hide it behind all the other flesh a professional type would have.
>
> Note the return in opIndex(). This is DIP25 at work!
>
> Compile:
>     dmd rcarray -unittest -main -dip25
>
> ===========================================
>
> struct RCArray(E) {
>
>     this(E[] a)
>     {
>         array = a.dup;
>         start = 0;
>         end = a.length;
>         count = new int;
>         *count = 1;
>     }
>
>     ~this()
>     {
>         if (count && --*count == 0)
>             delete array;
>     }
>
>     this(this)
>     {
>         if (count)
>             ++*count;
>     }
>
>     size_t length()
>     {
>         return end - start;
>     }
>
>     ref E opIndex(size_t i) return // here's the magic
>     {
>         return array[start + i];
>     }
>
>     RCArray opSlice(size_t lwr, size_t upr)
>     {
>         RCArray result = this;
>         result.start = start + lwr;
>         result.end = start + upr;
>         return result;
>     }
>
>   private:
>     E[] array;
>     size_t start, end;
>     int* count;
> }
>
> unittest
> {
>     static int[3] s = [7, 6, 4];
>     auto r = RCArray!int(s);
>     assert(r.length == 3);
>     assert(r[0] == 7);
>     assert(r[1] == 6);
>     assert(r[2] == 4);
>     assert(*r.count == 1);
>
>     {
>         auto r2 = r;
>         assert(r2[0] == 7);
>         assert(r2[1] == 6);
>         assert(r2[2] == 4);
>         assert(*r.count == 2);
>
>         r[1] = 3;
>         assert(r2[0] == 7);
>         assert(r2[1] == 3);
>         assert(r2[2] == 4);
>     }
>     assert(*r.count == 1);
>
>     auto r3 = r[1 .. 3];
>     r[2] = 9;
>     assert(r3[0] == 3);
>     assert(r3[1] == 9);
>
>   /+
>     ref int test(ref RCArray!int rr)
>     {
>         return rr[1]; // this gives error
>     }
>    +/
> }

February 24, 2015

Re: A Refcounted Array Type

Posted by Michel Fortin
in reply to Walter Bright

Michel Fortin

Posted in reply to Walter Bright

On 2015-02-23 22:15:46 +0000, Walter Bright said:

> int* count;
> 
> [...] if (count && --*count == 0) [...]

Careful!

This isn't memory safe and you have to thank the GC for it. If you ever use RCArray as a member variable in a class, the RCArray destructor is going to be called from a random thread when the class destructor is run. If some thread has a stack reference to the array you have a race.

You have to use an atomic counter unless you can prove the RCArray struct will never be put in a GC-managed context. It is rather sad that the language has no way to enforce such a restriction, and also that @safe cannot detect that this is a problem here.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

February 24, 2015

Re: A Refcounted Array Type

Posted by Steven Schveighoffer
in reply to Michel Fortin

Steven Schveighoffer

Posted in reply to Michel Fortin

On 2/24/15 8:17 AM, Michel Fortin wrote:
> On 2015-02-23 22:15:46 +0000, Walter Bright said:
>
>> int* count;
>>
>> [...] if (count && --*count == 0) [...]
>
> Careful!
>
> This isn't memory safe and you have to thank the GC for it. If you ever
> use RCArray as a member variable in a class, the RCArray destructor is
> going to be called from a random thread when the class destructor is
> run. If some thread has a stack reference to the array you have a race.
>
> You have to use an atomic counter unless you can prove the RCArray
> struct will never be put in a GC-managed context. It is rather sad that
> the language has no way to enforce such a restriction, and also that
> @safe cannot detect that this is a problem here.
>

Actually, RCArray can never be allocated on GC, or you may corrupt memory. count may be non-null, and point at invalid memory when the dtor is called.

Only safe way to do this is to C malloc/free the count. And yes, at that point, you need atomics.

-Steve

February 24, 2015

Re: A Refcounted Array Type

Posted by Andrei Alexandrescu
in reply to Walter Bright

Andrei Alexandrescu

Posted in reply to Walter Bright

On 2/23/15 6:05 PM, Walter Bright wrote:
> On 2/23/2015 5:41 PM, weaselcat wrote:
>> On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
>>>     ref E opIndex(size_t i) return // here's the magic
>>
>> exactly what is this doing? I don't see this explained in DIP25 at all.
>
> The 'this' is passed by reference. So the 'return' is really 'return ref
> this', the rest is explained by DIP25.

We should amend DIP25 to explain that "this" is handled like a parameter. -- Andrei

February 24, 2015

Re: A Refcounted Array Type

Posted by Andrei Alexandrescu
in reply to ponce

Andrei Alexandrescu

Posted in reply to ponce

On 2/24/15 3:14 AM, ponce wrote:
> Does DIP25 means we can turn resources into RC types and be done with
> non-deterministic destructor calls by the GC, without using RefCounted
> or Unique?

That's what we're aiming for. -- Andrei

February 24, 2015

Re: A Refcounted Array Type

Posted by Andrei Alexandrescu
in reply to Michel Fortin

Andrei Alexandrescu

Posted in reply to Michel Fortin

On 2/24/15 5:17 AM, Michel Fortin wrote:
> On 2015-02-23 22:15:46 +0000, Walter Bright said:
>
>> int* count;
>>
>> [...] if (count && --*count == 0) [...]
>
> Careful!
>
> This isn't memory safe and you have to thank the GC for it. If you ever
> use RCArray as a member variable in a class, the RCArray destructor is
> going to be called from a random thread when the class destructor is
> run. If some thread has a stack reference to the array you have a race.
>
> You have to use an atomic counter unless you can prove the RCArray
> struct will never be put in a GC-managed context. It is rather sad that
> the language has no way to enforce such a restriction, and also that
> @safe cannot detect that this is a problem here.

Good point. This is already a danger in existing code, e.g. File uses reference counting. I just submitted https://issues.dlang.org/show_bug.cgi?id=14221.

Andrei

February 24, 2015

Re: A Refcounted Array Type

Posted by Walter Bright
in reply to Steven Schveighoffer

Walter Bright

Posted in reply to Steven Schveighoffer

On 2/24/2015 6:56 AM, Steven Schveighoffer wrote:
> Actually, RCArray can never be allocated on GC, or you may corrupt memory. count
> may be non-null, and point at invalid memory when the dtor is called.

Just set count to null after the delete.

> Only safe way to do this is to C malloc/free the count. And yes, at that point,
> you need atomics.

No, RCArray is not intended for shared access between threads.

Shared containers and local containers are different enough that they merit being different types with different implementations altogether. Trying to just slap 'shared' on a container isn't going to work.

February 24, 2015

Re: A Refcounted Array Type

Posted by deadalnix
in reply to Andrei Alexandrescu

deadalnix

Posted in reply to Andrei Alexandrescu

On Tuesday, 24 February 2015 at 15:51:42 UTC, Andrei Alexandrescu wrote:
> On 2/23/15 6:05 PM, Walter Bright wrote:
>> On 2/23/2015 5:41 PM, weaselcat wrote:
>>> On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
>>>>    ref E opIndex(size_t i) return // here's the magic
>>>
>>> exactly what is this doing? I don't see this explained in DIP25 at all.
>>
>> The 'this' is passed by reference. So the 'return' is really 'return ref
>> this', the rest is explained by DIP25.
>
> We should amend DIP25 to explain that "this" is handled like a parameter. -- Andrei

Note that I'm not surprised by that behavior (this is indeed ref for struct). But this is not ref for classes, which will defeat DIP25.

February 24, 2015

Re: A Refcounted Array Type

Posted by deadalnix
in reply to Walter Bright

deadalnix

Posted in reply to Walter Bright

On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
> struct RCArray(E) {
>
>     this(E[] a)
>     {
>         array = a.dup;
>         start = 0;
>         end = a.length;
>         count = new int;
>         *count = 1;
>     }
>

This may not be @safe depending on the type of E. Also, this do not really solve the garbage problem

>     ~this()
>     {
>         if (count && --*count == 0)
>             delete array;
>     }
>
>     this(this)
>     {
>         if (count)
>             ++*count;
>     }
>

Here, we are going to run a bunch of null check because we can't enforce proper construction of struct. This is not per se a limitation of this piece of code, simply a long standing issue that should up.

Also, there are no way to ensure that the array is going to be bound to one thread, even without shared (exception, delegate, destructor, pure return). You need at least atomic increment and/or decrement.

But worse, as this can be assigned to static variables, you need to lock the whole damn struct unless you can update it atomically. The struct is 320 bits large and it is not updatable atomically in any arch I know of.

You need a mutex.

>     size_t length()
>     {
>         return end - start;
>     }
>
>     ref E opIndex(size_t i) return // here's the magic
>     {
>         return array[start + i];
>     }
>

I have to admit, that is pretty cool. But it is only gonna work with value types.

>     RCArray opSlice(size_t lwr, size_t upr)
>     {
>         RCArray result = this;
>         result.start = start + lwr;
>         result.end = start + upr;
>         return result;
>     }
>

You need bound check: end <= start <= array.length .

>   private:
>     E[] array;
>     size_t start, end;
>     int* count;
> }
>

February 24, 2015

Re: A Refcounted Array Type

Posted by deadalnix
in reply to Walter Bright

deadalnix

Posted in reply to Walter Bright

On Monday, 23 February 2015 at 22:15:54 UTC, Walter Bright wrote:
> struct RCArray(E) {
>
>     this(E[] a)
>     {
>         array = a.dup;
>         start = 0;
>         end = a.length;
>         count = new int;
>         *count = 1;
>     }
>

This may not be @safe depending on the type of E. Also, this do not really solve the garbage problem as you need a GCed slice to build the RCed slice. That mean that without a nicer way to construct this, this is not solving any problem.

Also, I guess that even if you use the GC in there, you want to make sure that the function is @nogc .

>     ~this()
>     {
>         if (count && --*count == 0)
>             delete array;
>     }
>
>     this(this)
>     {
>         if (count)
>             ++*count;
>     }
>

Here, we are going to run a bunch of null check because we can't enforce proper construction of struct. This is not per se a limitation of this piece of code, simply a long standing issue that should up.

Also, there are no way to ensure that the array is going to be bound to one thread, even without shared (exception, delegate, destructor, pure return). You need at least atomic increment and/or decrement.

But worse, as this can be assigned to static variables, you need to lock the whole damn struct unless you can update it atomically. The struct is 320 bits large and it is not updatable atomically in any arch I know of.

You need a mutex.

>     size_t length()
>     {
>         return end - start;
>     }
>
>     ref E opIndex(size_t i) return // here's the magic
>     {
>         return array[start + i];
>     }
>

I have to admit, that is pretty cool. But it is only gonna work with value types.

>     RCArray opSlice(size_t lwr, size_t upr)
>     {
>         RCArray result = this;
>         result.start = start + lwr;
>         result.end = start + upr;
>         return result;
>     }
>

You need bound check: end <= start <= array.length .

>   private:
>     E[] array;
>     size_t start, end;
>     int* count;
> }
>

Generally, this need support for other array operations, is not usable in @nogc, has threading issues, but this is a cool thing you can do with "ref return".

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation