RCArray is unsafe (page 9) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » RCArray is unsafe (page 9)

March 03, 2015

Re: RCArray is unsafe

Posted by Zach the Mystic
in reply to Andrei Alexandrescu

Zach the Mystic

Posted in reply to Andrei Alexandrescu

On Tuesday, 3 March 2015 at 18:48:36 UTC, Andrei Alexandrescu wrote:
> On 3/3/15 9:00 AM, Zach the Mystic wrote:
>> On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu wrote:
>>>> I was dazzed, but I'm not anymore. I wrote my concern here:
>>>>
>>>> http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww@forum.dlang.org
>>>
>>> There's a misunderstanding here. The object being assigned keeps a
>>> trailing list of past values and defers their deallocation to
>>> destruction. -- Andrei
>>
>> So you need an extra pointer per instance?
>
> Yah, or define your type to be single-assignment (probably an emerging idiom). You can massage the extra pointer with other data thus reducing its cost.
>
>> Isn't that a big price to
>> pay? Is the only problem we're still trying to solve aliasing which is
>> not recognized as such and therefore doesn't bump the refcounter like it
>> should? An extra pointer would be overkill for that. Isn't it better to
>> just recognize the aliasing when it happens?
>
> It's all tradeoffs. This has runtime overhead.

Isn't allocating and collecting a freelist also overhead?

> A static analysis would have the challenges of being permissive enough, cheap enough, not add notational overhead, etc. etc.

It's certainly permissive: you can do anything, and compiler wraps uncertain operations with add/release cycles automatically. These are: passing a global as a mutable reference to an impure function; aliasing the same variable in two parameters with itself. The unoptimized lowerings would be:

{
  auto tmp = myGlobal; // bump count
  impureFun(myGlobal);
}  // tmp destroyed, --count

{
  auto tmp2 = c; // bump count
  fun(c, c);
} // --count

The only addition is an optimization where the compiler elides the assignments and calls the add/release cycles directly.

March 03, 2015

Re: RCArray is unsafe

Posted by Andrei Alexandrescu
in reply to Zach the Mystic

Andrei Alexandrescu

Posted in reply to Zach the Mystic

On 3/3/15 12:35 PM, Zach the Mystic wrote:
> On Tuesday, 3 March 2015 at 18:48:36 UTC, Andrei Alexandrescu wrote:
>> On 3/3/15 9:00 AM, Zach the Mystic wrote:
>>> On Tuesday, 3 March 2015 at 16:31:07 UTC, Andrei Alexandrescu wrote:
>>>>> I was dazzed, but I'm not anymore. I wrote my concern here:
>>>>>
>>>>> http://forum.dlang.org/post/ylpaqhnuiczfgfpqjuww@forum.dlang.org
>>>>
>>>> There's a misunderstanding here. The object being assigned keeps a
>>>> trailing list of past values and defers their deallocation to
>>>> destruction. -- Andrei
>>>
>>> So you need an extra pointer per instance?
>>
>> Yah, or define your type to be single-assignment (probably an emerging
>> idiom). You can massage the extra pointer with other data thus
>> reducing its cost.
>>
>>> Isn't that a big price to
>>> pay? Is the only problem we're still trying to solve aliasing which is
>>> not recognized as such and therefore doesn't bump the refcounter like it
>>> should? An extra pointer would be overkill for that. Isn't it better to
>>> just recognize the aliasing when it happens?
>>
>> It's all tradeoffs. This has runtime overhead.
>
> Isn't allocating and collecting a freelist also overhead?

No. I don't have time now for a proof of concept and it seems everybody wants to hypothesize about code that doesn't exist instead of writing code and then discussing it.

>> A static analysis would have the challenges of being permissive
>> enough, cheap enough, not add notational overhead, etc. etc.
>
> It's certainly permissive: you can do anything, and compiler wraps
> uncertain operations with add/release cycles automatically. These are:
> passing a global as a mutable reference to an impure function; aliasing
> the same variable in two parameters with itself. The unoptimized
> lowerings would be:
>
> {
>    auto tmp = myGlobal; // bump count
>    impureFun(myGlobal);
> }  // tmp destroyed, --count
>
> {
>    auto tmp2 = c; // bump count
>    fun(c, c);
> } // --count
>
> The only addition is an optimization where the compiler elides the
> assignments and calls the add/release cycles directly.

Do you have something reviewable, or just your own past posts?

For the time being I want to move forward with DIP25 and deferred freeing.


Andrei

March 03, 2015

Re: RCArray is unsafe

Posted by deadalnix
in reply to Andrei Alexandrescu

deadalnix

Posted in reply to Andrei Alexandrescu

On Tuesday, 3 March 2015 at 18:49:43 UTC, Andrei Alexandrescu wrote:
> On 3/3/15 10:22 AM, Zach the Mystic wrote:
>> On Tuesday, 3 March 2015 at 17:40:59 UTC, Marc Schütz wrote:
>>> All instances need to carry a pointer to refcount anyway, so the
>>> freelist could just be stored next to the refcount. The idea of
>>> creating that list, however, is more worrying, because it again
>>> involves allocations. It can get arbitrarily long.
>>
>> If the last RcType is a global, will the list ever get freed at all?
>
> No. Making a global variable of a reference counted type would be poor design.
>
>
> Andrei

Considering glopbal here means thread local, I think it does make sense to use global for various forms of cache.

March 03, 2015

Re: RCArray is unsafe

Posted by Michel Fortin
in reply to Walter Bright

Michel Fortin

Posted in reply to Walter Bright

On 2015-03-02 05:57:12 +0000, Walter Bright said:

> On 3/1/2015 12:51 PM, Michel Fortin wrote:
>> That's actually not enough. You'll have to block access to global variables too:
> 
> Hmm. That's not so easy to solve.

Let's see. The problem is that 'ref' variables get invalidated by some operations. Perhaps we could just tell the compiler that doing this or that will makes 'ref' variables unsafe after that point. Let's try adding a @refbreaking function attribute, and apply it to RCArray.opAssign:

	S s;

	void main() {
		s.array = RCArray!T([T()]);
		foo(s.array[0]);
	}
	void foo(ref T t) {
		t.doSomething(); // all is fine
		s.array = RCArray!T([]); // here, RCArray.opAssign would be labeled @refbreaking
		t.doSomething(); // cannot use 'ref' variable after doing a refbreaking operation
	}

Also, the above shouldn't compile anyway because @refbreaking would need to be transitive, and it follows that `foo` would need to be @refbreaking too:

	void foo(ref T t) @refbreaking {
		...
	}

which in turn means that `main` too needs to be @refbreaking.

So what needs to be @refbreaking? Anything that might deallocate. This includes `opRelease` if it deallocates when the counter reaches zero. Although you could implement `opRelease` in a way that sends the memory block to an autorelease pool of some kind, in which case draining the autorelease pool at a later point would be @refbreaking.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

March 04, 2015

Re: RCArray is unsafe

Posted by Michel Fortin
in reply to Michel Fortin

Michel Fortin

Posted in reply to Michel Fortin

On 2015-03-03 22:39:12 +0000, Michel Fortin said:

> Let's see. The problem is that 'ref' variables get invalidated by some operations. Perhaps we could just tell the compiler that doing this or that will makes 'ref' variables unsafe after that point. Let's try adding a @refbreaking function attribute, and apply it to RCArray.opAssign:
> 
> 	S s;
> 
> 	void main() {
> 		s.array = RCArray!T([T()]);
> 		foo(s.array[0]);
> 	}
> 	void foo(ref T t) {
> 		t.doSomething(); // all is fine
> 		s.array = RCArray!T([]); // here, RCArray.opAssign would be labeled @refbreaking
> 		t.doSomething(); // cannot use 'ref' variable after doing a refbreaking operation
> 	}
> 
> Also, the above shouldn't compile anyway because @refbreaking would need to be transitive, and it follows that `foo` would need to be @refbreaking too:
> 
> 	void foo(ref T t) @refbreaking {
> 		...
> 	}
> 
> which in turn means that `main` too needs to be @refbreaking.
> 
> So what needs to be @refbreaking? Anything that might deallocate. This includes `opRelease` if it deallocates when the counter reaches zero. Although you could implement `opRelease` in a way that sends the memory block to an autorelease pool of some kind, in which case draining the autorelease pool at a later point would be @refbreaking.

And giving it some more thought, @refbreaking also has the interesting property that any pair of opAddRef/opRelease with no @refbreaking call between them can be elided safely.

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

March 04, 2015

Re: RCArray is unsafe

Posted by Zach the Mystic
in reply to Andrei Alexandrescu

Zach the Mystic

Posted in reply to Andrei Alexandrescu

On Tuesday, 3 March 2015 at 21:37:20 UTC, Andrei Alexandrescu wrote:
> On 3/3/15 12:35 PM, Zach the Mystic wrote:
>> Isn't allocating and collecting a freelist also overhead?
>
> No. I don't have time now for a proof of concept and it seems everybody wants to hypothesize about code that doesn't exist instead of writing code and then discussing it.

Okay.

>> The unoptimized lowerings would be:
>>
>> {
>>   auto tmp = myGlobal; // bump count
>>   impureFun(myGlobal);
>> }  // tmp destroyed, --count
>>
>> {
>>   auto tmp2 = c; // bump count
>>   fun(c, c);
>> } // --count
>>
>> The only addition is an optimization where the compiler elides the
>> assignments and calls the add/release cycles directly.
>
> Do you have something reviewable, or just your own past posts?

Just my own past posts. My suggestion is based on the compiler doing all the work. I don't know how it could be tested without hacking the compiler.

> For the time being I want to move forward with DIP25 and deferred freeing.

That's fine. I like DIP25. It's a start towards stronger safety guarantees. While I'm pretty sure the runtime costs of my proposal are lower than yours, they do require compiler hacking, which means they can wait.

March 04, 2015

Re: RCArray is unsafe

Posted by Zach the Mystic
in reply to Zach the Mystic

Zach the Mystic

Posted in reply to Zach the Mystic

On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic wrote:
> Just my own past posts. My suggestion is based on the compiler doing all the work. I don't know how it could be tested without hacking the compiler.

I think that part of the fear of my idea is that I want structs to get some of the behavior suggested in DIP74 for classes, i.e. the compiler inserts calls to opAddRef/opRelease on its own at certain times. Since structs only have postblits and destructors, there's no canonical way to call them as separate functions. The behavior I'm suggesting would only be good if you had a refcounted type, which means it's superfluous if not harmful to insert it "just because" in other types of structs.

If it turns out that some of the behavior desirable for refcounted classes is useful for structs too, it may be necessary to hint to the complier that a struct is indeed of the refcounted type. For example, "void opAddRef();" and "void opRelease();" could be specially recognized, with no definitions even permitted (error on attempt), implying "alias opAddRef this(this);", "alias opRelease ~this;".

March 04, 2015

Re: RCArray is unsafe

Posted by Manu
in reply to Zach the Mystic

Manu

Posted in reply to Zach the Mystic

On 4 March 2015 at 01:07, Zach the Mystic via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Tuesday, 3 March 2015 at 08:04:25 UTC, Manu wrote:
>>
>> My immediate impression on this problem:
>>
>> s.array[0] is being passed to foo from main. s does not belong to main
>> (is global), and main does not hold have a reference to s.array.
>> Shouldn't main just need to inc/dec array around the call to foo when
>> passing un-owned references down the call tree.
>> It seems to me that there always needs to be a reference _somewhere_
>> on the stack for anything being passed down the call tree (unless the
>> function is pure). Seems simplest to capture a stack ref at the top
>> level, then as it's received as arguments to each callee, it's
>> effectively owned by those functions and they don't need to worry
>> anymore.
>>
>> So, passing global x to some function; inc/dec x around the function call that it's passed to...? Then the stack has its own reference, and the global reference can go away safely.
>
>
> This is my position too.
>
> There is another problem being discussed now, however, having to do with references to non-rc'd subcomponents of an Rc'd type.

Well you can't get to a subcomponent if not through it's owner.
If the question is about passing RC objects members to functions, then
the solution is the same as above, the stack needs a reference to the
parent before it can pass a pointer to it's member down the line for
the same reasons.
The trouble then is what if that member pointer escapes? Well I'd
imagine that it needs to be a scope pointer (I think we all agree RC
relies on scope). So a raw pointer to some member of an RC object must
be scope(*). That it can't escape, combined with knowledge that the
stack has a reference to it's owner, guarantees that it won't
disappear.

March 04, 2015

Re: RCArray is unsafe

Posted by deadalnix
in reply to Zach the Mystic

deadalnix

Posted in reply to Zach the Mystic

On Wednesday, 4 March 2015 at 03:46:36 UTC, Zach the Mystic wrote:
> That's fine. I like DIP25. It's a start towards stronger safety guarantees. While I'm pretty sure the runtime costs of my proposal are lower than yours, they do require compiler hacking, which means they can wait.

I don't think that it is fine.

At this point we need to :
 - Not free anything as long as something is alive.
 - Can't recycle memory.
 - Keep track of allocated chunk to be able to free them (ie implementing malloc on top of malloc).

It means that RC is attached to an ever growing arena. Code that would manipulate RCArray and append to it on a regular manner must expect some impressive memory consumption.

Even if we manage to do this in phobos (I'm sure we can) it is pretty much guaranteed at this point that noone else will, at least safely. The benefit is reduced because of the bookeeping that need to be done for memory to be freed in addition to reference count themselves.

The #1 argument for DIP25 compared to alternative proposal was its simplicity. I assume at this point that we have empirical evidence that this is NOT the case.

March 04, 2015

Re: RCArray is unsafe

Posted by Ivan Timokhin
in reply to Andrei Alexandrescu

Ivan Timokhin

Posted in reply to Andrei Alexandrescu

Excuse me if I miss something obvious, but:

    void main()
    {
        auto arr = RCArray!int([0]);
        foo(arr, arr[0]);
    }

    void foo(ref RCArray!int arr, ref int val)
    {
        {
            auto copy = arr; //arr's (and copy's) reference counts are both 2
            arr = RCArray!int([]); // There is another owner, so arr
                                   // forgets about the old payload
        } // Last owner of the array ('copy') gets destroyed and happily
          // frees the payload.
        val = 3; // Oops.
    }

On Mon, Mar 02, 2015 at 03:22:52PM -0800, Andrei Alexandrescu wrote:
> On 3/2/15 2:57 PM, Walter Bright wrote:
> > His insight was that the deletion of the payload occurred before the end of the lifetime of the RC object, and that this was the source of the problem. If the deletion of the payload occurs during the destructor call, rather than the postblit, then although the ref count of the payload goes to zero, it doesn't actually get deleted.
> >
> > I.e. the postblit manipulates the ref count, but does NOT do payload deletions. The destructor checks the ref count, if it is zero, THEN it does the payload deletion.
> >
> > Pretty dazz idea, dontcha think? And DIP25 still stands unscathed :-)
> >
> > Unless, of course, we missed something obvious.
> 
> And since an RCArray may undergo several assignments during its lifetime (thus potentially needing to free several chunks of memory), the arrays to be destroyed will be kept in a freelist-style structure. Destructor walks the freelist and frees the chunks.
> 
> Andrei

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation