February 05, 2014
On Tue, 04 Feb 2014 23:21:43 -0800, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 2/4/14, 11:20 PM, Jesse Phillips wrote:
>> On Tuesday, 4 February 2014 at 23:51:35 UTC, Andrei Alexandrescu wrote:
>>> and those who don't care say:
>>>
>>> auto x = fun().toGC();
>>
>> If I don't care, why would I place .toGC() at the end of my calls?
>
> This is the way it all works: RC+GC is more structure than GC, so you start with more structure and then optionally "forget" it.
>
>> What
>> reason do I have to go out of my way to request this?
>
> You use an API that uses e.g. string, not RCString.
>

The amount of existing code breakage here will be immense. Almost nothing, unless it uses only Phobos, will compile once this is released. It might even do more damage to D's still-fragile public image than the Phobos/Tango fiasco did.

>> What problems can
>> I expect when I forget to add it?
>
> Passing x around won't compile.
>

And we're supposed to want that? (See above.)

>
> Andrei


-- 
Adam Wilson
GitHub/IRC: LightBender
Aurora Project Coordinator
February 05, 2014
On 2/4/14, 11:38 PM, Kagamin wrote:
> My understanding was that ARC completely replaces GC and everything
> including slices becomes refcounted. Is having mixed incompatible GC and
> ARC code and trying to get them interoperate a good idea? Can you sell
> such mixed code to ARC guys?

In an RC system you must collect cycles. ARC leaves that to the programmer in the form of weak pointers. This particular idea automates that.
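
To make the cycle problem concrete, here is a minimal sketch of hand-rolled reference counting. The `Obj` class and the `retain`/`release` helpers are hypothetical names invented for this illustration, and `freed` is only a flag standing in for real deallocation. It shows two objects that point at each other never reaching a count of zero, and the same pair being reclaimed once the back-reference is made weak (uncounted).

```d
// Toy object with a manual reference count (illustration only).
final class Obj
{
    int rc;        // number of counted (strong) references to this object
    bool freed;    // stand-in for actually releasing the memory
    Obj strong;    // counted reference to another object
    Obj weak;      // uncounted back-reference, never retained
}

// Add one counted reference.
void retain(Obj o)
{
    if (o !is null) ++o.rc;
}

// Drop one counted reference; on zero, "free" the object and
// release whatever it strongly references.
void release(Obj o)
{
    if (o is null || --o.rc > 0) return;
    o.freed = true;
    release(o.strong);
}

// Parent and child hold strong references to each other.
// Returns true if the pair is still alive after the last external
// reference is dropped, that is, if it leaked.
bool strongCycleLeaks()
{
    auto parent = new Obj;
    auto child = new Obj;
    retain(parent);                          // external reference
    parent.strong = child; retain(child);
    child.strong = parent; retain(parent);   // closes the cycle
    release(parent);                         // count goes 2 -> 1, never 0
    return !parent.freed && !child.freed;
}

// Same shape, but the child's back-reference is weak.
// Returns true if both objects were reclaimed.
bool weakBackRefFrees()
{
    auto parent = new Obj;
    auto child = new Obj;
    retain(parent);                          // external reference
    parent.strong = child; retain(child);
    child.weak = parent;                     // not counted
    release(parent);                         // count goes 1 -> 0, cascades to child
    return parent.freed && child.freed;
}
```

Choosing which edge is weak is the part ARC leaves to the programmer; a backup collector would instead find the leaked pair in the first function on its own.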

Andrei
February 05, 2014
On 2/5/2014 2:21 AM, Andrei Alexandrescu wrote:
> On 2/4/14, 11:20 PM, Jesse Phillips wrote:
>> On Tuesday, 4 February 2014 at 23:51:35 UTC, Andrei Alexandrescu wrote:
>>> and those who don't care say:
>>>
>>> auto x = fun().toGC();
>>
>> If I don't care, why would I place .toGC() at the end of my calls?
>
> This is the way it all works: RC+GC is more structure than GC, so you
> start with more structure and then optionally "forget" it.
>
>> What
>> reason do I have to go out of my way to request this?
>
> You use an API that uses e.g. string, not RCString.
>

IIUC, it sounds like it'd work like this:

// Library author provides either one or both of these:
T[] getFooGC() {...}
RCSlice!T getFooARC() {...}

And then if, for whatever reason, you have an RCSlice!T and need to pass it to something that expects a T[], then you can cancel the RC-ing via toGC.

If that's so, then I'd think lib authors could easily provide APIs that offer GC by default and ARC as an opt-in choice with templating:

enum WantARC { Yes, No }
auto getFoo(WantARC arc = WantARC.No)() {
    static if(arc == WantARC.No)
        // GC is the default: build the RC version, then detach it.
        return getFoo!(WantARC.Yes)().toGC();
    else {
        RCSlice!T x = ...;
        return x;
    }
}
T[] fooGC = getFoo();
RCSlice!T fooARC = getFoo!(WantARC.Yes)();

And I imagine that boilerplate could be encapsulated in a utility template:

private RCSlice!T getFooARC() {
    RCSlice!T x = ...;
    return x;
}
template makeGCDefault(...){...magic happens here...}
alias getFoo = makeGCDefault!getFooARC;
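
A rough sketch of what the `makeGCDefault` utility could look like, under stated assumptions. `RCSlice` does not exist yet, so the struct below is a toy stand-in written only for this example: it counts references through a postblit and destructor, and `toGC` disables the counter by setting it to `uint.max / 2`, as in the proposal. The `detached` constant, the constructor taking a length, and the body of `getFooARC` are all invented for illustration.

```d
import core.memory : GC;

// Toy stand-in for the proposed RCSlice!T (not a real Phobos type).
struct RCSlice(T)
{
    private enum uint detached = uint.max / 2; // sentinel: counting disabled
    private T[] payload;   // buffer obtained from the GC allocator
    private uint* count;   // reference count shared by all copies

    this(size_t n)
    {
        payload = new T[n];
        count = new uint;
        *count = 1;
    }

    this(this)             // postblit: each copy adds a reference
    {
        if (count !is null && *count != detached) ++*count;
    }

    ~this()
    {
        if (count is null || *count == detached) return;
        if (--*count == 0)
        {
            // Last reference gone: free the buffer eagerly.
            GC.free(payload.ptr);
            GC.free(count);
        }
    }

    // "Forget" the reference counting and leave the buffer to the GC.
    T[] toGC()
    {
        if (count !is null) *count = detached;
        return payload;
    }

    ref T opIndex(size_t i) { return payload[i]; }
    @property size_t length() const { return payload.length; }
}

enum WantARC { Yes, No }

// Hypothetical helper: wraps a function returning an RC type so that
// callers get the GC flavour unless they ask for WantARC.Yes.
template makeGCDefault(alias rcFun)
{
    auto makeGCDefault(WantARC arc = WantARC.No)()
    {
        static if (arc == WantARC.No)
            return rcFun().toGC();
        else
            return rcFun();
    }
}

// Example library function (contents made up for the sketch).
RCSlice!int getFooARC()
{
    auto x = RCSlice!int(3);
    x[0] = 42;
    return x;
}

alias getFoo = makeGCDefault!getFooARC;
```

With this in place, `int[] a = getFoo();` yields a plain GC slice, while `auto b = getFoo!(WantARC.Yes)();` yields the `RCSlice!int`. Only the library author has to know the RC version exists.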

February 05, 2014
On Wednesday, 5 February 2014 at 07:45:26 UTC, Andrei Alexandrescu wrote:
> In an RC system you must collect cycles. ARC leaves that to the programmer in the form of weak pointers. This particular idea automates that.

Having GC as a backup for cycles doesn't prevent making everything transparently refcounted. As to allocation strategies, the code can just use a compatible allocation strategy.
February 05, 2014
On Tue, 04 Feb 2014 23:38:54 -0800, Kagamin <spam@here.lot> wrote:

> My understanding was that ARC completely replaces GC and everything including slices becomes refcounted. Is having mixed incompatible GC and ARC code and trying to get them interoperate a good idea? Can you sell such mixed code to ARC guys?

I've been asking myself those questions a lot over the past couple of days. If GC-backed ARC is such a brilliant idea, how come nobody has done it yet? I mean, it is a rather obvious solution. What I am confident of is that it is going to create a metric ton of implementation gotchas for the compiler to sort out (as if we don't have enough open trouble tickets already) and it is going to pretty steeply increase the complexity of the language. I thought the whole point of D was to not be C++, particularly in terms of complexity? All for a potentially undeliverable promise of a little more speed and fewer (not none) collection pauses.

I have a suspicion that the reason that it hasn't been done is that it doesn't actually improve the overall speed and quite possibly reduces it. It will take months of engineering effort just to ship the buggy initial functionality, and many more months and possibly years to sort out all the edge cases. This will in turn significantly reduce the bandwidth going towards fixing the features we already have that don't work right and improving the GC so that it isn't such an eyesore.

-- 
Adam Wilson
GitHub/IRC: LightBender
Aurora Project Coordinator
February 05, 2014
On Wednesday, 5 February 2014 at 09:05:02 UTC, Adam Wilson wrote:
> On Tue, 04 Feb 2014 23:38:54 -0800, Kagamin <spam@here.lot> wrote:
>
>> My understanding was that ARC completely replaces GC and everything including slices becomes refcounted. Is having mixed incompatible GC and ARC code and trying to get them interoperate a good idea? Can you sell such mixed code to ARC guys?
>
> I've been asking myself those questions a lot over the past couple of days. If GC-backed ARC is such a brilliant idea, how come nobody has done it yet? I mean, it is a rather obvious solution. What I am confident of is that it is going to create a metric ton of implementation gotchas for the compiler to sort out (as if we don't have enough open trouble tickets already) and it is going to pretty steeply increase the complexity of the language. I thought the whole point of D was to not be C++, particularly in terms of complexity? All for a potentially undeliverable promise of a little more speed and fewer (not none) collection pauses.
>
> I have a suspicion that the reason that it hasn't been done is that it doesn't actually improve the overall speed and quite possibly reduces it. It will take months of engineering effort just to ship the buggy initial functionality, and many more months and possibly years to sort out all the edge cases. This will in turn significantly reduce the bandwidth going towards fixing the features we already have that don't work right and improving the GC so that it isn't such an eyesore.

They have done it.

It is how Cedar, the systems programming language and environment developed at Xerox PARC as the successor to Mesa, used to work.

There are papers about it that I have already posted multiple times.

--
Paulo
February 05, 2014
On Wednesday, 5 February 2014 at 08:06:30 UTC, Kagamin wrote:
> On Wednesday, 5 February 2014 at 07:45:26 UTC, Andrei Alexandrescu wrote:
>> In an RC system you must collect cycles. ARC leaves that to the programmer in the form of weak pointers. This particular idea automates that.
>
> Having GC as a backup for cycles doesn't prevent making everything transparently refcounted. As to allocation strategies, the code can just use compatible allocation strategy.

Yes, the GC just needs to check roots for already released blocks, if I am not mistaken.
February 05, 2014
On Wednesday, 5 February 2014 at 12:12:01 UTC, Paulo Pinto wrote:
> Yes, the GC just needs to check roots for already released blocks, if I am not mistaken.

Perhaps I missed some explanation in the discussions, but does this improve performance in any way (other than allowing you to disable the GC)?
February 05, 2014
On 5 February 2014 09:51, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Consider we add a library slice type called RCSlice!T. It would have the same primitives as T[] but would use reference counting through and through. When the last reference count is gone, the buffer underlying the slice is freed. The underlying allocator will be the GC allocator.
>
> Now, what if someone doesn't care about the whole RC thing and aims at convenience? There would be a method .toGC that just detaches the slice and disables the reference counter (e.g. by setting it to uint.max/2 or whatever).
>
> Then people who want reference counting say
>
> auto x = fun();
>
> and those who don't care say:
>
> auto x = fun().toGC();
>
>
> Destroy.


This doesn't excite me at all.
What about all other types of allocations? I don't want to mangle my types.
What about closures? What about allocations from Phobos? What about
allocations from 3rd party libs that I have no control over?
I don't like that it requires additional specification, and special
treatment to have it detach to the GC.
There's nothing transparent about that. Another library solution like
RefCounted doesn't address the problem.

Counter-question: why approach it this way?
Is there a reason that it needs to be of one kind or the other?


February 05, 2014
On Wednesday, 5 February 2014 at 12:12:01 UTC, Paulo Pinto wrote:
> Yes, the GC just needs to check roots for already released blocks, if I am not mistaken.

Yes, when they go to non-zero. This is the scheme used in PHP's ARC/GC solution, published in this paper:

http://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon01Concurrent.pdf

You push roots that are candidates onto a queue when the counter is decreased to nonzero, then you do a concurrent scan when you have a set of roots to scan for cycles. So you probably need a clever ARC to reduce the scanning.
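
Here is a rough sketch of that candidate-root scheme, heavily simplified: it is single-threaded and synchronous rather than concurrent, and it is neither the paper's code nor PHP's. The names `Node`, `CycleCollector`, `link` and the `freed` flag are made up for the example. A decrement that leaves a nonzero count buffers the node as a possible root; `collectCycles` then trial-deletes internal references (gray), restores anything still externally reachable (black), and frees the rest (white).

```d
enum Color { black, gray, white, purple }

final class Node
{
    int rc;             // reference count
    Color color;        // black = in use, purple = candidate root
    bool buffered;      // already sitting in the candidate buffer?
    bool freed;         // stand-in for actually releasing memory
    Node[] children;    // outgoing counted references
}

struct CycleCollector
{
    Node[] roots;       // candidate roots awaiting a cycle scan

    void increment(Node s) { ++s.rc; s.color = Color.black; }

    // Add a counted edge from one node to another.
    void link(Node from, Node to) { from.children ~= to; increment(to); }

    void decrement(Node s)
    {
        if (--s.rc == 0) release(s);
        else possibleRoot(s);   // decreased to nonzero: may head a dead cycle
    }

    void collectCycles()
    {
        Node[] live;
        foreach (s; roots)
        {
            if (s.color == Color.purple) { markGray(s); live ~= s; }
            else
            {
                s.buffered = false;
                if (s.color == Color.black && s.rc == 0) s.freed = true;
            }
        }
        foreach (s; live) scan(s);
        foreach (s; live) { s.buffered = false; collectWhite(s); }
        roots = null;
    }

    private void release(Node s)
    {
        foreach (t; s.children) decrement(t);
        s.color = Color.black;
        if (!s.buffered) s.freed = true;
    }

    private void possibleRoot(Node s)
    {
        if (s.color == Color.purple) return;
        s.color = Color.purple;
        if (!s.buffered) { s.buffered = true; roots ~= s; }
    }

    private void markGray(Node s)   // trial deletion of internal references
    {
        if (s.color == Color.gray) return;
        s.color = Color.gray;
        foreach (t; s.children) { --t.rc; markGray(t); }
    }

    private void scan(Node s)
    {
        if (s.color != Color.gray) return;
        if (s.rc > 0) { scanBlack(s); return; }
        s.color = Color.white;      // no external references survived
        foreach (t; s.children) scan(t);
    }

    private void scanBlack(Node s)  // externally reachable: restore counts
    {
        s.color = Color.black;
        foreach (t; s.children)
        {
            ++t.rc;
            if (t.color != Color.black) scanBlack(t);
        }
    }

    private void collectWhite(Node s)
    {
        if (s.color != Color.white || s.buffered) return;
        s.color = Color.black;
        foreach (t; s.children) collectWhite(t);
        s.freed = true;
    }
}
```

The scan only visits what is reachable from buffered candidates, which is why fewer candidates (a smarter ARC) means less scanning.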

I think, however, that most programs that use ARC would not run the GC at all. For long-lived processes such as servers you might run it using heuristics (based on memory headroom, or during the night). According to the paper above the amount of cyclic garbage tends to be low, but varies a great deal. They cite another paper, relating to Inferno, that supposedly claims that RC caught 98% of the garbage. So the need for cycle collection is rather application-specific.

Perl has traditionally not caught cycles at all, but then again Perl programs tend to be short-lived.

----

There are also some papers on near-real-time GC, such as Staccato and Metronome, on David F. Bacon's list:

http://researcher.watson.ibm.com/researcher/view_pubs.php?person=us-bacon&t=1