February 05, 2014
On Sunday, 2 February 2014 at 16:55:35 UTC, Andrei Alexandrescu wrote:
> 1. Add @nullable and provide a -nullable compiler flag to verify it. The attribute is inferred locally and for white-box functions (lambdas, templates), and required as annotation otherwise. References not annotated with @nullable are statically enforced to never be null.
>
> 2. Work on Phobos to see what can be done about avoiding unnecessary allocation. Most likely we'll need to also add a @nogc flag.
>
> 3. Work on adding tracing capabilities to allocators and see how to integrate them with the language and Phobos.
>
> 4. Work on the core language and druntime to see how to seamlessly accommodate alternate GC mechanisms such as reference counting.

That looks nice, but one thing is missing:

5. Fix @property already :P

What is the latest status of properties? As far as I understand, pull requests are in place and might get picked up in 2.066, right?
February 05, 2014
On Wednesday, 5 February 2014 at 00:50:55 UTC, Walter Bright wrote:
> Again, what happens with:
>
>     T identity(T t) { return t; }
>
> ? I.e. the transference of the argument pointer type to the return type?

As far as I see, in Rust the pointer type is lost on the return value as well, if your function takes a reference (borrowed pointer). But you can do:

fn identity(x: &T) -> T {
    return *x;
}

fn main() {
    let a = ~T;
    let r = ~identity(a);
}

That is: pass by reference and return a value type. If you then pass an owned pointer ~T, you can construct the returned value directly in heap-allocated memory with ~identity(a) and assign the resulting owned pointer. The compiler is smart enough not to make unnecessary copies.

That way the caller is responsible for the relationship between argument pointer type and result pointer type.

With a single function, you can freely mix all combinations of argument and return types. E.g. pass an owned pointer and construct a Gc<T> from the returned result.
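
For comparison, a rough D sketch of the same idea (just an illustration, not anything from Phobos): take the argument by reference, return by value, and let the caller decide where the copy ends up.

    // identity takes its argument by reference and returns a plain value
    T identity(T)(ref T x) { return x; }

    void main()
    {
        int* a = new int;       // heap allocation, loosely like Rust's ~
        *a = 42;
        int* r = new int;       // the caller picks where the result lives
        *r = identity(*a);
    }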
February 05, 2014
On 4 February 2014 17:50, Adam Wilson <flyboynw@gmail.com> wrote:

> On Mon, 03 Feb 2014 22:12:18 -0800, Manu <turkeyman@gmail.com> wrote:
>
>>
>> So, the way I see this working in general: because in the majority case
>> ARC would release memory immediately, freeing up memory regularly, an
>> alloc that would usually have triggered a collect will happen far, far
>> less often.
>> Practically, this means that the mark phase, which you say is the longest
>> phase, would be performed far less often.
>>
>>
> Well, if you want the ARC memory to share the heap with the GC the ARC memory will need to be tracked and marked by the GC. Otherwise the GC might try to allocate over the top of ARC memory and vice versa. This means that every time you run a collection you're still marking all ARC+GC memory, that will induce a pause. And the GC will still STW-collect on random allocations, and it will still have to Mark all ARC memory to make sure it's still valid. So yes, there will be fewer pauses, but they will still be there.
>

I'm not bothered in the least. At that stage, I will have turned the GC off, and I'll handle weak references myself. The GC crowd are then welcome to go on and continue improving the GC in whatever way they plan to do so.

>> For me and my kind, I think the typical approach would be to turn off the
>> backing GC, and rely on marking weak references correctly.
>> This satisfies my requirements, and I also lose nothing in terms of
>> facilities in Phobos or other libraries (assuming that those libraries
>> have
>> also marked weak references correctly, which I expect phobos would
>> absolutely be required to do).
>>
>> This serves both worlds nicely, I retain access to libraries since they
>> use
>> the same allocator, the GC remains (and is run less often) for those that
>> want care-free memory management, and for RT/embedded users, they can
>> *practically* disable the GC, and take responsibility for weak references
>> themselves, which I'm happy to do.
>>
>>
>>> Going the other way, GC is default with ARC support on the side, is not as
>>> troublesome from an implementation standpoint because the GC does not
>>> have
>>> to be taught about the ARC memory. This means that ARC memory is free of
>>> being tracked by the GC and the GC has less overall memory to track which
>>> makes collection cycles faster. However, I don't think that the
>>> RT/Embedded
>>> guys will like this either, because it means you are still paying for the
>>> GC at some point, and they'll never know for sure if a library they are
>>> using is going to GC-allocate (and collect) when they don't expect it.
>>>
>>>
>> It also means that phobos and other libraries will use the GC because it's
>> the default. Correct, I don't see this as a valid solution. In fact, I
>> don't see it as a solution at all.
>> Where would implicit allocations like strings, concatenations, closures be
>> allocated?
>> I might as well just use RefCounted, I don't see this offering anything
>> much more than that.
>>
>>> The only way I can see to make the ARC crowd happy is to completely replace
>>> the GC entirely, along with the attendant language changes (new keywords,
>>> etc) that are probably along the lines of Rust. I strongly believe that
>>> the
>>> reason we've never seen a GC backed ARC system is because in practice it
>>> doesn't completely solve any of the problems with either system but costs
>>> quite a bit more than either system on its own.
>>>
>>
>>
>> Really? [refer to my first paragraph in the reply]
>> It seems to me like ARC in front of a GC would result in the GC running
>> far
>> less collect cycles. And the ARC opposition would be absolved of having to
>> tediously mark weak references. Also, the GC opposition can turn the GC
>> off, and everything will still work (assuming they take care of their
>> cycles).
>> I don't really see the disadvantage here, except that the
>> only-GC-at-all-costs-I-won't-even-consider-ARC crowd would gain a
>> ref-count, but they would also gain the advantage where the GC would run
>> less collect cycles. That would probably balance out.
>>
>> I'm certain it would be better than what we have, and in theory,
>> everyone
>> would be satisfied.
>>
>
> I'm not convinced. Mostly, because it's not likely going to be good news for the GC crowd. First, now there are two GC algos running unpredictably at different times, so while you *might* experience a perf win in ARC-only mode, we'll probably pay for it in GC-backed ARC mode, because you still have the chance at non-deterministic pause lengths with ARC and you have the GC pauses, and they happen at different times (GC pause on allocate, ARC pause on delete).


I don't understand. How is ARC non-deterministic? It seems entirely
deterministic to me. And if you want to, you can defer destruction to idle
time if you fancy.
Sure, the GC may pause from time to time, but you already have that anyway.
In this case, it'll run much less often.

> Each individual pause length *might* be shorter, but there is no guarantee
> of that, but you end up paying more time on the whole than you would otherwise, remembering that with the GC on, the slow part of the collection has to be performed on all memory, not just the GC memory. So yes you might delete a bit less, but you're marking just as much, and you've still got those pesky ARC pauses to deal with.


Again, what ARC pauses? You mean object destruction time? Defer destruction if you find cleaning up on the spot to be expensive. GC will always have to scan all memory that is allocated. The fact that it's scanning ARC memory is precisely the point (catch cycles), and no additional cost in scan load, since that memory would all be GC memory anyway if ARC wasn't present.

> And in basically everything but games you measure time spent on resource
> management as a portion of CPU cycles over time, not time spent per frame.
>

I suspect that spending less time doing GC scans will result in a win overall. I have nothing to back that up, but it's a strong suspicion. Object destruction, which you seem to have a problem with under ARC, still happens even with a GC, just at some unknown time. It's not clear to me what the additional ARC cost is (other than the obvious inc and dec)... except that it facilitates spending less time doing GC collection, which is probably a significant saving.

> That ref-count you hand-wave can actually cost quite a lot. Implementing
> ARC properly can have some serious perf implications on pointer-ops and count-ops due to the need to make sure that everything is performed atomically.


I don't think this is true. There's no need to perform the ref fiddling
with atomic operations unless it's shared.
Everyone expects additional costs for synchronisation of shared objects.

> And since this is a compiler thing, you can't say "Don't atomically operate
> here because I will never do anything that might race." because the compiler has to assume that at some point you will and the compiler cannot detect which mode it needs, or if a library will ruin your day. The only way you could get around this is with yet more syntax hints for the compiler like '@notatomic'.
>

Ummm. I think D makes a clear assumption that if something isn't marked
shared, it doesn't have to compile code to protect against races.
That's the whole point of making shared an explicit attribute.
What you say is true in C++ which can't distinguish the cases, I don't
think it applies in D.
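
To make the distinction concrete, here is a minimal sketch (purely illustrative, not std.typecons.RefCounted): a count on thread-local data can be a plain integer, while a count on shared data has to be touched atomically.

    import core.atomic : atomicOp;

    // Thread-local payload: plain integer arithmetic on the count.
    struct Counted
    {
        size_t* count;
        void retain()  { ++*count; }
        void release() { if (--*count == 0) { /* free the payload */ } }
    }

    // Shared payload: every count operation must be atomic.
    struct SharedCounted
    {
        shared(size_t)* count;
        void retain()  { atomicOp!"+="(*count, 1); }
        void release() { if (atomicOp!"-="(*count, 1) == 0) { /* free */ } }
    }

Only the shared case pays for the atomics, which is exactly the split the shared attribute is there to express.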

> Very quickly ARC starts needing a lot of specialized syntax to make it work
> efficiently. And that's not good for "Modern Convenience".
>

Other than a weak attribute, what does it need? I'm not aware of anything else.

> However, you don't have to perform everything atomically with a GC as the
> collect phase can always be performed concurrently on a separate thread and in most cases,


You don't have to perform everything atomically in ARC, and the GC is definitely like that now.

> the mark phase can do stop-the-thread instead of stop-the-world and in some
> cases, it will never stop anything at all.


If I only have one core?
ARC doesn't need to mark at all; that cost is effectively distributed among
the inc/dec refs, and I'm confident that ++ and -- operations performed only
on relevant data, and only when it's copied, are much cheaper than the GC
scanning the whole heap and applying all that logic to determine which
things are pointers it needs to follow and which are not.

> That can very easily result in pause times that are less than ARC on
> average. So now I've got a marginal improvement in the speed of ARC over the GC at best, and I still haven't paid for the GC.
>

What pause does the ARC produce?

Are you paying an ambient cost for the GC? When the ARC is doing its job,
the GC wouldn't run. When too many un-handled cycles add up, the GC might
run a scan. If/when the GC does run a scan, it's precisely because you
_haven't_ already paid the cost via the ARC; the ARC missed it due to a
cycle, no cost was paid, no time was lost, and the GC collected it instead.
I don't see how you can sum the costs of the 2 collection mechanisms in
this case.
Either the ARC cleans it up, and the GC doesn't. Or the GC cleans it up
because the ARC didn't.

ARC destruction can easily be deferred, and unlike a mark phase which MUST be executed entirely in one step, it is possible to process deferred ARC object destruction at leisure, using only idle time for instance, and arbitrary lengths of time are easily supported.
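
As a sketch of what deferred destruction could look like (hypothetical helpers, not an existing API): objects whose count reaches zero go onto a queue, and the queue is drained against a time budget whenever the application is idle.

    import core.time : Duration, MonoTime;

    Object[] pending;   // zero-count objects awaiting destruction

    void deferRelease(Object o) { pending ~= o; }

    // Destroy queued objects until the time budget runs out.
    void drainPending(Duration budget)
    {
        immutable deadline = MonoTime.currTime + budget;
        while (pending.length && MonoTime.currTime < deadline)
        {
            destroy(pending[$ - 1]);
            pending.length -= 1;
        }
    }

A frame-based application could, for instance, call drainPending with a couple of milliseconds of budget at the end of each frame.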

> And if we disable the GC to get the speed back


I still don't follow, we never lost anything, we only moved it.

> we now require that everyone on the team learns the specialized rules and
> syntax for cyclic-refs. That might be relatively easy for someone coming from C++, but it will be difficult to teach someone coming from C#/Java, which is statistically the more likely person to come to D. And indeed would be more than enough to stop my company moving to D.
>

Don't turn the GC off. Indeed, that would be silly for most applications. You need to clarify how moving some collection cost from the GC to ARC makes it more expensive; I still can't see it. As far as I can see, everything the ARC cleans up is burden lifted from the GC; it could only result in the GC running less often, and ARC is not by nature more expensive than GC. I suspect ARC has a lower net cost, since ++/-- only on relevant things, and only when they're copied, is probably a lot less complicated work than a full mark phase, which touches everything and follows many pointers, mostly unnecessarily. The GC mark phase is quite a large workload, increases proportionally to the working data set, and is also an absolute dcache disaster. No way inc/dec could compare to that workload, particularly as the heap grows large or nears capacity.

> I've seen you say more than once that you can't bond with the GC, and
> believe me I understand, if you search back through the forums, you'll find one of the first things I did when I got here was complain about the GC. But what you're saying is "I can't bond with this horrible GC so we need to throw it out and rebuild the compiler to support ARC." All I am saying is "I can't bond with the horrible GC, so why don't we make a better one that doesn't ruin responsiveness, because I've seen it done in other places and there is no technical reason D can't do the same, or better." Now that I've started to read the GC Handbook I am starting to suspect that using D, there might be a way to create a completely pause-less GC. Don't hold me to it, I don't know enough yet, but D has some unique capabilities that might just make it possible.


Well, when you know, I'm all ears. Truly, I am. But I can't imagine a GC
that will work acceptably in environments such as limited memory, realtime,
single core, or various combinations of those.
I also get the feeling you haven't thought through the cost distribution of
a GC-backed ARC solution. Either that, or I haven't done my math correctly,
in which case I'd be happy to have it demonstrated where I went wrong.


February 05, 2014
On Wednesday, 5 February 2014 at 00:53:06 UTC, Ola Fosheim Grøstad wrote:
> But does D actually make sure that immutable types sit in non-write-back memory marked by MTRR?

Guess I have to end this monologue by concluding that changing the PAT is the only reasonable way I have found so far for doing this, which is too expensive for short-lived data.

So this strategy of turning immutable data into a non-WB memory type in order to avoid cache coherency latencies is only going to work in an OS-free environment AFAIK (like BareMetal OS).
February 05, 2014
On Wednesday, 5 February 2014 at 01:34:59 UTC, Frank Bauer wrote:
>     let a = ~T;
>     let r = ~identity(a);

The equivalent in D for a GC pointer could be:
    a = new T;
    r = new identity(a);

or, while we're at it:

    a2 = newOwn T;
    r2 = newOwn identity(a2);
    a3 = newARC T;
    r3 = newARC identity(a3);

or even:

    a4 = new T;
    r4 = newOwn identity(a4);

;)
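
For what it's worth, something along those lines can be spelled as plain library functions today, without new keywords (sketch only; newGC and newRC are hypothetical names, not existing Phobos functions):

    import std.typecons : RefCounted;

    T* newGC(T, Args...)(Args args) { return new T(args); }          // GC heap
    auto newRC(T, Args...)(Args args) { return RefCounted!T(args); } // ref-counted

    struct Point { int x; }

    void main()
    {
        auto a = newGC!Point(42);
        auto b = newRC!Point(42);
    }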
February 05, 2014
On 4 February 2014 19:59, Don <x@nospam.com> wrote:

> On Tuesday, 4 February 2014 at 03:43:53 UTC, ed wrote:
>
>>
>>
>> Most of us know and understand the issues with ARC and that with a GC. Many of us have seen how they play out in systems level development. There is a good reason all serious driver and embedded development is done in C/C++.
>>
>> A language is the compiler+std as one unit. If Phobos depends on the GC, D depends on the GC. If Phobos isn't systems-level ready, D isn't systems-level ready. I've heard arguments here that you can turn off the GC, but that equates to rewriting functions that already exist in Phobos and not using any third-party library.
>>
>
> At Sociomantic, that is exactly what we have done. Phobos is almost completely unusable at the present time.
>
> I personally don't think that ARC would make much difference. The problem is that *far* too much garbage is being created. And it's completely unnecessary in most cases.
>

I agree here. A new collector mustn't distract from the task of reducing the amount of garbage produced in the first place.

I've never suggested ARC will make a wild difference in terms of
performance in the standard use case; that's not the point (although I do
imagine it would be faster).
I'm saying that ARC is not fundamentally incompatible with many kinds of
workloads, and offers the application a level of flexibility that's not
available under a GC alone. It's an enabler for some whole new industries
to use D with confidence.


February 05, 2014
On Wednesday, 5 February 2014 at 02:00:00 UTC, Manu wrote:
> I'm saying that ARC is not fundamentally incompatible with many kinds of
> workloads, and offers the application a level of flexibility that's not
> available under a GC alone. It's an enabler for some whole new industries
> to use D with confidence.

That's true, ARC will improve latencies. There is a reason why languages like Python, Perl and PHP have until now been primarily ARC-based, but are moving more towards GC as CPU speed has increased. The freeze-the-world time a GC incurs was unacceptable 15-20 years ago, even for non-realtime stuff.

It is also true that ARC+GC will make the GC run very seldom, perhaps never, depending on the thresholds you set up and the patterns you use.
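
As an illustration of such a threshold (not how druntime's GC actually decides when to collect), an allocator could fall back to a tracing collection only after a set volume of allocations has bypassed the reference counts:

    import core.memory : GC;

    enum size_t collectThreshold = 64 * 1024 * 1024;
    size_t bytesSinceCollect;

    void* allocate(size_t size)
    {
        if (bytesSinceCollect > collectThreshold)
        {
            GC.collect();            // catch the cycles the counts missed
            bytesSinceCollect = 0;
        }
        bytesSinceCollect += size;
        return GC.malloc(size);
    }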
February 05, 2014
On Wednesday, 5 February 2014 at 00:50:55 UTC, Walter Bright wrote:
> On 2/4/2014 4:07 PM, Frank Bauer wrote:
>> Regarding different pointer types in Rust as function arguments:
>>
>> A function that takes a borrowed pointer &T can also be called with an owning
>> pointer ~T, an RC pointer Rc<T>, or a GC pointer Gc<T>. They all convert neatly
>> to a &T. One function to rule them ... err .. accomodate all.
>
> Again, what happens with:
>
>     T identity(T t) { return t; }
>
> ? I.e. the transference of the argument pointer type to the return type?

The compiler already infers @safe nothrow pure on template functions. Couldn't it do the same for @nullable?
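
For reference, this is the behaviour being referred to: attributes are inferred for templates today, so the following compiles as @safe nothrow pure with no annotations, and in principle the same machinery could infer a hypothetical @nullable.

    T identity(T)(T t) { return t; }

    @safe nothrow pure unittest
    {
        int* p = null;
        assert(identity(p) is null);   // T = int*; attributes inferred
    }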
February 05, 2014
On Wednesday, 5 February 2014 at 01:16:47 UTC, Francesco Cattoglio wrote:
> 5. Fix @property already :P

Yes, we should be able to do this with a minimal amount of pain, and it'd be so nice to have.