Idea #1 on integrating RC with GC (page 11) - D Programming Language Discussion Forum


February 05, 2014

Re: Idea #1 on integrating RC with GC

Posted by Adam D. Ruppe
in reply to Andrei Alexandrescu

On Wednesday, 5 February 2014 at 22:32:52 UTC, Andrei Alexandrescu wrote:
> On the face of it it seems odd that reference counted chunks of typed memory are deemed useless

I don't think anybody has actually said that. They have their places, it is just useless to talk about throwing them in everywhere.

> I should also add that imparting useful semantics to scope is much more difficult than it might seem.

I'm not so sure about that*, but the fact is scope would be enormously useful if it was implemented.

* Let's say it meant "assigning to any higher scope is prohibited". That should be trivially easy to check and ensures that variable itself doesn't escape. The tricky part would be preventing:

int[] global;
void foo(scope int[] a) {
   int[] b = a;
   global = b;
}

And that's easy to fix too: make ALL variables scope, unless specifically marked otherwise at the type declaration site (or if they are value types OR references to immutable data, which are very similar to value types in use).

The type declaration can be marked as a reference encapsulation and those are allowed to be passed up (if the type otherwise allows; e.g. postblit is not disabled).

That would break a fair chunk of existing code**, but it'd make memory management explicit, correct, and user extensible.

** I think moving to not null by default at the same time would be good, just rip off the whole band-aid.

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Manu
in reply to Michel Fortin

On 6 February 2014 06:23, Michel Fortin <michel.fortin@michelf.ca> wrote:

> On 2014-02-05 18:26:38 +0000, Andrei Alexandrescu < SeeWebsiteForEmail@erdani.org> said:
>
>> On 2/5/14, 7:23 AM, Michel Fortin wrote:
>>> I don't think it makes much sense. ARC when used for D constructs should be treated as an alternate GC algorithm, not a different kind of pointer.
>>
>> Why? The RC object has a different layout, so it may as well have a different type.
>
> Well, it depends on your goal. If your goal is to avoid the garbage collector, you need all language constructs to use ARC. Having a single type in the language that relies on the GC defeats the purpose. What you want is simply to replace the current GC with another implementation, one that uses ARC. It shouldn't affect user code in any way, it's mostly an implementation detail (managed by the compiler and the runtime).
>
> If your goal is to have a deterministic lifetime for slices in some situations, then RCSlice as you propose it is fine. That said, with a library type you'll have a hard time making the optimizer elide redundant increment/decrement pairs, so it'll never be optimal. I'm also not sure there's a lot of use cases for a deterministic slice lifetime working side by side with memory managed by the current GC.
>
> To me it seems you're trying to address a third problem here: that people have complained that Phobos relies on the GC too much. This comes from people who either don't want the GC to pause anything, or people who want to reduce memory allocations altogether. For the former group, replacing the current GC with an ARC+GC scheme at the language level, with the possibility to disable the GC, will fix most of Phobos (and most other libraries) with no code change required. For the latter group, you need to make the API so that allocations are either not necessary, or when necessary provide a way to use a custom allocator of some sort.


This.

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Manu
in reply to Adam D. Ruppe

On 6 February 2014 09:16, Adam D. Ruppe <destructionator@gmail.com> wrote:

> On Wednesday, 5 February 2014 at 22:32:52 UTC, Andrei Alexandrescu wrote:
>
>> On the face of it it seems odd that reference counted chunks of typed memory are deemed useless
>>
>
> I don't think anybody has actually said that. They have their places, it is just useless to talk about throwing them in everywhere.
>
>> I should also add that imparting useful semantics to scope is much more difficult than it might seem.
>>
>
> I'm not so sure about that*, but the fact is scope would be enormously useful if it was implemented.
>
> * Let's say it meant "assigning to any higher scope is prohibited". That should be trivially easy to check and ensures that variable itself doesn't escape. The tricky part would be preventing:
>
> int[] global;
> void foo(scope int[] a) {
>    int[] b = a;
>    global = b;
> }
>
>
> And that's easy to fix too: make ALL variables scope, unless specifically marked otherwise at the type declaration site (or if they are value types OR references to immutable data, which are very similar to value types in use).
>

Surely a simpler solution is to mark b scope too? Does that break down at some point?

> The type declaration can be marked as a reference encapsulation and those are allowed to be passed up (if the type otherwise allows; e.g. postblit is not disabled).
>
> That would break a fair chunk of existing code**, but it'd make memory management explicit, correct, and user extensible.
>
> ** I think moving to not null by default at the same time would be good, just rip off the whole band-aid.
>

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Mike
in reply to Michel Fortin

On Wednesday, 5 February 2014 at 20:23:13 UTC, Michel Fortin wrote:

>
> What you want is simply to replace the current GC with another implementation, one that uses ARC. It shouldn't affect user code in any way, it's mostly an implementation detail (managed by the compiler and the runtime).

Yes.

>
> To me it seems you're trying to address a third problem here: that people have complained that Phobos relies on the GC too much.

Yes.

> This comes from people who either don't want the GC to pause anything, or people who want to reduce memory allocations altogether. For the former group, replacing the current GC with an ARC+GC scheme at the language level, with the possibility to disable the GC, will fix most of Phobos (and most other libraries) with no code change required.

Yes.

> For the latter group, you need to make the API so that allocations are either not necessary, or when necessary provide a way to use a custom allocator of some sort.

... and Yes

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Andrei Alexandrescu
in reply to Adam D. Ruppe

On 2/5/14, 3:16 PM, Adam D. Ruppe wrote:
> On Wednesday, 5 February 2014 at 22:32:52 UTC, Andrei Alexandrescu wrote:
>> On the face of it it seems odd that reference counted chunks of typed
>> memory are deemed useless
>
> I don't think anybody has actually said that. They have their places, it
> is just useless to talk about throwing them in everywhere.

I think part of the problem is a disconnect in assumptions and expectations. My idea was to simply make a first simple and obvious step toward improving the situation. Apparently that wasn't quite understood because ten people have eleven notions about what's desirable and even possible with regard to alternate memory management schemes.

One school of thought seems to be that D should be everything it is today, just with reference counting throughout instead of garbage collection. One build flag to rule them all would choose one or the other.

One other school of thought (to which I subscribe) is that one should take advantage of reference counting where appropriate within a GC milieu, regardless of more radical RC approaches that may be available.

>> I should also add that imparting useful semantics to scope is much
>> more difficult than it might seem.
>
> I'm not so sure about that*, but the fact is scope would be enormously
> useful if it was implemented.
>
> * Let's say it meant "assigning to any higher scope is prohibited". That
> should be trivially easy to check and ensures that variable itself
> doesn't escape. The tricky part would be preventing:
>
> int[] global;
> void foo(scope int[] a) {
>     int[] b = a;
>     global = b;
> }
>
>
> And that's easy to fix too: make ALL variables scope, unless
> specifically marked otherwise at the type declaration site (or if they
> are value types OR references to immutable data, which are very similar
> to value types in use).

Yah, that does break a bunch of code. Things like the type of "this" in class objects also come to mind. Binding ref is also a related topic. All of these are complex matters, and I think a few simple sketches don't do them justice.

Andrei

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Michel Fortin
in reply to Andrei Alexandrescu

On 2014-02-05 22:19:27 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

> I want to make one positive step toward improving memory allocation in the D language.

I know. But I find your proposal confusing.

Perhaps this is just one piece in your master plan where everything will make sense once we have all the pieces. But this piece by itself makes no sense to me; I have no idea where you're going with it.

Is this the continuation of the old thread where you wanted ideas about how to eliminate hidden allocations in buildPath? Doesn't sound like it.

Or is this about implementing ARC in the language for those who can't use the GC? The changes for this need to be done at a lower level (compiler, runtime), and no change would be required in Phobos.

Or maybe this is to please the @nogc crowd by making things reference-counted by default? While I'm not a fan of @nogc, this will not work for them either, as your proposal allocates from GC memory and this will sometimes trigger a collection cycle.

Or maybe you're trying to address the following issue: if we change D's GC to use the ARC+GC scheme, what if I don't want to increment/decrement at pointer assignment and instead rely purely on mark and sweep for certain pointers? I'm not sure if someone asked for that yet, but I guess it could be a valid concern.

So, what problem are we trying to solve again?

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Adam D. Ruppe
in reply to Andrei Alexandrescu

On Thursday, 6 February 2014 at 00:42:20 UTC, Andrei Alexandrescu wrote:
> One other school of thought (to which I subscribe) is that one should take advantage of reference counting where appropriate within a GC milieu, regardless of more radical RC approaches that may be available.

I agree with that stance, but I don't think there's a blanket rule there. I think RC freeing small slices will waste more time than it saves. Large allocations, on the other hand, might be worth it. So std.file.read for example returns a large block - that's a good candidate for refcounting since it might be accidentally subject to false pointers, or sit around too long creating needless memory pressure, etc.
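A rough sketch of what refcounting one large block could look like, assuming a hand-rolled wrapper over malloc/free. RcBuffer and its members are hypothetical names made up for illustration; this is not a Phobos type and not a proposal from this thread:

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical refcounted owner of one large malloc'd block.
struct RcBuffer
{
    private ubyte* data;
    private size_t len;
    private size_t* count;   // shared by every copy of this buffer

    this(size_t n)
    {
        data = cast(ubyte*) malloc(n);
        count = cast(size_t*) malloc(size_t.sizeof);
        assert(data !is null && count !is null, "allocation failed");
        len = n;
        *count = 1;
    }

    // Postblit: copying the struct adds one owner.
    this(this) { if (count) ++*count; }

    // The last owner to go away frees the block immediately.
    ~this()
    {
        if (count && --*count == 0)
        {
            free(data);
            free(count);
        }
    }

    size_t owners() const { return count ? *count : 0; }
    inout(ubyte)[] slice() inout { return data[0 .. len]; }
}

void main()
{
    auto a = RcBuffer(1 << 20);      // one 1 MiB block, outside the GC heap
    assert(a.owners() == 1);
    {
        auto b = a;                  // second owner, same block
        assert(a.owners() == 2);
        b.slice()[0] = 7;
        assert(a.slice()[0] == 7);
    }                                // b destroyed here, count drops
    assert(a.owners() == 1);
}
```

The count costs one increment or decrement per copy, which is cheap next to a megabyte-sized block but would dominate for the tiny slices mentioned above.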

(My png.d used to use large GC allocations internally and it ended up being problematic. I switched to malloc/free for this specific task and took care of that problem. But the little garbage created by stuff like toLower has never been a problem to me. (Well, except in a tight loop, but I wouldn't want to refcount in a tight loop either, reusing a static buffer is better there.))

Anywho, I'd just go through on a case-by-case basis and tackle the big fish. Of course, a user could just do scope(exit) GC.free(ret); too.
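A minimal sketch of that user-side option, assuming the block returned by std.file.read comes from the GC heap and that nothing keeps a reference to it after the function returns. GC.free takes a pointer, so the slice's .ptr is passed; checksum and the file name are hypothetical:

```d
import core.memory : GC;
import std.file : read, remove, tempDir, write;
import std.path : buildPath;

// Reads a whole file, sums its bytes, and frees the block on the way out.
ulong checksum(string path)
{
    void[] ret = read(path);         // one large allocation
    scope(exit) GC.free(ret.ptr);    // released at scope exit, not at some later collection

    ulong sum = 0;
    foreach (b; cast(const(ubyte)[]) ret)
        sum += b;
    return sum;                      // `ret` must not escape or be used after this
}

void main()
{
    auto path = buildPath(tempDir(), "rc_gc_example.bin"); // hypothetical file name
    auto bytes = new ubyte[](1 << 20);
    bytes[] = 1;
    write(path, bytes);
    scope(exit) remove(path);

    assert(checksum(path) == (1 << 20));
}
```

This is only safe because the caller owns the block outright; freeing memory that something else still points to is exactly the bug a working scope would catch.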

> Yah, that does break a bunch of code. Things like the type of "this" in class objects also come to mind.

I talked about this yesterday: "this" should be scope too, since an object doesn't know its own allocation method. If the class object is on the stack, escaping "this" is wrong... and thanks to emplace, it might be on the stack without the object ever knowing. Thus it must be conservative.
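A small sketch of the emplace point, assuming a size_t array gives enough alignment for this particular class. Widget and livesIn are hypothetical names; the class has no way to tell which of the two objects it is:

```d
import std.conv : emplace;

class Widget
{
    int id;
    this(int id) { this.id = id; }
}

// True when the object's storage starts inside `buf`.
bool livesIn(const Object o, const(void)[] buf)
{
    auto p = cast(const(ubyte)*) cast(const(void)*) o;
    auto start = cast(const(ubyte)*) buf.ptr;
    return p >= start && p < start + buf.length;
}

void main()
{
    // Raw stack storage, rounded up to whole size_t words for alignment.
    enum size = __traits(classInstanceSize, Widget);
    size_t[(size + size_t.sizeof - 1) / size_t.sizeof] buf;

    Widget onStack = emplace!Widget(buf[], 1);  // constructed inside `buf`
    Widget onHeap = new Widget(2);              // ordinary GC allocation

    assert(livesIn(onStack, buf[]));
    assert(!livesIn(onHeap, buf[]));
    assert(onStack.id == 1 && onHeap.id == 2);
}
```

If a Widget method stored "this" in a global, the onStack case would leave a dangling reference once main returns, which is why the rule would have to be conservative for every class.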

> Binding ref is also a related topic. All of these are complex matters, and I think a few simple sketches don't do them justice.

I'd rather discuss these details than adding RCSlice and toGC everywhere for more cost than benefit.

Note that working scope would also help with library RC, in efficiency, correctness, and ease of use.

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Ola Fosheim Grøstad
in reply to Andrei Alexandrescu

On Thursday, 6 February 2014 at 00:42:20 UTC, Andrei Alexandrescu wrote:
> One school of thought seems to be that D should be everything it is today, just with reference counting throughout instead of garbage collection. One build flag to rule them all would choose one or the other.
>
> One other school of thought (to which I subscribe) is that one should take advantage of reference counting where appropriate within a GC milieu, regardless of more radical RC approaches that may be available.

The third school of thought is that one should be able to have
different types of allocation schemes without changing the object
type, but somehow tie it to the allocator and if needed to the
pointer type/storage class.

If you allocate as fully owned, it stays owned. If you allocate as shared with immediate release (RC) it stays shared. If you allocate as shared with delayed collection (GC) it stays that way.

The RC/GC metadata is a hidden feature and
allocator/runtime/compiler dependent component. Possibly you
won't have GC or RC, but one pure GC runtime, one integrated
RC/GC runtime, one pure ARC runtime, one integrated ARC/GC
runtime etc. That's probably most realistic since the allocation
metadata might be in conflict.

You should be able to switch to the runtime you care about if
needed as a compile time switch:

1. Pure Owned/borrowed: hard core performance, OS level development

2. Manual RC (+GC): high throughput, low latency

3. ARC (+GC): ease of use, low throughput, low latency

4. GC: ease of use, high throughput, higher latency, long lived

5. Realtime GC

6. ??

I see no reason for having objects treated differently if they are "owned", just because they have a different type of ownership. If the function dealing with it does not own it, but borrows it, then it should not matter. The object should have the same layout, the ownership/allocation metadata should be encapsulated and hidden.

It is only when you transfer ownership that you need to know if the object is under RC or not. You might not even want to use counters in a particular implementation, maybe it is better to use a linked list in some scenarios. "reference counting" is a misnomer, it should be called "ownership tracker".

The current default is that all pointers are shared. What D needs is defined semantics for ownership. Then you can start switching one runtime for another one and have the compiler/runtime act as an efficient unit.

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Brad Anderson
in reply to Manu

On Thursday, 6 February 2014 at 00:32:05 UTC, Manu wrote:
> On 6 February 2014 09:16, Adam D. Ruppe <destructionator@gmail.com> wrote:
>>
>> And that's easy to fix too: make ALL variables scope, unless specifically
>> marked otherwise at the type declaration site (or if they are value types
>> OR references to immutable data, which are very similar to value types in
>> use).
>>
>
> Surely a simpler solution is to mark b scope too? Does that break down at
> some point?

scope in declarations is currently used as a storage class for classes on the stack (a deprecated feature), so it couldn't be used for class references until it's been removed from the language for a while. It does seem like it'd help the compiler a lot if you disallowed assigning scope references to variables not also marked as scope, though.

It's probably hard to evaluate how big of a pain it'd be without using it in real world code. It'd technically be a breaking change, but scope isn't implemented at all anyway, so I think current users of scope would probably welcome a change that would make it actually start working.

February 06, 2014

Re: Idea #1 on integrating RC with GC

Posted by Andrei Alexandrescu
in reply to Michel Fortin

On 2/5/14, 4:53 PM, Michel Fortin wrote:
> On 2014-02-05 22:19:27 +0000, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> said:
>
>> I want to make one positive step toward improving memory allocation in
>> the D language.
>
> I know. But I find your proposal confusing.
>
> Perhaps this is just one piece in your master plan where everything will
> make sense once we have all the pieces. But this piece by itself makes
> no sense to me; I have no idea where you're going with it.
>
> Is this the continuation of the old thread where you wanted ideas about
> how to eliminate hidden allocations in buildPath? Doesn't sound like it.

Actually buildPath is a good example because it concatenates strings. It should work transparently with RC and GC strings.
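For context, a minimal sketch of where the allocation in buildPath shows up, assuming a Phobos version that also ships std.path.chainPath (a lazy counterpart that is not mentioned in this thread). The directory and file names are arbitrary:

```d
import std.algorithm.comparison : equal;
import std.path : buildPath, chainPath;

void main()
{
    string dir = "usr";
    string name = "lib";

    // buildPath concatenates into a newly allocated string:
    // both segments plus one directory separator.
    string joined = buildPath(dir, name);
    assert(joined.length == dir.length + 1 + name.length);

    // chainPath walks the same characters lazily, without building a new string.
    assert(equal(chainPath(dir, name), joined));
}
```

Whether the concatenated result ends up in GC memory or in a refcounted block is the part the proposal leaves to the string type being passed in.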

Andrei

Copyright © 1999-2021 by the D Language Foundation