January 23, 2007
Walter Bright Wrote:

> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
> 
> So problems can happen if the type of allocated memory is changed after it is allocated, via casting. The way to avoid this is to allocate as void[] memory that will be cast, as in:
> 
> struct Foo { ... };
> Foo[] f;
> 
> p = new void[100];
> f = cast(Foo[])p;	// ok
> byte[] b = cast(byte[])p;
> f = cast(Foo[])b;	// ok, GC still regards memory as void[]
> 
> rather than:
> 
> p = new byte[100];
> f = cast(Foo[])p;	// will likely eventually corrupt memory
> 
> The GC will regard void[] as a chunk of memory containing arbitrary data of arbitrary types, and so will treat it conservatively.
> 
> In general, regard memory allocated by the GC as anything other than void[] as being *strongly typed*.

Good stuff.

January 23, 2007
On Mon, 22 Jan 2007 18:38:05 -0500, Walter Bright <newshound@digitalmars.com> wrote:

> Andrei Alexandrescu (See Website For Email) wrote:
>> Sounds great. Any possibility to type memory post factum (useful in e.g. implementing allocators)? That is, allocate memory as void[] and then mark it to be of type Foo.
>
> Looks like that will have to be implemented.

What about allocating a bigger chunk and dividing it up among a few types?

I have an idea for how to make this work: all allocated memory could have another chunk associated with it that denotes which words are pointers. It would only need one bit per pointer-sized word.

e.g.
struct Foo
{
   int x;
   int* y;
}
would only need 1 byte to track pointer info:  01000000
with a few bits either wasted or used for the allocation at the next address.

void[] would probably have all bits 1, and any non-pointer, like byte[], would be all bits 0.

This lets you very easily reuse memory. Memory reuse is important for some things, such as realtime programming or general use of memory pools.
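A minimal C sketch of the one-bit-per-word idea (the names and layout here are illustrative, not from any actual collector): each allocation carries a side bitmap with one bit per pointer-sized word, set wherever a pointer may live.

```c
#include <stdint.h>

/* Illustrative side bitmap: one bit per pointer-sized word of an
 * allocation; a set bit means "this word may hold a pointer". */
typedef uint8_t ptrmap_t;

/* Mark word i of the allocation as a possible pointer. */
static void mark_pointer(ptrmap_t *map, unsigned i)
{
    map[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Does word i possibly hold a pointer? A collector would scan
 * only the words for which this returns nonzero. */
static int is_pointer(const ptrmap_t *map, unsigned i)
{
    return (map[i / 8] >> (i % 8)) & 1;
}
```

For the Foo above on a 32-bit target, word 0 (x) stays clear and only mark_pointer(map, 1) is called for y, giving the 01000000 pattern from the post; void[] would have every bit set and byte[] none.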

- Chris
January 23, 2007
Chris Miller wrote:
> On Mon, 22 Jan 2007 18:38:05 -0500, Walter Bright <newshound@digitalmars.com> wrote:
> 
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Sounds great. Any possibility to type memory post factum (useful in e.g. implementing allocators)? That is, allocate memory as void[] and then mark it to be of type Foo.
>>
>> Looks like that will have to be implemented.
> 
> What about allocating a bigger chunk and dividing it up as a few types?
> 
> I have an idea how to make this work, all memory allocated could have another chunk associated with it that denotes what are pointers. It would only need to use one bit per pointer size.

counter-example:

union Tree
{
    int leaf;
    Tree* inner_node;
}

You need 2 bits to catch all cases :P.
(At least, if you want to allow a moving GC; your scheme should work for a non-moving collector)



A while back I thought up the following encoding for a moving GC:

Two bits: "follow" and "adjustable"

follow   adjustable    Meaning:
   0        0          Not a pointer.
   0        1          Weak pointer[1]
   1        0          May be a pointer[2]
   1        1          A normal pointer

follow: The GC follows these references to determine reachability
adjustable: The GC may adjust this value. A moving GC may only move an object if all references to it are adjustable.

(You can also think of "follow" as "GC may read this" and "adjustable" as "GC may write this")


[1]: Possible future feature. These should be set to null if the referenced object is collected.
I'm not yet sure what to do on explicit deletion; would it be too much trouble to require weak pointers to be nulled in that case?
For Object instances, you could use the Object.notify[Un]Register functions to achieve that. Unfortunately, that solution doesn't work for anything but class instances.
Also: weak pointers should probably be disallowed from occurring in unions.

[2]: For "mixed" union locations, void[], etc.
Can possibly also be used (temporarily) for pinned normal pointers.
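To make the table concrete, here is a small C sketch of the two-bit scheme (the names are invented for illustration): the mark phase reads only "follow" words, the relocate phase may rewrite only "adjustable" ones, so a follow-but-not-adjustable word keeps its target alive while pinning it in place.

```c
/* Two metadata bits per scanned word, per the table above. */
enum {
    FOLLOW     = 1, /* GC may read this word to trace reachability  */
    ADJUSTABLE = 2  /* GC may rewrite this word if the target moves */
};

/* Mark phase: may the GC trace through this word? */
static int may_follow(unsigned bits) { return (bits & FOLLOW) != 0; }

/* Relocate phase: may the GC rewrite this word after a move? */
static int may_adjust(unsigned bits) { return (bits & ADJUSTABLE) != 0; }

/* A "may be a pointer" word (follow without adjustable) keeps its
 * target alive but forbids moving it -- i.e. it pins the target. */
static int pins_target(unsigned bits)
{
    return may_follow(bits) && !may_adjust(bits);
}
```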
January 23, 2007
Walter Bright wrote:
> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
Does this mean the D garbage collector will be an exact (or precise) garbage collector?

 - Paul
January 23, 2007
Paul Findlay wrote:
> Walter Bright wrote:
>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
> Does this mean the D garbage collector will be an exact (or precise) garbage collector?

If a block contains pointers then the entire block will still be scanned rather than just the pointers in that block, so it will still be a conservative collector, but with fewer false positives.  I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so.  If false positives are still a concern (and I'm not convinced they will be), a better approach may be to pursue a different GC design such as the CBGC mentioned yesterday.


Sean
January 23, 2007
Sean Kelly wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
> 
> If a block contains pointers then the entire block will still be scanned instead of just the pointers in that block, so It will still be a conservative collector but with fewer false positives.

I thought that by "a GC that is aware of the types of what it is allocating" Walter meant a (mostly) precise GC.

In most cases, the locations of pointers can be deduced from the type, as long as the compiler generates appropriate RTTI. The exceptions (that I'm aware of) are unions with overlapping pointers and non-pointers, and void[]s (deliberately).

Even the types of variables in stack frames can be determined, I think, as long as you format them in the standard way (return eip & previous ebp at the start). Then it's basically a linked list:
-----
struct StackFrame
{
    // local variables are *before* this data (stack grows down)
    StackFrame* ebp;    // pointer to next StackFrame
    size_t eip;         // return address
}
-----
If you do a lookup on eip you should be able to determine what variables (or at least their "pointerness") the stack frame contains, with some help from compiler-generated metadata. Unknown stack frames (and data between the defined ranges of two consecutive stack frames) should be considered ambiguous.
Again, ambiguous union members and (static) void arrays are exceptions.
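Here is a C sketch of walking that chain (the struct mirrors the StackFrame above; in a real scanner each step would key fp->eip into compiler-generated metadata and scan the locals just below the frame, but this illustrative walk only counts frames):

```c
#include <stddef.h>

/* Mirrors the StackFrame layout from the post. */
struct StackFrame {
    struct StackFrame *ebp; /* caller's frame; NULL past the last known frame */
    size_t             eip; /* return address used for the metadata lookup    */
};

/* Follow the ebp links like a linked list. A precise scanner would,
 * at each frame, look up eip to learn the pointerness of the locals
 * stored just below it, treating unknown frames as ambiguous. */
static int count_frames(const struct StackFrame *fp)
{
    int n = 0;
    while (fp != NULL) {
        ++n;
        fp = fp->ebp;
    }
    return n;
}
```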

An alternative to eip-based lookup would be to push an additional value at the beginning of each function call, but that would
(a) break with non-D functions
and
(b) cost performance outside of the GC (even if the GC is disabled)

> I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so.

I'm not so sure about that "noticeable performance hit" you speak of. I think that would depend on the format of the RTTI and the relative amounts of pointers & non-pointers. (the latter is application-specific, by the way...)
To be sure about this we'd need to run some benchmarks and see how it fares in practice...

> If false positives are still a concern (and I'm not convinced they will be), a better approach may be to pursue a different GC design such as the CBGC mentioned yesterday.

False positives are a concern for some people; look at the thread "The problem with the D GC" in digitalmars.D[1]. I'm pretty sure you've seen it already. For one thing, it contains about 10 of your posts ;).

Note: I haven't read that article on CBGC yet, but I plan to.


[1]: digitalmars.D:46407, http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=46407
January 23, 2007
Sean Kelly wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
> 
> If a block contains pointers then the entire block will still be scanned instead of just the pointers in that block, so It will still be a conservative collector but with fewer false positives.  I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so. 


I was going to say: it sounds an awful lot like Walter is going to have the GC stamp each allocated block (or its corresponding record) with a field that amounts to "bool containsPointers".  I'm all for it, especially since the majority of allocations and re-allocations in a typical D program are for strings, images, and other non-pointer data.  This simple change will mean much smaller GC pauses, especially in the video games department.

> If false positives
> are still a concern (and I'm not convinced they will be), a better
> approach may be to pursue a different GC design such as the CBGC
> mentioned yesterday.

I haven't had a chance to read the paper myself, but the abstract left me with the impression that it's a good middle-of-the-road approach to things.

Just an aside here: I keep reading that as "CBGB".

-- 
- EricAnderton at yahoo
January 23, 2007
Paul Findlay wrote:
> Walter Bright wrote:
>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
> Does this mean the D garbage collector will be an exact (or precise) garbage collector?

No.
January 23, 2007
Pragma wrote:
> Sean Kelly wrote:
>> Paul Findlay wrote:
>>> Walter Bright wrote:
>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>
>> If a block contains pointers then the entire block will still be scanned instead of just the pointers in that block, so It will still be a conservative collector but with fewer false positives.  I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so. 
> 
> I was going to say: it sounds an awful lot like Walter is going to have the GC  stamp each allocated block (or its corresponding record) with a field that amounts to "bool containsPointers".  I'm all for it.  

That's basically it.  The Tango GC already works this way, but it keys off of element size for arrays or block size for non-arrays.  Walter's will be using TypeInfo to determine whether the block contains any pointers, so it will be far more accurate.  I imagine a situation could be contrived whereby struct or class data could cause false positives, but I don't expect it to be much of an issue in practice.

> Especially since the majority of allocations and re-allocations in a typical D program are for strings, images and other non-pointer related data.  This simple change will amount to much smaller GC pauses, especially in the video games department.

Yup.  This was the original motivation for the changes in Tango.  I figured that by simply eliminating char strings from scanning, things would improve significantly.  And they pretty much have.

>  > If false positives
>  > are still a concern (and I'm not convinced they will be), a better
>  > approach may be to pursue a different GC design such as the CBGC
>  > mentioned yesterday.
> 
> I haven't had a chance to read the paper myself, but the abstract left me with the impression that it's a good middle-of-the road approach to things.
> 
> Just an aside here: I keep reading that as "CBGB".

Me too :-)


Sean
January 23, 2007
Walter Bright wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
> 
> No.

So what exactly *does* it mean?
Is it, as Pragma put it, a "bool containsPointers" per block?
Or is it mostly-precise?

"Not exact/precise" still leaves a pretty big range...


And if it's just a bool, what was the reasoning behind this decision?
Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above?