View mode: basic / threaded / horizontal-split · Log in · Help
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Walter Bright Wrote:

> To improve GC performance, we need to transition to a GC that is aware 
> of the types of what it is allocating.
> 
> So problems can happen if the type of allocated memory is changed after 
> it is allocated, via casting. The way to avoid this is to allocate as 
> void[] memory that will be cast, as in:
> 
> struct Foo { ... };
> Foo[] f;
> 
> p = new void[100];
> f = cast(Foo[])p;	// ok
> byte[] b = cast(byte[])p;
> f = cast(Foo[])b;	// ok, GC still regards memory as void[]
> 
> rather than:
> 
> p = new byte[100];
> f = cast(Foo[])p;	// will likely eventually corrupt memory
> 
> The GC will regard void[] as a chunk of memory containing arbitrary data 
> of arbitrary types, and so will treat it conservatively.
> 
> In general, regard memory allocated by the GC as anything other than 
> void[] as being *strongly typed*.

Good stuff.
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
On Mon, 22 Jan 2007 18:38:05 -0500, Walter Bright  
<newshound@digitalmars.com> wrote:

> Andrei Alexandrescu (See Website For Email) wrote:
>> Sounds great. Any possibility to type memory post factum (useful in  
>> e.g. implementing allocators)? That is, allocate memory as void[] and  
>> then mark it to be of type Foo.
>
> Looks like that will have to be implemented.

What about allocating a bigger chunk and dividing it up as a few types?

I have an idea how to make this work, all memory allocated could have  
another chunk associated with it that denotes what are pointers. It would  
only need to use one bit per pointer size.

e.g.
struct Foo
{
   int x;
   int* y;
}
would only need 1 byte to track pointer info:  01000000
with a few bits either wasted or used for the allocation at the next  
address.

void[] would probably have all bits 1, and any non-pointer, like byte[],  
would be all bits 0.

This lets you very easily reuse memory. Memory reuse is important for some  
things, such as with realtime programming or general use of memory pools.

- Chris
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Chris Miller wrote:
> On Mon, 22 Jan 2007 18:38:05 -0500, Walter Bright 
> <newshound@digitalmars.com> wrote:
> 
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Sounds great. Any possibility to type memory post factum (useful in 
>>> e.g. implementing allocators)? That is, allocate memory as void[] and 
>>> then mark it to be of type Foo.
>>
>> Looks like that will have to be implemented.
> 
> What about allocating a bigger chunk and dividing it up as a few types?
> 
> I have an idea how to make this work, all memory allocated could have 
> another chunk associated with it that denotes what are pointers. It 
> would only need to use one bit per pointer size.

counter-example:

union Tree
{
    int leaf;
    Tree* inner_node;
}

You need 2 bits to catch all cases :P.
(At least, if you want to allow a moving GC; your scheme should work for 
a non-moving collector)



A while back I thought up the following encoding for a moving GC:

Two bits: "follow" and "adjustable"

follow   adjustable    Meaning:
   0        0          Not a pointer.
   0        1          Weak pointer[1]
   1        0          May be a pointer[2]
   1        1          A normal pointer

follow: The GC follows these references to determine reachability
adjustable: The GC may adjust this value. A moving GC may only move an 
object if all references to it are adjustable.

(You can also think of "follow" as "GC may read this" and "adjustable" 
as "GC may write this")


[1]: Possible future feature. These should be set to null if the 
referenced object is collected.
I'm not yet sure what to do on explicit deletion; would it be too much 
trouble to require weak pointers to be nulled in that case?
For Object instances, you could use the Object.notify[Un]Register 
functions to achieve that. Unfortunately, that solution doesn't work for 
anything but class instances.
Also: weak pointers should probably be disallowed from occurring in unions.

[2]: For "mixed" union locations, void[], etc.
Can possibly also be used (temporarily) for pinned normal pointers.
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Walter Bright wrote:
> To improve GC performance, we need to transition to a GC that is aware 
> of the types of what it is allocating.
Does this mean the D garbage collector will be an exact (or precise) 
garbage collector?

 - Paul
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Paul Findlay wrote:
> Walter Bright wrote:
>> To improve GC performance, we need to transition to a GC that is aware 
>> of the types of what it is allocating.
> Does this mean the D garbage collector will be an exact (or precise) 
> garbage collector?

If a block contains pointers then the entire block will still be scanned 
instead of just the pointers in that block, so It will still be a 
conservative collector but with fewer false positives.  I think this 
could be changed so that just the pointers were scanned, but there would 
likely be a noticeable performance hit for doing so.  If false positives 
are still a concern (and I'm not convinced they will be), a better 
approach may be to pursue a different GC design such as the CBGC 
mentioned yesterday.


Sean
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Sean Kelly wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is 
>>> aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) 
>> garbage collector?
> 
> If a block contains pointers then the entire block will still be scanned 
> instead of just the pointers in that block, so It will still be a 
> conservative collector but with fewer false positives.

I thought that by "a GC that is aware of the types of what it is 
allocating" Walter meant a (mostly) precise GC.

In most cases, the locations of pointers can be deduced from the type, 
as long as the compiler generates appropriate RTTI. The exceptions (that 
I'm aware of) are unions with overlapping pointers and non-pointers, and 
void[]s (deliberately).

Even types of variables in stack frames can be determined I think, as 
long as you format them in the standard way (return eip & previous ebp 
at the start). Then it's basically a linked list:
-----
struct StackFrame
{
    // local variables are *before* this data (stack grows down)
    StackFrame* ebp;    // pointer to next StackFrame
    size_t eip;         // return address
}
-----
If you do a lookup on eip you should be able to determine what variables 
(or at least their "pointerness") the stack frame contains, with some 
help from compiler-generated metadata. Unknown stack frames (and data 
between the defined ranges of two consecutive stackframes) should be 
considered ambiguous.
Again, ambiguous union members and (static) void arrays are exceptions.

An alternative to eip-based lookup would be to push an additional value 
at the beginning of each function call, but that would
(a) break with non-D functions
and
(b) cost performance outside of the GC (even if the GC is disabled)

> I think this 
> could be changed so that just the pointers were scanned, but there would 
> likely be a noticeable performance hit for doing so.

I'm not so sure about that "noticeable performance hit" you speak of. I 
think that would depend on the format of the RTTI and the relative 
amounts of pointers & non-pointers. (the latter is application-specific, 
by the way...)
To be sure about this we'd need to run some benchmarks and see how it 
fares in practice...

> If false positives 
> are still a concern (and I'm not convinced they will be), a better 
> approach may be to pursue a different GC design such as the CBGC 
> mentioned yesterday.

False positives are a concern for some people; look at the thread "The 
problem with the D GC" in digitalmars.D[1]. I'm pretty sure you've seen 
it already. For one thing, it contains about 10 of your posts ;).

Note: I haven't read that article on CBGC yet, but I plan to.


[1]: digitalmars.D:46407, 
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=46407
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Sean Kelly wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is 
>>> aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) 
>> garbage collector?
> 
> If a block contains pointers then the entire block will still be scanned 
> instead of just the pointers in that block, so It will still be a 
> conservative collector but with fewer false positives.  I think this 
> could be changed so that just the pointers were scanned, but there would 
> likely be a noticeable performance hit for doing so. 


I was going to say: it sounds an awful lot like Walter is going to have the GC  stamp each allocated block (or its 
corresponding record) with a field that amounts to "bool containsPointers".  I'm all for it.  Especially since the 
majority of allocations and re-allocations in a typical D program are for strings, images and other non-pointer related 
data.  This simple change will amount to much smaller GC pauses, especially in the video games department.

> If false positives
> are still a concern (and I'm not convinced they will be), a better
> approach may be to pursue a different GC design such as the CBGC
> mentioned yesterday.

I haven't had a chance to read the paper myself, but the abstract left me with the impression that it's a good 
middle-of-the road approach to things.

Just an aside here: I keep reading that as "CBGB".

-- 
- EricAnderton at yahoo
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Paul Findlay wrote:
> Walter Bright wrote:
>> To improve GC performance, we need to transition to a GC that is aware 
>> of the types of what it is allocating.
> Does this mean the D garbage collector will be an exact (or precise) 
> garbage collector?

No.
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Pragma wrote:
> Sean Kelly wrote:
>> Paul Findlay wrote:
>>> Walter Bright wrote:
>>>> To improve GC performance, we need to transition to a GC that is 
>>>> aware of the types of what it is allocating.
>>> Does this mean the D garbage collector will be an exact (or precise) 
>>> garbage collector?
>>
>> If a block contains pointers then the entire block will still be 
>> scanned instead of just the pointers in that block, so It will still 
>> be a conservative collector but with fewer false positives.  I think 
>> this could be changed so that just the pointers were scanned, but 
>> there would likely be a noticeable performance hit for doing so. 
> 
> I was going to say: it sounds an awful lot like Walter is going to have 
> the GC  stamp each allocated block (or its corresponding record) with a 
> field that amounts to "bool containsPointers".  I'm all for it.  

That's basically it.  The Tango GC already works this way, but it keys 
off of element size for arrays or block size for non-arrays.  Walter's 
will be using TypeInfo to determine whether the block contains any 
pointers, so it will be far more accurate.  I imagine a situation could 
be contrived whereby struct or class data could cause false positives, 
but I don't expect it to be much of an issue in practice.

> Especially since the majority of allocations and re-allocations in a 
> typical D program are for strings, images and other non-pointer related 
> data.  This simple change will amount to much smaller GC pauses, 
> especially in the video games department.

Yup.  This was the original motivation for the changes in Tango.  I 
figured that by simply eliminating char strings from scanning, things 
would improve significantly.  And they pretty much have.

>  > If false positives
>  > are still a concern (and I'm not convinced they will be), a better
>  > approach may be to pursue a different GC design such as the CBGC
>  > mentioned yesterday.
> 
> I haven't had a chance to read the paper myself, but the abstract left 
> me with the impression that it's a good middle-of-the road approach to 
> things.
> 
> Just an aside here: I keep reading that as "CBGB".

Me too :-)


Sean
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Walter Bright wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is 
>>> aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) 
>> garbage collector?
> 
> No.

So what exactly *does* it mean?
Is it, as Pragma put it, a "bool containsPointers" per block?
Or is it mostly-precise?

"Not exact/precise" still leaves a pretty big range...


And if it's just a bool, what was the reasoning behind this decision?
Ease of implementation? The overhead of the metadata needed to get to 
mostly-precise? All of the above?
1 2 3
Top | Discussion index | About this forum | D home