January 23, 2007
Walter Bright Wrote:

> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
> 
> So problems can happen if the type of allocated memory is changed after it is allocated, via casting. The way to avoid this is to allocate as void[] memory that will be cast, as in:
> 
> struct Foo { ... };
> Foo[] f;
> 
> p = new void[100];
> f = cast(Foo[])p;	// ok
> byte[] b = cast(byte[])p;
> f = cast(Foo[])b;	// ok, GC still regards memory as void[]
> 
> rather than:
> 
> p = new byte[100];
> f = cast(Foo[])p;	// will likely eventually corrupt memory
> 
> The GC will regard void[] as a chunk of memory containing arbitrary data of arbitrary types, and so will treat it conservatively.
> 
> In general, regard memory allocated by the GC as anything other than void[] as being *strongly typed*.

Good stuff.

January 23, 2007
On Mon, 22 Jan 2007 18:38:05 -0500, Walter Bright <newshound@digitalmars.com> wrote:

> Andrei Alexandrescu (See Website For Email) wrote:
>> Sounds great. Any possibility to type memory post factum (useful in e.g. implementing allocators)? That is, allocate memory as void[] and then mark it to be of type Foo.
>
> Looks like that will have to be implemented.

What about allocating a bigger chunk and dividing it up among a few types?

I have an idea for how to make this work: all allocated memory could have another chunk associated with it that denotes which words are pointers. It would only need one bit per pointer-sized word.

e.g.
struct Foo
{
   int x;
   int* y;
}
would only need 1 byte to track pointer info:  01000000
with a few bits either wasted or used for the allocation at the next address.

void[] would probably have all bits 1, and any non-pointer, like byte[], would be all bits 0.

This lets you very easily reuse memory. Memory reuse is important for some things, such as realtime programming or general use of memory pools.
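A minimal C sketch of the one-bit-per-word idea (the names and layout here are illustrative, not from any actual collector): each allocation carries a side bitmap with one bit per pointer-sized word, set wherever a pointer may live.

```c
#include <stdint.h>

/* Illustrative side bitmap: one bit per pointer-sized word of an
 * allocation; a set bit means "this word may hold a pointer". */
typedef uint8_t ptrmap_t;

/* Mark word i of the allocation as a possible pointer. */
static void mark_pointer(ptrmap_t *map, unsigned i)
{
    map[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Does word i possibly hold a pointer? A collector would scan
 * only the words for which this returns nonzero. */
static int is_pointer(const ptrmap_t *map, unsigned i)
{
    return (map[i / 8] >> (i % 8)) & 1;
}
```

For the Foo above on a 32-bit target, word 0 (x) stays clear and only mark_pointer(map, 1) is called for y, giving the 01000000 pattern from the post; void[] would have every bit set and byte[] none.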

- Chris
January 23, 2007
Chris Miller wrote:
> On Mon, 22 Jan 2007 18:38:05 -0500, Walter Bright <newshound@digitalmars.com> wrote:
> 
>> Andrei Alexandrescu (See Website For Email) wrote:
>>> Sounds great. Any possibility to type memory post factum (useful in e.g. implementing allocators)? That is, allocate memory as void[] and then mark it to be of type Foo.
>>
>> Looks like that will have to be implemented.
> 
> What about allocating a bigger chunk and dividing it up as a few types?
> 
> I have an idea how to make this work, all memory allocated could have another chunk associated with it that denotes what are pointers. It would only need to use one bit per pointer size.

counter-example:

union Tree
{
    int leaf;
    Tree* inner_node;
}

You need 2 bits to catch all cases :P.
(At least, if you want to allow a moving GC; your scheme should work for a non-moving collector)



A while back I thought up the following encoding for a moving GC:

Two bits: "follow" and "adjustable"

follow   adjustable    Meaning:
   0        0          Not a pointer.
   0        1          Weak pointer[1]
   1        0          May be a pointer[2]
   1        1          A normal pointer

follow: The GC follows these references to determine reachability
adjustable: The GC may adjust this value. A moving GC may only move an object if all references to it are adjustable.

(You can also think of "follow" as "GC may read this" and "adjustable" as "GC may write this")


[1]: Possible future feature. These should be set to null if the referenced object is collected.
I'm not yet sure what to do on explicit deletion; would it be too much trouble to require weak pointers to be nulled in that case?
For Object instances, you could use the Object.notify[Un]Register functions to achieve that. Unfortunately, that solution doesn't work for anything but class instances.
Also: weak pointers should probably be disallowed from occurring in unions.

[2]: For "mixed" union locations, void[], etc.
Can possibly also be used (temporarily) for pinned normal pointers.
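To make the table concrete, here is a small C sketch of the two-bit scheme (the names are invented for illustration): the mark phase reads only "follow" words, the relocate phase may rewrite only "adjustable" ones, so a follow-but-not-adjustable word keeps its target alive while pinning it in place.

```c
/* Two metadata bits per scanned word, per the table above. */
enum {
    FOLLOW     = 1, /* GC may read this word to trace reachability  */
    ADJUSTABLE = 2  /* GC may rewrite this word if the target moves */
};

/* Mark phase: may the GC trace through this word? */
static int may_follow(unsigned bits) { return (bits & FOLLOW) != 0; }

/* Relocate phase: may the GC rewrite this word after a move? */
static int may_adjust(unsigned bits) { return (bits & ADJUSTABLE) != 0; }

/* A "may be a pointer" word (follow without adjustable) keeps its
 * target alive but forbids moving it -- i.e. it pins the target. */
static int pins_target(unsigned bits)
{
    return may_follow(bits) && !may_adjust(bits);
}
```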
January 23, 2007
Walter Bright wrote:
> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
Does this mean the D garbage collector will be an exact (or precise) garbage collector?

 - Paul
January 23, 2007
Paul Findlay wrote:
> Walter Bright wrote:
>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
> Does this mean the D garbage collector will be an exact (or precise) garbage collector?

If a block contains pointers then the entire block will still be scanned rather than just the pointers in that block, so it will still be a conservative collector, but with fewer false positives.  I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so.  If false positives are still a concern (and I'm not convinced they will be), a better approach may be to pursue a different GC design such as the CBGC mentioned yesterday.


Sean
January 23, 2007
Sean Kelly wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
> 
> If a block contains pointers then the entire block will still be scanned instead of just the pointers in that block, so It will still be a conservative collector but with fewer false positives.

I thought that by "a GC that is aware of the types of what it is allocating" Walter meant a (mostly) precise GC.

In most cases, the locations of pointers can be deduced from the type, as long as the compiler generates appropriate RTTI. The exceptions (that I'm aware of) are unions with overlapping pointers and non-pointers, and void[]s (deliberately).

Even the types of variables in stack frames can be determined, I think, as long as you format them in the standard way (return eip & previous ebp at the start). Then it's basically a linked list:
-----
struct StackFrame
{
    // local variables are *before* this data (stack grows down)
    StackFrame* ebp;    // pointer to next StackFrame
    size_t eip;         // return address
}
-----
If you do a lookup on eip you should be able to determine what variables (or at least their "pointerness") the stack frame contains, with some help from compiler-generated metadata. Unknown stack frames (and data between the defined ranges of two consecutive stack frames) should be considered ambiguous.
Again, ambiguous union members and (static) void arrays are exceptions.
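Here is a C sketch of walking that chain (the struct mirrors the StackFrame above; in a real scanner each step would key fp->eip into compiler-generated metadata and scan the locals just below the frame, but this illustrative walk only counts frames):

```c
#include <stddef.h>

/* Mirrors the StackFrame layout from the post. */
struct StackFrame {
    struct StackFrame *ebp; /* caller's frame; NULL past the last known frame */
    size_t             eip; /* return address used for the metadata lookup    */
};

/* Follow the ebp links like a linked list. A precise scanner would,
 * at each frame, look up eip to learn the pointerness of the locals
 * stored just below it, treating unknown frames as ambiguous. */
static int count_frames(const struct StackFrame *fp)
{
    int n = 0;
    while (fp != NULL) {
        ++n;
        fp = fp->ebp;
    }
    return n;
}
```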

An alternative to eip-based lookup would be to push an additional value at the beginning of each function call, but that would
(a) break with non-D functions
and
(b) cost performance outside of the GC (even if the GC is disabled)

> I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so.

I'm not so sure about that "noticeable performance hit" you speak of. I think that would depend on the format of the RTTI and the relative amounts of pointers & non-pointers. (the latter is application-specific, by the way...)
To be sure about this we'd need to run some benchmarks and see how it fares in practice...

> If false positives are still a concern (and I'm not convinced they will be), a better approach may be to pursue a different GC design such as the CBGC mentioned yesterday.

False positives are a concern for some people; look at the thread "The problem with the D GC" in digitalmars.D[1]. I'm pretty sure you've seen it already. For one thing, it contains about 10 of your posts ;).

Note: I haven't read that article on CBGC yet, but I plan to.


[1]: digitalmars.D:46407, http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=46407
January 23, 2007
Sean Kelly wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
> 
> If a block contains pointers then the entire block will still be scanned instead of just the pointers in that block, so It will still be a conservative collector but with fewer false positives.  I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so. 


I was going to say: it sounds an awful lot like Walter is going to have the GC stamp each allocated block (or its corresponding record) with a field that amounts to "bool containsPointers".  I'm all for it, especially since the majority of allocations and re-allocations in a typical D program are for strings, images, and other non-pointer data.  This simple change will mean much smaller GC pauses, especially in the video games department.

> If false positives
> are still a concern (and I'm not convinced they will be), a better
> approach may be to pursue a different GC design such as the CBGC
> mentioned yesterday.

I haven't had a chance to read the paper myself, but the abstract left me with the impression that it's a good middle-of-the-road approach to things.

Just an aside here: I keep reading that as "CBGB".

-- 
- EricAnderton at yahoo
January 23, 2007
Paul Findlay wrote:
> Walter Bright wrote:
>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
> Does this mean the D garbage collector will be an exact (or precise) garbage collector?

No.
January 23, 2007
Pragma wrote:
> Sean Kelly wrote:
>> Paul Findlay wrote:
>>> Walter Bright wrote:
>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>
>> If a block contains pointers then the entire block will still be scanned instead of just the pointers in that block, so It will still be a conservative collector but with fewer false positives.  I think this could be changed so that just the pointers were scanned, but there would likely be a noticeable performance hit for doing so. 
> 
> I was going to say: it sounds an awful lot like Walter is going to have the GC  stamp each allocated block (or its corresponding record) with a field that amounts to "bool containsPointers".  I'm all for it.  

That's basically it.  The Tango GC already works this way, but it keys off of element size for arrays or block size for non-arrays.  Walter's will be using TypeInfo to determine whether the block contains any pointers, so it will be far more accurate.  I imagine a situation could be contrived whereby struct or class data could cause false positives, but I don't expect it to be much of an issue in practice.

> Especially since the majority of allocations and re-allocations in a typical D program are for strings, images and other non-pointer related data.  This simple change will amount to much smaller GC pauses, especially in the video games department.

Yup.  This was the original motivation for the changes in Tango.  I figured that by simply eliminating char strings from scanning, things would improve significantly.  And they pretty much have.

>  > If false positives
>  > are still a concern (and I'm not convinced they will be), a better
>  > approach may be to pursue a different GC design such as the CBGC
>  > mentioned yesterday.
> 
> I haven't had a chance to read the paper myself, but the abstract left me with the impression that it's a good middle-of-the road approach to things.
> 
> Just an aside here: I keep reading that as "CBGB".

Me too :-)


Sean
January 23, 2007
Walter Bright wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
> 
> No.

So what exactly *does* it mean?
Is it, as Pragma put it, a "bool containsPointers" per block?
Or is it mostly-precise?

"Not exact/precise" still leaves a pretty big range...


And if it's just a bool, what was the reasoning behind this decision?
Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above?