View mode: basic / threaded / horizontal-split · Log in · Help
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Frits van Bommel wrote:
> Walter Bright wrote:
>> Paul Findlay wrote:
>>> Walter Bright wrote:
>>>> To improve GC performance, we need to transition to a GC that is 
>>>> aware of the types of what it is allocating.
>>> Does this mean the D garbage collector will be an exact (or precise) 
>>> garbage collector?
>>
>> No.
> 
> So what exactly *does* it mean?
> Is it, as Pragma put it, a "bool containsPointers" per block?
> Or is it mostly-precise?
> 
> "Not exact/precise" still leaves a pretty big range...

All it means is that a bit gets set per block meaning if it might 
contain pointers or does not contain pointers. In the future, it might 
give a list of which offsets contain pointers.

> And if it's just a bool, what was the reasoning behind this decision?
> Ease of implementation? The overhead of the metadata needed to get to 
> mostly-precise? All of the above?

It's a fairly significant improvement for a small change.
January 23, 2007
Re: Transitioning to a type aware Garbage Collector
Walter Bright wrote:
> Frits van Bommel wrote:
>> Walter Bright wrote:
>>> Paul Findlay wrote:
>>>> Walter Bright wrote:
>>>>> To improve GC performance, we need to transition to a GC that is 
>>>>> aware of the types of what it is allocating.
>>>> Does this mean the D garbage collector will be an exact (or precise) 
>>>> garbage collector?
>>>
>>> No.
>>
>> So what exactly *does* it mean?
>> Is it, as Pragma put it, a "bool containsPointers" per block?
>> Or is it mostly-precise?
>>
>> "Not exact/precise" still leaves a pretty big range...
> 
> All it means is that a bit gets set per block meaning if it might 
> contain pointers or does not contain pointers. In the future, it might 
> give a list of which offsets contain pointers.

Just shooting from the hip here: one possibility for such a list that won't eat much space would be having each bit of a 
word (or dword) represent the presence of a pointer at that 4-byte offset.  The last bit set could just signify "this is 
a big object, conservatively scan beyond bits*4 bytes".  This would take advantage of the fact that most objects 
typically aren't very big.

Example (16 bit list):
0000 0000 0000 0000 - zero = no pointers present
0000 0000 0000 1001 - pointers at offsets 0 and 12
1111 1111 1111 1111 - max = I'm all pointers
1000 0000 1000 0001 - pointers at offsets 0 and 28, and force scan at offsets 60 through end of block.

Another option would be to make the resolution of the high-order bits more logarithmic, making it behave something like 
a bloom filter.

> 
>> And if it's just a bool, what was the reasoning behind this decision?
>> Ease of implementation? The overhead of the metadata needed to get to 
>> mostly-precise? All of the above?
> 
> It's a fairly significant improvement for a small change.


-- 
- EricAnderton at yahoo
January 24, 2007
Re: Transitioning to a type aware Garbage Collector
Pragma wrote:
> Walter Bright wrote:
>> Frits van Bommel wrote:
>>> Walter Bright wrote:
>>>> Paul Findlay wrote:
>>>>> Walter Bright wrote:
>>>>>> To improve GC performance, we need to transition to a GC that is 
>>>>>> aware of the types of what it is allocating.
>>>>> Does this mean the D garbage collector will be an exact (or 
>>>>> precise) garbage collector?
>>>>
>>>> No.
>>>
>>> So what exactly *does* it mean?
>>> Is it, as Pragma put it, a "bool containsPointers" per block?
>>> Or is it mostly-precise?
>>>
>>> "Not exact/precise" still leaves a pretty big range...
>>
>> All it means is that a bit gets set per block meaning if it might 
>> contain pointers or does not contain pointers. In the future, it might 
>> give a list of which offsets contain pointers.
> 
> Just shooting from the hip here: one possibility for such a list that 
> won't eat much space would be having each bit of a word (or dword) 
> represent the presence of a pointer at that 4-byte offset.  The last bit 
> set could just signify "this is a big object, conservatively scan beyond 
> bits*4 bytes".  This would take advantage of the fact that most objects 
> typically aren't very big.

I think there should also be a way to annotate as "array, repeat". This 
would likely be useful since big objects are likely to be arrays.
This would require one bit for a flag and some way to note the element 
size, though.
If we go by the assumption that most structs/unions aren't very big, 
that could perhaps be implemented as a byte total: 1 bit = flag, 7 bits 
= size - 1 (since 0-length elements aren't useful anyway :) ) would 
allow elements up to 128 bytes, which should cover most arrays.

If we want no more than 4 bytes of meta-info[1], that would leave 23 
bits for pointer flags and one "big object/element" flag. Objects up to 
95 bytes could then be specified exactly (23 * 4 + 3 bytes, the last 3 
of which can't contain a pointer).

And yes, that's the optimum with these parameters; moving 1 bit from 
size to pointer flags leaves the size at a max of 64, which is under 95.

For 8 bytes of meta-info[2], the optimum would be
8 bits of elt/object size
1 bit array flag
1 bit big object flag
54 bits pointer flags
max exactly-specified size: 219 (54 * 4 + 3)


[1]: That's the minimum alignment for anything containing pointers on 32 
bit systems. This would allow it to be put at the start of the block 
without padding for align(4) contents.
[2]: Minimum align of anything containing ptrs on 64 bit systems 
(presumably). Can of course also be used on 32 bit systems; IIRC doubles 
and structs containing them already have align(8) as default.


[...]


Wait... Since size == 0 is useless, we can also use *that* as "big" 
'flag'. That would mean:
32 bits: 1 + 7 + 24, max exact size: 99
64 bits: 1 + 8 + 55, max exact size: 223

And if the array bit is 0, you can use the size bits as extra pointer 
flags (and one for the "big" flag), allowing normal objects up to 
123/251 bytes for 32/64 bits, respectively.


</obsess over details>
January 24, 2007
Re: Transitioning to a type aware Garbage Collector
Walter Bright wrote:
> Frits van Bommel wrote:
>> Walter Bright wrote:
>>> Paul Findlay wrote:
>>>> Walter Bright wrote:
>>>>> To improve GC performance, we need to transition to a GC that is 
>>>>> aware of the types of what it is allocating.
>>>> Does this mean the D garbage collector will be an exact (or precise) 
>>>> garbage collector?
>>>
>>> No.
>>
>> So what exactly *does* it mean?
>> Is it, as Pragma put it, a "bool containsPointers" per block?
>> Or is it mostly-precise?
>>
>> "Not exact/precise" still leaves a pretty big range...
> 
> All it means is that a bit gets set per block meaning if it might 
> contain pointers or does not contain pointers. In the future, it might 
> give a list of which offsets contain pointers.
> 
>> And if it's just a bool, what was the reasoning behind this decision?
>> Ease of implementation? The overhead of the metadata needed to get to 
>> mostly-precise? All of the above?
> 
> It's a fairly significant improvement for a small change.

Does this incremental change include the RTTI support needed to 
implement algorithms like CBGC (There are others which could use it)? Or 
are you leaving that for a later time?
January 24, 2007
Re: Transitioning to a type aware Garbage Collector
Kyle Furlong wrote:
> Does this incremental change include the RTTI support needed to 
> implement algorithms like CBGC (There are others which could use it)? Or 
> are you leaving that for a later time?

It's most of the info, but not quite all of it. I plan to add the rest 
later.
January 25, 2007
Re: Transitioning to a type aware Garbage Collector
I was hoping this would happen some day.  Thank you Walter!

Walter Bright wrote:
> To improve GC performance, we need to transition to a GC that is aware 
> of the types of what it is allocating.
> 
> So problems can happen if the type of allocated memory is changed after 
> it is allocated, via casting. The way to avoid this is to allocate as 
> void[] memory that will be cast, as in:
> 
> struct Foo { ... };
> Foo[] f;
> 
> p = new void[100];
> f = cast(Foo[])p;    // ok
> byte[] b = cast(byte[])p;
> f = cast(Foo[])b;    // ok, GC still regards memory as void[]
> 
> rather than:
> 
> p = new byte[100];
> f = cast(Foo[])p;    // will likely eventually corrupt memory
> 
> The GC will regard void[] as a chunk of memory containing arbitrary data 
> of arbitrary types, and so will treat it conservatively.
> 
> In general, regard memory allocated by the GC as anything other than 
> void[] as being *strongly typed*.
Next ›   Last »
1 2 3
Top | Discussion index | About this forum | D home