January 23, 2007 Re: Transitioning to a type aware Garbage Collector | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | Frits van Bommel wrote: > Walter Bright wrote: >> Paul Findlay wrote: >>> Walter Bright wrote: >>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating. >>> Does this mean the D garbage collector will be an exact (or precise) garbage collector? >> >> No. > > So what exactly *does* it mean? > Is it, as Pragma put it, a "bool containsPointers" per block? > Or is it mostly-precise? > > "Not exact/precise" still leaves a pretty big range... All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers. > And if it's just a bool, what was the reasoning behind this decision? > Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above? It's a fairly significant improvement for a small change. |
January 23, 2007 Re: Transitioning to a type aware Garbage Collector | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote: > Frits van Bommel wrote: >> Walter Bright wrote: >>> Paul Findlay wrote: >>>> Walter Bright wrote: >>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating. >>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector? >>> >>> No. >> >> So what exactly *does* it mean? >> Is it, as Pragma put it, a "bool containsPointers" per block? >> Or is it mostly-precise? >> >> "Not exact/precise" still leaves a pretty big range... > > All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers. Just shooting from the hip here: one possibility for such a list that won't eat much space would be having each bit of a word (or dword) represent the presence of a pointer at that 4-byte offset. The last bit set could just signify "this is a big object, conservatively scan beyond bits*4 bytes". This would take advantage of the fact that most objects typically aren't very big. Example (16 bit list): 0000 0000 0000 0000 - zero = no pointers present 0000 0000 0000 1001 - pointers at offsets 0 and 12 1111 1111 1111 1111 - max = I'm all pointers 1000 0000 1000 0001 - pointers at offsets 0 and 28, and force scan at offsets 60 through end of block. Another option would be to make the resolution of the high-order bits more logarithmic, making it behave something like a bloom filter. > >> And if it's just a bool, what was the reasoning behind this decision? >> Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above? > > It's a fairly significant improvement for a small change. -- - EricAnderton at yahoo |
January 24, 2007 Re: Transitioning to a type aware Garbage Collector | ||||
---|---|---|---|---|
| ||||
Posted in reply to Pragma | Pragma wrote:
> Walter Bright wrote:
>> Frits van Bommel wrote:
>>> Walter Bright wrote:
>>>> Paul Findlay wrote:
>>>>> Walter Bright wrote:
>>>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>>>
>>>> No.
>>>
>>> So what exactly *does* it mean?
>>> Is it, as Pragma put it, a "bool containsPointers" per block?
>>> Or is it mostly-precise?
>>>
>>> "Not exact/precise" still leaves a pretty big range...
>>
>> All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers.
>
> Just shooting from the hip here: one possibility for such a list that won't eat much space would be having each bit of a word (or dword) represent the presence of a pointer at that 4-byte offset. The last bit set could just signify "this is a big object, conservatively scan beyond bits*4 bytes". This would take advantage of the fact that most objects typically aren't very big.
I think there should also be a way to annotate as "array, repeat". This would likely be useful since big objects are likely to be arrays.
This would require one bit for a flag and some way to note the element size, though.
If we go by the assumption that most structs/unions aren't very big, that could perhaps be implemented as a byte total: 1 bit = flag, 7 bits = size - 1 (since 0-length elements aren't useful anyway :) ) would allow elements up to 128 bytes, which should cover most arrays.
If we want no more than 4 bytes of meta-info[1], that would leave 23 bits for pointer flags and one "big object/element" flag. Objects up to 95 bytes could then be specified exactly (23 * 4 + 3 bytes, the last 3 of which can't contain a pointer).
And yes, that's the optimum with these parameters; moving 1 bit from size to pointer flags leaves the size at a max of 64, which is under 95.
For 8 bytes of meta-info[2], the optimum would be
8 bits of elt/object size
1 bit array flag
1 bit big object flag
54 bits pointer flags
max exactly-specified size: 219 (54 * 4 + 3)
[1]: That's the minimum alignment for anything containing pointers on 32 bit systems. This would allow it to be put at the start of the block without padding for align(4) contents.
[2]: Minimum align of anything containing ptrs on 64 bit systems (presumably). Can of course also be used on 32 bit systems; IIRC doubles and structs containing them already have align(8) as default.
[...]
Wait... Since size == 0 is useless, we can also use *that* as "big" 'flag'. That would mean:
32 bits: 1 + 7 + 24, max exact size: 99
64 bits: 1 + 8 + 55, max exact size: 223
And if the array bit is 0, you can use the size bits as extra pointer flags (and one for the "big" flag), allowing normal objects up to 123/251 bytes for 32/64 bits, respectively.
</obsess over details>
|
January 24, 2007 Re: Transitioning to a type aware Garbage Collector | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote:
> Frits van Bommel wrote:
>> Walter Bright wrote:
>>> Paul Findlay wrote:
>>>> Walter Bright wrote:
>>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>>
>>> No.
>>
>> So what exactly *does* it mean?
>> Is it, as Pragma put it, a "bool containsPointers" per block?
>> Or is it mostly-precise?
>>
>> "Not exact/precise" still leaves a pretty big range...
>
> All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers.
>
>> And if it's just a bool, what was the reasoning behind this decision?
>> Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above?
>
> It's a fairly significant improvement for a small change.
Does this incremental change include the RTTI support needed to implement algorithms like CBGC (There are others which could use it)? Or are you leaving that for a later time?
|
January 24, 2007 Re: Transitioning to a type aware Garbage Collector | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kyle Furlong | Kyle Furlong wrote:
> Does this incremental change include the RTTI support needed to implement algorithms like CBGC (There are others which could use it)? Or are you leaving that for a later time?
It's most of the info, but not quite all of it. I plan to add the rest later.
|
January 25, 2007 Re: Transitioning to a type aware Garbage Collector | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | I was hoping this would happen some day. Thank you Walter!
Walter Bright wrote:
> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>
> So problems can happen if the type of allocated memory is changed after it is allocated, via casting. The way to avoid this is to allocate as void[] memory that will be cast, as in:
>
> struct Foo { ... };
> Foo[] f;
>
> p = new void[100];
> f = cast(Foo[])p; // ok
> byte[] b = cast(byte[])p;
> f = cast(Foo[])b; // ok, GC still regards memory as void[]
>
> rather than:
>
> p = new byte[100];
> f = cast(Foo[])p; // will likely eventually corrupt memory
>
> The GC will regard void[] as a chunk of memory containing arbitrary data of arbitrary types, and so will treat it conservatively.
>
> In general, regard memory allocated by the GC as anything other than void[] as being *strongly typed*.
|
Copyright © 1999-2021 by the D Language Foundation