January 23, 2007
Frits van Bommel wrote:
> Walter Bright wrote:
>> Paul Findlay wrote:
>>> Walter Bright wrote:
>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>
>> No.
> 
> So what exactly *does* it mean?
> Is it, as Pragma put it, a "bool containsPointers" per block?
> Or is it mostly-precise?
> 
> "Not exact/precise" still leaves a pretty big range...

All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers.

> And if it's just a bool, what was the reasoning behind this decision?
> Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above?

It's a fairly significant improvement for a small change.
January 23, 2007
Walter Bright wrote:
> Frits van Bommel wrote:
>> Walter Bright wrote:
>>> Paul Findlay wrote:
>>>> Walter Bright wrote:
>>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>>
>>> No.
>>
>> So what exactly *does* it mean?
>> Is it, as Pragma put it, a "bool containsPointers" per block?
>> Or is it mostly-precise?
>>
>> "Not exact/precise" still leaves a pretty big range...
> 
> All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers.

Just shooting from the hip here: one possibility for such a list that won't eat much space would be having each bit of a word (or dword) represent the presence of a pointer at that 4-byte offset.  The last bit set could just signify "this is a big object, conservatively scan beyond bits*4 bytes".  This would take advantage of the fact that most objects typically aren't very big.

Example (16 bit list):
0000 0000 0000 0000 - zero = no pointers present
0000 0000 0000 1001 - pointers at offsets 0 and 12
1111 1111 1111 1111 - max = I'm all pointers
1000 0000 1000 0001 - pointers at offsets 0 and 28, and force scan at offsets 60 through end of block.

Another option would be to make the resolution of the high-order bits more logarithmic, making it behave something like a bloom filter.

> 
>> And if it's just a bool, what was the reasoning behind this decision?
>> Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above?
> 
> It's a fairly significant improvement for a small change.


-- 
- EricAnderton at yahoo
January 24, 2007
Pragma wrote:
> Walter Bright wrote:
>> Frits van Bommel wrote:
>>> Walter Bright wrote:
>>>> Paul Findlay wrote:
>>>>> Walter Bright wrote:
>>>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>>>
>>>> No.
>>>
>>> So what exactly *does* it mean?
>>> Is it, as Pragma put it, a "bool containsPointers" per block?
>>> Or is it mostly-precise?
>>>
>>> "Not exact/precise" still leaves a pretty big range...
>>
>> All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers.
> 
> Just shooting from the hip here: one possibility for such a list that won't eat much space would be having each bit of a word (or dword) represent the presence of a pointer at that 4-byte offset.  The last bit set could just signify "this is a big object, conservatively scan beyond bits*4 bytes".  This would take advantage of the fact that most objects typically aren't very big.

I think there should also be a way to annotate as "array, repeat". This would likely be useful since big objects are likely to be arrays.
This would require one bit for a flag and some way to note the element size, though.
If we go by the assumption that most structs/unions aren't very big, that could perhaps be implemented as a byte total: 1 bit = flag, 7 bits = size - 1 (since 0-length elements aren't useful anyway :) ) would allow elements up to 128 bytes, which should cover most arrays.

If we want no more than 4 bytes of meta-info[1], that would leave 23 bits for pointer flags and one "big object/element" flag. Objects up to 95 bytes could then be specified exactly (23 * 4 + 3 bytes, the last 3 of which can't contain a pointer).

And yes, that's the optimum with these parameters; moving 1 bit from size to pointer flags leaves the size at a max of 64, which is under 95.

For 8 bytes of meta-info[2], the optimum would be
8 bits of elt/object size
1 bit array flag
1 bit big object flag
54 bits pointer flags
max exactly-specified size: 219 (54 * 4 + 3)


[1]: That's the minimum alignment for anything containing pointers on 32 bit systems. This would allow it to be put at the start of the block without padding for align(4) contents.
[2]: Minimum align of anything containing ptrs on 64 bit systems (presumably). Can of course also be used on 32 bit systems; IIRC doubles and structs containing them already have align(8) as default.


[...]


Wait... Since size == 0 is useless, we can also use *that* as "big" 'flag'. That would mean:
32 bits: 1 + 7 + 24, max exact size: 99
64 bits: 1 + 8 + 55, max exact size: 223

And if the array bit is 0, you can use the size bits as extra pointer flags (and one for the "big" flag), allowing normal objects up to 123/251 bytes for 32/64 bits, respectively.


</obsess over details>
January 24, 2007
Walter Bright wrote:
> Frits van Bommel wrote:
>> Walter Bright wrote:
>>> Paul Findlay wrote:
>>>> Walter Bright wrote:
>>>>> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
>>>> Does this mean the D garbage collector will be an exact (or precise) garbage collector?
>>>
>>> No.
>>
>> So what exactly *does* it mean?
>> Is it, as Pragma put it, a "bool containsPointers" per block?
>> Or is it mostly-precise?
>>
>> "Not exact/precise" still leaves a pretty big range...
> 
> All it means is that a bit gets set per block meaning if it might contain pointers or does not contain pointers. In the future, it might give a list of which offsets contain pointers.
> 
>> And if it's just a bool, what was the reasoning behind this decision?
>> Ease of implementation? The overhead of the metadata needed to get to mostly-precise? All of the above?
> 
> It's a fairly significant improvement for a small change.

Does this incremental change include the RTTI support needed to implement algorithms like CBGC (There are others which could use it)? Or are you leaving that for a later time?
January 24, 2007
Kyle Furlong wrote:
> Does this incremental change include the RTTI support needed to implement algorithms like CBGC (There are others which could use it)? Or are you leaving that for a later time?

It's most of the info, but not quite all of it. I plan to add the rest later.
January 25, 2007
I was hoping this would happen some day.  Thank you Walter!

Walter Bright wrote:
> To improve GC performance, we need to transition to a GC that is aware of the types of what it is allocating.
> 
> So problems can happen if the type of allocated memory is changed after it is allocated, via casting. The way to avoid this is to allocate as void[] memory that will be cast, as in:
> 
> struct Foo { ... };
> Foo[] f;
> 
> p = new void[100];
> f = cast(Foo[])p;    // ok
> byte[] b = cast(byte[])p;
> f = cast(Foo[])b;    // ok, GC still regards memory as void[]
> 
> rather than:
> 
> p = new byte[100];
> f = cast(Foo[])p;    // will likely eventually corrupt memory
> 
> The GC will regard void[] as a chunk of memory containing arbitrary data of arbitrary types, and so will treat it conservatively.
> 
> In general, regard memory allocated by the GC as anything other than void[] as being *strongly typed*.
1 2 3
Next ›   Last »