Jump to page: 1 2
Thread overview
Moving GC
Dec 12, 2008
dsimcha
Dec 12, 2008
Sean Kelly
Dec 13, 2008
Don
Dec 12, 2008
Sergey Gromov
Dec 12, 2008
dsimcha
Dec 12, 2008
Christopher Wright
Dec 13, 2008
Sean Kelly
Dec 13, 2008
dsimcha
Dec 13, 2008
Christopher Wright
Dec 13, 2008
dsimcha
Dec 13, 2008
Fawzi Mohamed
Dec 13, 2008
Christopher Wright
Dec 13, 2008
Brad Roberts
Dec 25, 2008
Stewart Gordon
December 12, 2008
I've been playing around with custom memory allocation schemes for some programs, which are based on regions.  These work great, and are significantly faster for the use cases I'm using them for, than the general allocation scheme.  However, everything would break if it were used with reference types and a moving GC because, as far as the GC knows, the regions I'm allocating from it are untyped.

What are the odds that some D implementation gets a moving GC in the foreseeable future?  Are they significant enough that I should avoid relying on the GC being non-moving, or is worrying about such things just overdoing trying to be future-proof?
December 12, 2008
dsimcha wrote:
> 
> What are the odds that some D implementation gets a moving GC in the
> foreseeable future?

Minuscule.  The GC isn't provided nearly enough type information to safely move memory right now, and I don't see that changing any time soon.


Sean
December 12, 2008
Fri, 12 Dec 2008 15:05:46 +0000 (UTC), dsimcha wrote:

> I've been playing around with custom memory allocation schemes for some programs, which are based on regions.  These work great, and are significantly faster for the use cases I'm using them for, than the general allocation scheme.  However, everything would break if it were used with reference types and a moving GC because, as far as the GC knows, the regions I'm allocating from it are untyped.
> 
> What are the odds that some D implementation gets a moving GC in the foreseeable future?  Are they significant enough that I should avoid relying on the GC being non-moving, or is worrying about such things just overdoing trying to be future-proof?

Is this at all possible for a conservative GC to be moving?  It's one thing when some unused memory is kept allocated due to false pointers, and another when some data resembling pointers changes unpredictably.

And GC is to stay conservative as long as unions are supported.
December 12, 2008
== Quote from Sergey Gromov (snake.scaly@gmail.com)'s article
> Fri, 12 Dec 2008 15:05:46 +0000 (UTC), dsimcha wrote:
> > I've been playing around with custom memory allocation schemes for some programs, which are based on regions.  These work great, and are significantly faster for the use cases I'm using them for, than the general allocation scheme.  However, everything would break if it were used with reference types and a moving GC because, as far as the GC knows, the regions I'm allocating from it are untyped.
> >
> > What are the odds that some D implementation gets a moving GC in the foreseeable future?  Are they significant enough that I should avoid relying on the GC being non-moving, or is worrying about such things just overdoing trying to be future-proof?
> Is this at all possible for a conservative GC to be moving?  It's one thing when some unused memory is kept allocated due to false pointers, and another when some data resembling pointers changes unpredictably. And GC is to stay conservative as long as unions are supported.

There are different levels of conservatism, though.  For example, IIRC, the Mono GC scans the stack conservatively and pins all objects that have apparent pointers from the stack.  However, it scans the heap precisely and moves objects that only have pointers from the heap.  In theory (not saying it will or even should actually happen) D could implement a mostly precise GC that makes the conservative assumption only in the case of unions, and pins objects that may be pointed to by pointers contained in unions.

However, after thinking about it, I doubt D will go the route of moving/more precise GC.  I know I've advocated it in the past, but I've realized that false pointers aren't generally that big a deal if you delete a few huge (>1MB) objects manually.  Furthermore, I think it's necessary in a systems programming language to be able to allocate a big chunk of untyped memory.  Technically, this could still be done with a moving GC by using the C heap, but it would make things really, really complicated.
December 12, 2008
dsimcha wrote:
> However, after thinking about it, I doubt D will go the route of moving/more
> precise GC.  I know I've advocated it in the past, but I've realized that false
> pointers aren't generally that big a deal if you delete a few huge (>1MB) objects
> manually.  Furthermore, I think it's necessary in a systems programming language
> to be able to allocate a big chunk of untyped memory.  Technically, this could
> still be done with a moving GC by using the C heap, but it would make things
> really, really complicated.

You'd have to do more bookkeeping:
- this memory is referenced by something that I know is a pointer
- this memory is referenced by something that might be a pointer (or some random bit pattern)
- this memory is not referenced

This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.
December 13, 2008
== Quote from Christopher Wright (dhasenan@gmail.com)'s article
>
> This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.

This would incur more memory overhead as well.  Right now, we make do with one bit per block that indicates whether the block should be scanned for pointers, while this may theoretically require a pointer to TypeInfo per block in order to perform precise scanning.


Sean
December 13, 2008
== Quote from Sean Kelly (sean@invisibleduck.org)'s article
> == Quote from Christopher Wright (dhasenan@gmail.com)'s article
> >
> > This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.
> This would incur more memory overhead as well.  Right now, we make
> do with one bit per block that indicates whether the block should be
> scanned for pointers, while this may theoretically require a pointer
> to TypeInfo per block in order to perform precise scanning.
> Sean

Under what use cases would this extra few bytes really matter?  What kinds of programs allocate so many objects that this overhead would be significant in practice?

Note that I'm not advocating a moving GC, because in a systems language like D, where working with pointers and untyped memory blocks is supposed to be feasible, I think it would create more problems/limitations than it solves.  For example, now it's ok to store a pointer to GC allocated memory in an area not scanned by the GC, such as the C heap or a memory block marked as NO_SCAN, as long as you also have another pointer to it somewhere else.  With a moving GC, this would become problematic.

However, a more precise non-moving GC (scans the whole heap precisely, only imprecise on the stack) would be nice if it can be reasonably implemented.
December 13, 2008
dsimcha wrote:
> == Quote from Sean Kelly (sean@invisibleduck.org)'s article
>> == Quote from Christopher Wright (dhasenan@gmail.com)'s article
>>> This isn't unreasonable, and it really just depends on how complete and
>>> how fast D's runtime reflection is.
>> This would incur more memory overhead as well.  Right now, we make
>> do with one bit per block that indicates whether the block should be
>> scanned for pointers, while this may theoretically require a pointer
>> to TypeInfo per block in order to perform precise scanning.
>> Sean
> 
> Under what use cases would this extra few bytes really matter?  What kinds of
> programs allocate so many objects that this overhead would be significant in practice?

It means that you can't use one block for objects of multiple types. That's a limitation on how you implement a GC, and it could possibly increase your overhead a lot more than converting a bit to a word. (Unless you use some tricks, that bit will take at least one byte, maybe more depending on alignment.)
December 13, 2008
== Quote from Christopher Wright (dhasenan@gmail.com)'s article
> It means that you can't use one block for objects of multiple types.

Sure you can, without being any worse off than under the current scheme.  Mark the
contents of the memory as an array of void*s.  This would have the same effect as
marking it for GC scanning under the current scheme, basically making the GC scan
the block conservatively.  Furthermore, if you're storing value types or pointers
to reference types that also have a pointer stored in a GC-scanned block, set the
typeinfo to byte or something, and you have the equivalent of setting the NO_SCAN bit.

December 13, 2008
Christopher Wright wrote:
> dsimcha wrote:
>> However, after thinking about it, I doubt D will go the route of
>> moving/more
>> precise GC.  I know I've advocated it in the past, but I've realized
>> that false
>> pointers aren't generally that big a deal if you delete a few huge
>> (>1MB) objects
>> manually.  Furthermore, I think it's necessary in a systems
>> programming language
>> to be able to allocate a big chunk of untyped memory.  Technically,
>> this could
>> still be done with a moving GC by using the C heap, but it would make
>> things
>> really, really complicated.
> 
> You'd have to do more bookkeeping:
> - this memory is referenced by something that I know is a pointer
> - this memory is referenced by something that might be a pointer (or
> some random bit pattern)
> - this memory is not referenced
> 
> This isn't unreasonable, and it really just depends on how complete and how fast D's runtime reflection is.

Maybe isn't good enough. You have to definitively be able to identify every pointer to an object to be able to move the object since every pointer must be adjusted.  The only alternative to that is double indirection for movable objects.  With double indirection you point to a pointer that the GC owns and can update when the objects are moved.

There's HUGE benefits to a moving gc.  The gc in java 6 is _extremely_ clever (and not all that hard to write either, at least most of it is.. there's a lot of details in the heuristics parts), but 100% relies on moving objects.

Later,
Brad
« First   ‹ Prev
1 2