June 21, 2013 [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Hi,

I want to make precise garbage collection as presented at the D conference ready for inclusion into druntime. I have recently updated the branch at https://github.com/rainers/druntime/tree/gcx_precise to merge with Leandro's changes to the GC module layout.

Before creating a pull request, I'd like to hear opinions on whether this should be included, if other choices would be better and where it should be improved.

Precise garbage collection must be able to work with different memory areas, namely the heap, global/thread data in the binary and stack/registers per thread. I think we do not have a feasible solution for the latter pair, so let's focus on the former two.

1. Heap

In a nutshell, the implementation uses the RTInfo template to generate a bitmap of booleans that indicate whether the corresponding field in a struct/class might be a pointer or not. For built-in types, this information is predefined in object.d and the typeinfo.ti_* modules.

When memory is allocated, the TypeInfo for the allocated type is passed to the garbage collector, which copies the bitmap into memory alongside the pages allocated by the pool (another GCBits member with one bit per word).

During a collection, scanning the heap can then look up this information to detect false pointers and avoid keeping garbage alive.

There are a number of issues that should be discussed:

a. The compiler sometimes does not generate the RTInfo for a struct, but instead generates 0/1 into the respective m_RTInfo field, depending on whether this struct contains references or not. (As far as I can tell this happens if it is only the backend that needs the TypeInfo, e.g. when generating an array concatenation call.) That's why there are rtinfoNoPointers/rtinfoHasPointers enums also used by the TypeInfos for the builtin types.

I consider this behaviour a bug that should be fixed, as it also disallows other usages of the RTInfo.

There is also an issue with not creating RTInfo for associative array types, but there is an "easy" workaround.

b. There are already other applications of the RTInfo template, so there should probably be some way to combine multiple "generators". My idea is to let RTInfo!T point to some immutable(RTInfoData) struct that can then have multiple members for different generators. It'd be nice to change the return type of TypeInfo.rtInfo() from void* to immutable(RTInfoData)* to avoid casting.

c. The GC interface has to be extended to pass type information to the GC. I guess just passing the respective TypeInfo pointer is obvious and correct.

d. The alternative to using a pointer bitmap alongside the pool memory is to store the TypeInfo pointer instead. Its major advantage is that it adds minimal performance overhead to the allocation.

d1. This needs more memory for small allocations, but less for larger. If it is stored in the same memory as the allocation itself, it should be at the end to avoid alignment issues (at the beginning it always adds 16 bytes). This would mostly reuse unused memory due to the alignment of allocations to a power of 2. The worst effect would be for allocations of just below or equal to a power of 2. We could mitigate that effect by allowing other sizes of allocations as well, or by not reserving space for the TypeInfo pointer if the block is allocated NOSCAN.

d2. Both dynamic and associative arrays currently allocate memory chunks and use them in a "non-standard" way that cannot be described by a simple TypeInfo. For example, dynamic arrays keep track of the allocated size of the array by placing it at the very end of allocations < 4k, but at the beginning for allocations >= 4k, moving the data to an offset of 16 in the latter case. Associative arrays combine hash-list-node, key and value into a single allocation, sometimes even with the value type unavailable.

My implementation solves these issues by "emplacing" the appropriate type information at the given address, assuming pointers if no type information is available (using the new gc_emplace function).

If only a TypeInfo pointer is available, I'm not sure how this can be solved without _allocating_ a new TypeInfo. My best guess is that it could be done if a generic array scan function could be called.

d3. This leads to the idea to generate a scanning function for each type instead of the pointer bitmap. Depending on the sparseness of pointers this can be shorter or can create a lot of code bloat. As a compromise a generic version might use the pointer bitmap, but it can be overloaded to implement arrays or even unions. On the downside, it makes using std.conv.emplace much more complicated if you want precise scanning.

e. Currently, there is only one TypeInfo_Class for both a class instance and a class reference. There are currently assumptions made about which one is meant depending on context (if used as a "root" it is usually an instance, when following TypeInfo.next it is assumed to be a reference). This no longer works reliably when combined with "emplace". I think we need to add TypeInfo_Reference to describe the field that is a pointer to a class instance. To be honest I have no idea how much code might break by adding this indirection when traversing type information.

f. Currently, the precise GC can be versioned in/out, but I think making it a runtime option is preferable. A problem here is that the user has no way to change the default configuration before the GC is initialized.

I could imagine always starting with the precise GC, but the user can opt out anytime. The GC could then release any additional memory it has allocated for precise scanning. If the user opts back in, the data structures are rebuilt assuming everything allocated so far as void[].

Another option might be to make it a link time option, but that would mean the "standard" gc could not be part of the runtime library to be exchangeable.

This text is already too long, so I postpone discussing the global/tls data section until later.

Summing up the questions raised:

1. Assuming RTInfo generation is fixed, should we go for adding an indirection through RTInfoData to allow multiple "generators"?

2. Are you ok with extending the gc_* functions with TypeInfo parameters where appropriate?

3. Do you think the pointer bitmap aside the pool memory and adding gc_emplace is ok? Or should we investigate other alternatives first?

4. Would you object to adding TypeInfo_Reference?

5. How do you prefer enabling/disabling precise garbage collection? Versioning/linking/runtime?

Best,
Rainer

_______________________________________________
D-runtime mailing list
D-runtime@puremagic.com
http://lists.puremagic.com/mailman/listinfo/d-runtime |
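To make the pointer-bitmap idea from the post above concrete, here is a minimal sketch in D. It is not the RTInfo template or the GCBits code from the branch: `pointerBitmap`, `referencedWords` and `Node` are hypothetical names chosen for illustration, and the sketch only handles raw pointers and class references in types small enough to fit their bitmap into a single `size_t`.

```d
import std.stdio : writefln;
import std.traits : isPointer;

/// Example type: every field is exactly one machine word wide.
struct Node
{
    size_t key;    // word 0: integer, never a reference
    Node*  next;   // word 1: pointer
    size_t count;  // word 2: integer
    Object owner;  // word 3: class reference
}

/// Hypothetical helper (not druntime's RTInfo): bit i is set if word i of T
/// may hold a reference. Nested structs, arrays and delegates are ignored.
size_t pointerBitmap(T)()
{
    static assert(T.sizeof / size_t.sizeof <= size_t.sizeof * 8,
                  "sketch keeps the whole bitmap in one size_t");
    size_t bits = 0;
    foreach (i, F; typeof(T.tupleof))
    {
        static if (isPointer!F || is(F == class))
            bits |= (cast(size_t) 1) << (T.tupleof[i].offsetof / size_t.sizeof);
    }
    return bits;
}

/// Returns the words of `block` that a precise scanner would follow:
/// only those whose bit is set in `bitmap`.
size_t[] referencedWords(const(size_t)[] block, size_t bitmap)
{
    assert(block.length <= size_t.sizeof * 8);
    size_t[] found;
    foreach (i, word; block)
        if ((bitmap >> i) & 1)
            found ~= word;
    return found;
}

void main()
{
    enum bits = pointerBitmap!Node();
    static assert(bits == 0b1010); // words 1 (next) and 3 (owner)

    int target;
    Node n;
    n.key = cast(size_t) &target; // an integer that merely looks like an address
    n.count = 42;

    auto words = (cast(const(size_t)*) &n)[0 .. Node.sizeof / size_t.sizeof];
    auto refs = referencedWords(words, bits);

    // A conservative scan would also consider `key`; the precise scan does not.
    assert(refs.length == 2);
    writefln("bitmap=%04b, words followed=%s", bits, refs.length);
}
```

The `key` field holds a value that a conservative collector would have to treat as a possible pointer; with the bitmap, only `next` and `owner` are examined.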
June 21, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rainer Schuetze | On Fri, Jun 21, 2013 at 08:36:23PM +0200, Rainer Schuetze wrote:

> Hi,
>
> I want to make precise garbage collection as presented at the D conference ready for inclusion into druntime. I have recently updated the branch at https://github.com/rainers/druntime/tree/gcx_precise to merge with Leandro's changes to the GC module layout.

Ouch. Sorry about that, but at least I'm glad I didn't have the time to get too far on integrating the concurrent collection. I will have to rewrite big parts of it, so it wouldn't make any sense to break all your work if it's already updated to master.

> Before creating a pull request, I'd like to hear opinions on whether this should be included, if other choices would be better and where it should be improved.

I'll read this e-mail in detail next week, and hopefully I'll take a look at the code too.

--
Leandro Lucarella
Senior R&D Developer
Sociomantic Labs GmbH <http://www.sociomantic.com> |
June 21, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rainer Schuetze | I agree that we should put precise garbage collection into the GC. At least as an option (the GC is swappable I think).

On Jun 21, 2013, at 2:36 PM, Rainer Schuetze <r.sagitario@gmx.de> wrote:

> There are a number of issues that should be discussed:
>
> d2. Both dynamic and associative arrays currently allocate memory chunks and use them in a "non-standard" way that cannot be described by a simple TypeInfo. For example, dynamic arrays keep track of the allocated size of the array by placing it at the very end of allocations < 4k, but at the beginning for allocations >= 4k, moving the data to an offset of 16 in the latter case. Associative arrays combine hash-list-node, key and value into a single allocation, sometimes even with the value type unavailable.

The location and size of the dynamic array "allocated size" field are predictable, and whether the field exists can be determined from the APPENDABLE bit. You should be able to correctly ignore it during a collection. Or am I misunderstanding the reason you posted that?

-Steve |
June 21, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Leandro Lucarella | On 21.06.2013 21:37, Leandro Lucarella wrote:

> On Fri, Jun 21, 2013 at 08:36:23PM +0200, Rainer Schuetze wrote:
>> Hi,
>>
>> I want to make precise garbage collection as presented at the D conference ready for inclusion into druntime. I have recently updated the branch at https://github.com/rainers/druntime/tree/gcx_precise to merge with Leandro's changes to the GC module layout.
>
> Ouch. Sorry about that, but at least I'm glad I didn't have the time to get too far on integrating the concurrent collection, I will have to rewrite big parts of it so it wouldn't make any sense to break all your work if it's already updated to master.

No big deal, it had to hit one of us. Let's see who gets there first next time. ;-) |
June 21, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On 21.06.2013 22:22, Steven Schveighoffer wrote:

>> d2. Both dynamic and associative arrays currently allocate memory chunks and use them in a "non-standard" way that cannot be described by a simple TypeInfo. For example, dynamic arrays keep track of the allocated size of the array by placing it at the very end of allocations < 4k, but at the beginning for allocations >= 4k, moving the data to an offset of 16 in the latter case. Associative arrays combine hash-list-node, key and value into a single allocation, sometimes even with the value type unavailable.
>
> The dynamic array "allocated size" location and size is predictable, and can be determined whether it exists based on the APPENDABLE bit. You should be able to correctly ignore it during a collection. Or am I misunderstanding the reason you posted that?

The size value itself is only a small issue. The larger one is that the address of the array data moves depending on the size of the allocation, so the pointer info sometimes needs to be placed at some offset. My first implementation actually figured this out in the GC, but I think this leaks too many implementation details of the array into the GC. So I changed it to use gc_emplace instead. |
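As an illustration of what "emplacing" pointer information at an offset could look like, here is a minimal sketch in D. The thread does not give the signature of gc_emplace, so `emplaceBits` is a hypothetical stand-in, and the pool bitmap is modelled as a plain `bool[]` with one entry per machine word rather than the packed GCBits structure.

```d
/// Hypothetical stand-in for the bookkeeping behind gc_emplace: copies the
/// per-word pointer flags of one element type into the block's bitmap,
/// starting `wordOffset` words into the block and repeating `count` times.
void emplaceBits(bool[] poolBits, size_t wordOffset,
                 const(bool)[] typeBits, size_t count)
{
    foreach (n; 0 .. count)
        foreach (i, b; typeBits)
            poolBits[wordOffset + n * typeBits.length + i] = b;
}

void main()
{
    // Element layout { size_t value; void* link; }: only word 1 is a pointer.
    const bool[] elemBits = [false, true];

    // Bitmap for an 8-word block, initially all clear.
    auto poolBits = new bool[8];

    // Array data starts 2 words into the block and holds 3 elements.
    emplaceBits(poolBits, 2, elemBits, 3);

    // The two prefix words stay clear, so the scanner never follows them.
    assert(poolBits == [false, false, false, true, false, true, false, true]);
}
```

Keeping the offset as a parameter means the caller (here, the array runtime) decides where the data lives, which is the separation of concerns described in the post above.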
June 21, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rainer Schuetze | On Jun 21, 2013, at 5:37 PM, Rainer Schuetze <r.sagitario@gmx.de> wrote:

> The size value itself is only a small issue, the larger one is the address of the array data moves depending on the size of the allocation, so the pointer info needs to be placed at some offset sometimes. My first implementation actually figured this out in the GC, but I think this leaks too much implementation details of the array into the GC. So I changed it to use gc_emplace instead.

In case I didn't explain it well in the documentation/comments, the reason for this is that when you append to a PAGE or larger sized block, the GC can tack on additional pages and add more memory for free (without moving the existing data). If the "size" field were at the end, it would have to move to the new page.

This can be a problem if you have two threads looking at the block at the same time. One can get the block info, release the GC lock, then the other could extend the block. By the time the first thread comes back to look at the "end" of the block (which is checked while holding a different lock), the block info is no longer valid, and it would look at the wrong place. I think for unshared blocks, there would be no problem, but a block can become shared/unshared via a cast, and that would cause problems.

This could probably be done better, but that is the reasoning. I will note that it was really never a problem for things like classes because a class block would never be marked as APPENDABLE.

The choice of 16 bytes was recommended by "people who know" :) I initially thought 8 would be fine but was told that wasn't a good idea.

In any case, an abstraction like gc_emplace is a good idea. You do, however, have to ignore that "size" field at the front when scanning (I'm assuming you are doing that?)

-Steve |
June 22, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On 21.06.2013 23:57, Steven Schveighoffer wrote:

> On Jun 21, 2013, at 5:37 PM, Rainer Schuetze <r.sagitario@gmx.de> wrote:
>
>> The size value itself is only a small issue, the larger one is the address of the array data moves depending on the size of the allocation, so the pointer info needs to be placed at some offset sometimes. My first implementation actually figured this out in the GC, but I think this leaks too much implementation details of the array into the GC. So I changed it to use gc_emplace instead.
>
> In case I didn't explain it well in the documentation/comments, the reason for this is because when you append to a PAGE or larger sized block, the GC can tack on additional pages and get to add more memory for free (without moving the existing data). If the "size" field was at the end, then it would have to move to the new page.

I think I understand the reasoning for the design and do not mean to change it. It shows that there are cases where you want to have the liberty to place your data anywhere within the allocation. Having a single TypeInfo to describe that is too limiting.

> This can be a problem if you have two threads looking at the block at the same time. One can get the block info, release the GC lock, then the other could extend the block. By the time the first thread comes back to look at the "end" of the block (which is checked while holding a different lock), the block info is no longer valid, and it would look at the wrong place. I think for unshared blocks, there would be no problem, but block can become shared/unshared via a cast, and that would cause problems.

I was a bit surprised to find the special casing for shared/non-shared arrays in the array appending code, given the fact that the absence of "shared" does not guarantee it is not shared. I'm not sure whether we still have to guarantee memory safety in the case of undeclared sharing, but I'd be a bit more comfortable if we could (if it doesn't involve a global lock). I don't have a better solution, though.

> This could probably be done better, but that is the reasoning. I will note that it was really never a problem for things like classes because a class block would never be marked as APPENDABLE.
>
> The choice of 16 bytes was recommended by "people who know" :) I initially thought 8 would be fine but was told that wasn't a good idea.

Yeah, allocating an array of simd vectors very much requires an alignment of 16. And structs with alignment specifications are assumed to be allocated with that alignment as well.

> In any case, an abstraction like gc_emplace is a good idea. You do, however, have to ignore that "size" field at the front when scanning (I'm assuming you are doing that?)

Yes, emplacing just starts with the resulting offset. (I'll have to double check whether the initial 2 or 4 pointer bits are actually cleared, though.) |
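The array block layout discussed in this exchange, and the "initial 2 or 4 pointer bits" mentioned at the end, can be sketched as follows. This is only an illustration under stated assumptions: `pageSize`, `largeDataOffset`, `arrayDataOffset` and `prefixBits` are hypothetical names, and the 4096-byte threshold is taken from the "4k" figure in the thread rather than from the druntime sources.

```d
import std.stdio : writeln;

enum size_t pageSize = 4096;      // assumption: the "4k" threshold from the thread
enum size_t largeDataOffset = 16; // data offset for page-sized blocks, per the thread

/// Byte offset at which array elements start inside a block of `blockSize` bytes.
/// Small blocks keep their bookkeeping at the end, so data starts at 0.
size_t arrayDataOffset(size_t blockSize)
{
    return blockSize < pageSize ? 0 : largeDataOffset;
}

/// Number of leading pointer-bitmap bits that cover the prefix and must stay
/// clear, at one bit per machine word.
size_t prefixBits(size_t blockSize, size_t wordSize = size_t.sizeof)
{
    return arrayDataOffset(blockSize) / wordSize;
}

// Small blocks have no prefix; large ones reserve 16 bytes = 2 or 4 words.
static assert(prefixBits(64) == 0);
static assert(prefixBits(8192, 8) == 2); // 64-bit words
static assert(prefixBits(8192, 4) == 4); // 32-bit words

void main()
{
    writeln("leading bits to clear on this platform: ", prefixBits(8192));
}
```

With 16 bytes reserved, the prefix spans two words on a 64-bit target and four on a 32-bit target, which matches the "2 or 4" in the post above.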
June 21, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rainer Schuetze | On 6/21/2013 11:49 PM, Rainer Schuetze wrote:

>> The choice of 16 bytes was recommended by "people who know" :) I initially thought 8 would be fine but was told that wasn't a good idea.
>
> Yeah, allocating an array of simd vectors very much require an alignment of 16. And structs with alignment specificatons assume to be allocated with that alignment aswell.

16 byte alignment is critical. |
June 22, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rainer Schuetze | On 6/21/2013 11:36 AM, Rainer Schuetze wrote:

> Hi,
>
> I want to make precise garbage collection as presented at the D conference ready for inclusion into druntime. I have recently updated the branch at https://github.com/rainers/druntime/tree/gcx_precise to merge with Leandro's changes to the GC module layout.

Yes.

> Before creating a pull request, I'd like to hear opinions on whether this should be included, if other choices would be better and where it should be improved.
>
> Precise garbage collection must be able to work with different memory areas, namely the heap, global/thread data in the binary and stack/registers per thread. I think we do not have a feasible solution for the latter pair, so let's focus on the former two.

I believe we can successfully ignore precise scanning of the stack.

> 1. Heap
>
> In a nutshell, the implementation uses the RTInfo template to generate a bitmap of booleans that indicate whether the corresponding field in a struct/class might be a pointer or not. For built-in types, this information is predefined in object.d and the typeinfo.ti_* modules.
> When memory is allocated, the TypeInfo for the allocated type is passed to the garbage collector, which copies the bitmap into memory alongside the pages allocated by the pool (another GCBits member with one bit per word).

I assume you mean 1 bit per (void*).sizeof bytes.

> During a collection, scanning the heap can then look up this information to detect false pointers and avoid keeping garbage alive.
>
> There are a number of issues that should be discussed:
>
> a. The compiler sometimes does not generate the RTInfo for a struct, but instead generates 0/1 into the respective m_RTInfo field, depending on whether this struct contains references or not. (As far as I can tell this happens if it is only the backend that needs the TypeInfo, e.g. when generating an array concatenation call.) That's why there are rtinfoNoPointers/rtinfoHasPointers enums also used by the TypeInfos for the builtin types.
>
> I consider this behaviour a bug that should be fixed, as it also disallows other usages of the RTInfo.

I'm pretty sure it's a bug, too.

> There is also an issue with not creating RTInfo for associative array types, but there is an "easy" workaround.
>
> b. There are already other applications of the RTInfo template, so there should probably be some way to combine multiple "generators". My idea is to let RTInfo!T point to some immutable(RTInfoData) struct that can then have multiple members for different generators. It'd be nice to change the return type of TypeInfo.rtInfo() from void* to immutable(RTInfoData)* to avoid casting.

I don't think the casting is a problem, and I like the self-documenting aspect of void* making clear that the compiler doesn't know or care what that data actually is.

> c. The GC interface has to be extended to pass type information to the GC. I guess just passing the respective TypeInfo pointer is obvious and correct.

Yup.

> d. The alternative to using a pointer bitmap alongside the pool memory is to store the TypeInfo pointer instead. Its major advantage is that it adds minimal performance overhead to the allocation.

But it'll subtract performance from the scanner. Which is better can only be determined with testing.

> d1. This needs more memory for small allocations, but less for larger.

Storing the bitmap is a fixed 128/64 bytes for a page for 32/64 bit pointers. Notably for 64 bit pointers, it's only one pointer size. I think copying the bitmap is a net win, at least on size.

> If it is stored in the same memory as the allocation itself, it should be at the end to avoid alignment issues (at the beginning it always adds 16 bytes). This would mostly reuse unused memory due to the alignment of allocations to a power of 2. The worst effect would be for allocations of just below or equal to a power of 2. We could mitigate that effect by allowing other sizes of allocations as well, or by not reserving space for the TypeInfo pointer if the block is allocated NOSCAN.

I think it should be stored separately. Storing it with the allocation is inefficient, as (for example) only 2 bits are needed for a 16 byte alloc. Also, storing it with the allocation precludes "extending" an allocation in place if the next chunk is free.

> d2. Both dynamic and associative arrays currently allocate memory chunks and use them in a "non-standard" way that cannot be described by a simple TypeInfo. For example, dynamic arrays keep track of the allocated size of the array by placing it at the very end of allocations < 4k, but at the beginning for allocations >= 4k, moving the data to an offset of 16 in the latter case. Associative arrays combine hash-list-node, key and value into a single allocation, sometimes even with the value type unavailable.
>
> My implementation solves these issues by "emplacing" the appropriate type information at the given address, assuming pointers if no type information is available (using the new gc_emplace function).
>
> If only a TypeInfo pointer is available, I'm not sure how this can be solved without _allocating_ a new TypeInfo. My best guess is that it could be done if a generic array scan function could be called.
>
> d3. This leads to the idea to generate a scanning function for each type instead of the pointer bitmap. Depending on the sparseness of pointers this can be shorter or can create a lot of code bloat. As a compromise a generic version might use the pointer bitmap, but it can be overloaded to implement arrays or even unions. On the downside, it makes using std.conv.emplace much more complicated if you want precise scanning.

I spent some time thinking about this a while back. While it is a very attractive idea, I suspected the performance would be terrible as it would require two or more indirect jumps per chunk. The scanning code all needs to be present, in the cache, and predictable for high speed scanning.

> e. Currently, there is only one TypeInfo_Class for both a class instance and a class reference. There are currently assumptions made which one is meant depending on context (if used as a "root" it is usually an instance, when following TypeInfo.next it is assumed a reference). This no longer works reliably when combined with "emplace". I think we need to add TypeInfo_Reference to describe the field that is a pointer to a class instance. To be honest I have no idea how much code might break by adding this indirection when traversing type information.
>
> f. Currently, the precise GC can be versioned in/out, but I think making it a runtime option is preferable. A problem here is that the user has no way to change the default configuration before the GC is initialized.
> I could imagine always starting with the precise GC, but the user can opt out anytime. The GC could then release any additional memory it has allocated for precise scanning. If the user opts back in, the data structures are rebuilt assuming everything allocated so far as void[].

I don't think a runtime option is practical. It would be so "expert only" that those experts should be able to rebuild the library as required.

> Another option might be to make it a link time option, but that would mean the "standard" gc could not be part of the runtime library to be exchangeable.

Users really only want one gc.

> This text is already too long, so I postpone discussing the global/tls data section until later.
>
> Summing up the questions raised:
>
> 1. Assuming RTInfo generation is fixed, should we go for adding an indirection through RTInfoData to allow multiple "generators"?

No. Indirection makes it slower.

> 2. Are you ok with extending the gc_* functions with TypeInfo parameters where appropriate?

Yes.

> 3. Do you think the pointer bitmap aside the pool memory and adding gc_emplace is ok? Or should we investigate other alternatives first?
>
> 4. Would you object to adding TypeInfo_Reference?

Don't have a good answer for that.

> 5. How do you prefer enabling/disabling precise garbage collection? Versioning/linking/runtime?

We should pick precise collection and commit to it. And then add Leandro's concurrent collector on top! |
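The size trade-off raised under d1 (bitmap versus stored TypeInfo pointer) can be worked through with a little arithmetic. The sketch below is an illustration only: `bitmapBits` and `breakEvenBlockSize` are hypothetical names, and the comparison assumes one bitmap bit per machine word and exactly one pointer of per-allocation overhead for the alternative, ignoring any padding or alignment effects.

```d
/// Bits of pointer bitmap needed for a block: one bit per machine word.
size_t bitmapBits(size_t blockSize, size_t wordSize)
{
    return blockSize / wordSize;
}

/// Block size at which the bitmap costs exactly as many bits as one stored
/// pointer (wordSize * 8 bits). Smaller blocks favour the bitmap.
size_t breakEvenBlockSize(size_t wordSize)
{
    return wordSize * 8 * wordSize;
}

// The 16-byte example from the post above, assuming 64-bit words.
static assert(bitmapBits(16, 8) == 2);

// Below these sizes the bitmap is smaller than a stored pointer.
static assert(breakEvenBlockSize(8) == 512); // 64-bit
static assert(breakEvenBlockSize(4) == 128); // 32-bit

void main() {}
```

Under these assumptions a stored pointer only becomes the more compact choice for blocks larger than 512 bytes on a 64-bit target, which is consistent with "more memory for small allocations, but less for larger".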
June 22, 2013 Re: [D-runtime] Precise garbage collection | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 22.06.2013 09:11, Walter Bright wrote:

> On 6/21/2013 11:36 AM, Rainer Schuetze wrote:
>> In a nutshell, the implementation uses the RTInfo template to generate a bitmap of booleans that indicate whether the corresponding field in a struct/class might be a pointer or not. For built-in types, this information is predefined in object.d and the typeinfo.ti_* modules.
>> When memory is allocated, the TypeInfo for the allocated type is passed to the garbage collector, which copies the bitmap into memory alongside the pages allocated by the pool (another GCBits member with one bit per word).
>
> I assume you mean 1 bit per (void*).sizeof bytes.

Yes, I meant the "natural machine word" aka size_t.

>> I consider this behaviour a bug that should be fixed, as it also disallows other usages of the RTInfo.
>
> I'm pretty sure it's a bug, too.

I have filed http://d.puremagic.com/issues/show_bug.cgi?id=10442

>> There is also an issue with not creating RTInfo for associative array types, but there is an "easy" workaround.

This seems to happen only with my patch to add more debug info and might be related to the bug above.

>> b. There are already other applications of the RTInfo template, so there should probably be some way to combine multiple "generators". My idea is to let RTInfo!T point to some immutable(RTInfoData) struct that can then have multiple members for different generators. It'd be nice to change the return type of TypeInfo.rtInfo() from void* to immutable(RTInfoData)* to avoid casting.
>
> I don't think the casting is a problem, and I like the self-documenting aspect of void* making clear that the compiler doesn't know or care what that data actually is.

ok.

> [...]
>>
>> d. The alternative to using a pointer bitmap alongside the pool memory is to store the TypeInfo pointer instead. Its major advantage is that it adds minimal performance overhead to the allocation.
>
> But it'll subtract performance from the scanner. Which is better can only be determined with testing.
>
>> d1. This needs more memory for small allocations, but less for larger.
>
> Storing the bitmap is a fixed 128/64 bytes for a page for 32/64 bit pointers. Notably for 64 bit pointers, it's only one pointer size. I think copying the bitmap is a net win, at least on size.

Don't want to split hairs, but it is the size of 8 64-bit pointers for a page. Still, it fits into a cache line on x86 processors. I think there is also quite a bit of optimization potential regarding scanning this bitmap in combination with the mark and scan bitmaps. All of these are mostly evaluated bit by bit.

>> If it is stored in the same memory as the allocation itself, it should be at the end to avoid alignment issues (at the beginning it always adds 16 bytes). This would mostly reuse unused memory due to the alignment of allocations to a power of 2. The worst effect would be for allocations of just below or equal to a power of 2. We could mitigate that effect by allowing other sizes of allocations as well, or by not reserving space for the TypeInfo pointer if the block is allocated NOSCAN.
>
> I think it should be stored separately. Storing it with the allocation is inefficient, as (for example) only 2 bits are needed for a 16 byte alloc. Also, storing it with the allocation precludes "extending" an allocation in place if the next chunk is free.

I did not mean the bitmap here, but the alternative TypeInfo pointer. But you are right, extending becomes more complicated. Currently only page-sized or larger allocations are extendable, so if these allocations have the TypeInfo pointer at the beginning, that should work and would blend pretty well with the current array implementation (storing the length there as well).

[...]

>> d3. This leads to the idea to generate a scanning function for each type instead of the pointer bitmap. Depending on the sparseness of pointers this can be shorter or can create a lot of code bloat. As a compromise a generic version might use the pointer bitmap, but it can be overloaded to implement arrays or even unions. On the downside, it makes using std.conv.emplace much more complicated if you want precise scanning.
>
> I spent some time thinking about this a while back. While it is a very attractive idea, I suspected the performance would be terrible as it would require two or more indirect jumps per chunk. The scanning code all needs to be present, in the cache, and predictable for high speed scanning.

For data with sparse pointers the generated code might be more efficient, but I agree, in the general case, a double indirection for every small allocation could be expensive.

[...]

>> Summing up the questions raised:
>>
>> 1. Assuming RTInfo generation is fixed, should we go for adding an indirection through RTInfoData to allow multiple "generators"?
>
> No. Indirection makes it slower.

When using the copy-bitmap approach, this indirection is done only once when allocating, not during scanning. I think this is a reasonable price for extensibility. The default implementation might even make RTInfoData identical to the bitmap through some aliasing.

>> 2. Are you ok with extending the gc_* functions with TypeInfo parameters where appropriate?
>
> Yes.
>
>> 3. Do you think the pointer bitmap aside the pool memory and adding gc_emplace is ok? Or should we investigate other alternatives first?
>>
>> 4. Would you object to adding TypeInfo_Reference?
>
> Don't have a good answer for that.
>
>> 5. How do you prefer enabling/disabling precise garbage collection? Versioning/linking/runtime?
>
> We should pick precise collection and commit to it. And then add Leandro's concurrent collector on top!

Thanks for the feedback. |
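The per-page bitmap sizes debated in the last two posts follow from simple arithmetic, sketched here in D. The 4096-byte page size is an assumption (it is what the 128/64-byte figures imply), and `bitmapBytesPerPage` is a hypothetical helper, not a druntime function.

```d
enum size_t pageSize = 4096; // assumption: implied by the 128/64-byte figures

/// Bytes of pointer bitmap needed to cover one page at one bit per machine word.
size_t bitmapBytesPerPage(size_t wordSize)
{
    return pageSize / wordSize / 8;
}

static assert(bitmapBytesPerPage(4) == 128);   // 32-bit: 1024 words, 1024 bits
static assert(bitmapBytesPerPage(8) == 64);    // 64-bit: 512 words, 512 bits
static assert(bitmapBytesPerPage(8) / 8 == 8); // 64 bytes = eight 64-bit pointers

void main() {}
```

So on a 64-bit target the bitmap for a page occupies 64 bytes, i.e. eight pointers rather than one, which is the correction made in the reply above and is also the size of a typical x86 cache line.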
Copyright © 1999-2021 by the D Language Foundation