November 01, 2009 [Issue 3463] New: Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
http://d.puremagic.com/issues/show_bug.cgi?id=3463 Summary: Integrate Precise Heap Scanning Into the GC Product: D Version: 2.035 Platform: Other OS/Version: Windows Status: NEW Keywords: patch Severity: enhancement Priority: P2 Component: druntime AssignedTo: sean@invisibleduck.org ReportedBy: dsimcha@yahoo.com --- Comment #0 from David Simcha <dsimcha@yahoo.com> 2009-11-01 10:45:26 PST --- Created an attachment (id=487) Patches to the GC I've created patches that allow for precise heap scanning in the GC by storing a pointer to pointer-offset information in the last (void*).sizeof bytes of each allocated memory block that is to be scanned. The attached patch modifies gcx.d to do this and fixes a few other minor issues in the runtime to make everything compatible. By default, if no bitmask is provided, a conservative bitmask is used to replicate the old behavior. The bitmask format is documented in bitmaskTempl.d, which also provides templates for generating the masks, some basic tests to make sure the precise heap scanning works, and prototypes of functions for creating precisely scanned arrays and class instances. |
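To make the scheme in comment #0 concrete, here is a minimal sketch in C, not the actual gcx.d patch. It assumes a hypothetical mask format (one bit per pointer-sized word, 1 meaning "may hold a pointer"); the real format is the one documented in bitmaskTempl.d and is not reproduced here. The names `PtrMask`, `set_mask`, `get_mask`, and `scan_block` are invented for illustration, and a NULL mask pointer stands in for the conservative default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mask: bit i set means word i of the block may hold a pointer. */
typedef struct {
    size_t nwords;        /* number of words the mask describes */
    const uint8_t *bits;  /* packed bits, least significant bit first */
} PtrMask;

/* Store the mask pointer in the last sizeof(void*) bytes of the block. */
void set_mask(void *block, size_t block_size, const PtrMask *mask)
{
    assert(block_size >= sizeof(void *));
    memcpy((char *)block + block_size - sizeof(void *), &mask, sizeof mask);
}

/* Read the mask pointer back from the tail of the block. */
const PtrMask *get_mask(const void *block, size_t block_size)
{
    const PtrMask *mask;
    assert(block_size >= sizeof(void *));
    memcpy(&mask, (const char *)block + block_size - sizeof(void *), sizeof mask);
    return mask;
}

/* Collect candidate pointers from the block into out[] and return the count.
 * With a NULL mask every word is a candidate (assumed conservative fallback);
 * otherwise only words flagged in the mask are examined. */
size_t scan_block(const void *block, size_t block_size,
                  void **out, size_t out_cap)
{
    assert(block_size >= sizeof(void *) && block_size % sizeof(void *) == 0);
    const PtrMask *mask = get_mask(block, block_size);
    size_t words = block_size / sizeof(void *) - 1; /* last word holds the mask pointer */
    size_t found = 0;

    for (size_t i = 0; i < words && found < out_cap; i++) {
        int flagged = (mask == NULL) ||
                      (i < mask->nwords && ((mask->bits[i / 8] >> (i % 8)) & 1u));
        if (!flagged)
            continue;
        void *candidate;
        memcpy(&candidate, (const char *)block + i * sizeof(void *), sizeof candidate);
        if (candidate != NULL)
            out[found++] = candidate;
    }
    return found;
}
```

With a precise mask, an integer field that happens to look like an address is skipped; with the NULL fallback it is reported as a candidate, which is the false-pointer behavior precise scanning is meant to avoid.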
November 01, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 --- Comment #1 from David Simcha <dsimcha@yahoo.com> 2009-11-01 10:46:03 PST --- Created an attachment (id=488) Templates to generate bit masks, documentation of format. |
November 01, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 David Simcha <dsimcha@yahoo.com> changed: Attachment #487 is obsolete: 0 -> 1 --- Comment #2 from David Simcha <dsimcha@yahoo.com> 2009-11-01 10:48:54 PST --- Created an attachment (id=489) Correct patch. Accidentally attached the wrong one. |
November 01, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 Leandro Lucarella <llucax@gmail.com> changed: CC: added llucax@gmail.com --- Comment #3 from Leandro Lucarella <llucax@gmail.com> 2009-11-01 11:38:16 PST --- The patch looks nice. I have some questions: * Why did you choose to store the bitmask after the SENTINEL_POST and not before? I think that storing the bitmask before the SENTINEL could let you detect a corrupted bitmask when version SENTINEL is compiled. * In the bitMaskMixin string mixin you have a nested function setBitMask() that's used only once. I wonder if you reused that function before or if you put that code in a nested function just because you think it's clearer that way. It kind of confused me at first. * Why is the bitMaskMixin a mixin and not just a plain function? I can't see any reason to make it a string mixin; am I missing something? I find this very confusing, and it makes the code harder to follow, since some variables appear from nowhere. Thanks for the good work. |
November 01, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 --- Comment #4 from David Simcha <dsimcha@yahoo.com> 2009-11-01 11:49:58 PST --- 1. I chose to store the bitmask after SENTINEL_POST so that none of the assumptions of the sentinel code (such as that the sentinel is immediately after the data) changes. 2. The fact that setBitMask() is a nested function is a minor holdover from when the design was a little different. If anyone really hates it a lot, it can be refactored. 3. The mixin is because I needed a lot of the same logic in realloc() and extend() and it was complicated enough that I felt it was the lesser of two evils to use a mixin, even with the "variables appearing out of nowhere" magic, rather than duplicate that logic. |
November 01, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 --- Comment #5 from Leandro Lucarella <llucax@gmail.com> 2009-11-01 12:31:57 PST --- (In reply to comment #4) > 1. I chose to store the bitmask after SENTINEL_POST so that none of the assumptions of the sentinel code (such as that the sentinel is immediately after the data) changes. Seems reasonable; the SENTINEL version is not used much anyway. > 2. The fact that setBitMask() is a nested function is a minor holdover from when the design was a little different. If anyone really hates it a lot, it can be refactored. I agree it's not terrible, but since it's a pretty trivial change I guess it would be nice to remove it to improve readability (I don't think it's a performance problem; readability and complexity are my only concerns). If you don't feel like changing it yourself I can upload an amended patch. > 3. The mixin is because I needed a lot of the same logic in realloc() and extend() and it was complicated enough that I felt it was the lesser of two evils to use a mixin, even with the "variables appearing out of nowhere" magic, rather than duplicate that logic. Sure, duplicating code is never a good idea. The question is, why can't it be done with a plain old function? |
November 01, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 --- Comment #6 from David Simcha <dsimcha@yahoo.com> 2009-11-01 12:36:22 PST --- > > 3. The mixin is because I needed a lot of the same logic in realloc() and extend() and it was complicated enough that I felt it was the lesser of two evils to use a mixin, even with the "variables appearing out of nowhere" magic, rather than duplicate that logic. > > Sure, duplicating code is never a good idea. The question is, why it can't be done with a plain-old function? Because I needed to dump a whole bunch of variables (not just 1) into the stack frames of realloc() and extend(), and the only way this could have been done with a plain old function would be to create a struct, create a function that returns the struct, etc., or to use lots and lots of out parameters. I really felt the mixin was the least unclear way that this logic could be injected into both extend() and realloc(). |
November 03, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 --- Comment #7 from Sean Kelly <sean@invisibleduck.org> 2009-11-03 07:52:44 PST --- Nice work! It may be preferable to store the pointer elsewhere, however. I believe all blocks returned by the allocator must be 16-byte aligned, so tacking a pointer onto the end of a block either screws this up or uses up a lot more space than necessary. I also kind of wish that the pointer didn't have to be stored at all for small block sizes, since simply storing the mask itself would take up less space (admittedly, at the expense of more complicated logic). |
November 03, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 --- Comment #8 from David Simcha <dsimcha@yahoo.com> 2009-11-03 08:06:43 PST --- (In reply to comment #7) > Nice work! It may be preferable to store the pointer elsewhere however. I believe all blocks returned by the allocator must be 16 byte-aligned, so tacking a pointer onto the end of a block either screws this up or uses up a lot more space than necessary. I don't understand. If someone requests, for example, a 12-byte allocation, the pointer is stored in the last 4 bytes (on 32-bit) of a 16-byte block. I don't increase the block capacity unless I have to. Yes, occasionally it will result in a doubling of the required capacity, but unless you request an allocation within 4 bytes of a full block size, it uses no extra space. The expected overhead under pseudo-random allocation sizes is probably (I haven't worked this out formally) only 4 bytes on 32-bit. Furthermore, if the NO_SCAN bit is set, no bit mask information at all is stored. This optimization was part of the reason I chose to use the end of the block: otherwise I probably would have had to reserve space somewhere else before I knew the status of the NO_SCAN bit, meaning that this optimization would have been unimplementable. > I also kind of wish that the pointer didn't have to be stored at all for small block sizes, since simply storing the mask itself would take up less space (admittedly, at the expense of more complicated logic). I thought about this, but the problem I kept coming up with was that tracking the size of the mask would require a couple of bytes anyhow. IMHO it was important to keep this patch relatively simple and stupid and easy to debug. It's clearly not a long-term solution to our GC woes because it leaves unaddressed so many unrelated issues that will eventually require a full redesign. It's more of an incremental improvement to make the GC "good enough" until D is popular enough that some GC expert implements a generational, moving, parallel, etc. GC. |
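The space argument in comment #8 can be checked with a small calculation. The sketch below is in C and rests on assumptions that are not taken from the patch: block sizes are powers of two starting at 16 bytes, and the mask pointer is simply added to the request before rounding up unless NO_SCAN is set. The names `block_size_for` and `mask_pointer_overhead` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MIN_BLOCK = 16 };  /* assumed smallest block size, in bytes */

/* Round n up to the next assumed size class (powers of two from MIN_BLOCK). */
static size_t round_up_to_class(size_t n)
{
    size_t size = MIN_BLOCK;
    while (size < n)
        size *= 2;
    return size;
}

/* Block size needed for a request, reserving ptr_size trailing bytes for the
 * mask pointer unless the block is NO_SCAN (then nothing extra is stored). */
size_t block_size_for(size_t request, size_t ptr_size, int no_scan)
{
    assert(request > 0);
    return round_up_to_class(request + (no_scan ? 0 : ptr_size));
}

/* Extra bytes a scanned block costs compared with the same NO_SCAN request. */
size_t mask_pointer_overhead(size_t request, size_t ptr_size)
{
    return block_size_for(request, ptr_size, 0) -
           block_size_for(request, ptr_size, 1);
}
```

Under these assumptions, with 4-byte pointers a 12-byte request still fits a 16-byte block, so the overhead is zero, while a request of 13 to 16 bytes spills into a 32-byte block, which is the occasional doubling the comment describes.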
November 03, 2009 [Issue 3463] Integrate Precise Heap Scanning Into the GC | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | http://d.puremagic.com/issues/show_bug.cgi?id=3463 Sean Kelly <sean@invisibleduck.org> changed: Status: NEW -> ASSIGNED --- Comment #9 from Sean Kelly <sean@invisibleduck.org> 2009-11-03 08:15:26 PST --- My apologies. Those comments were based on your description of what you were doing, and I came to the wrong conclusion. I'll give the patch a try! I'm also kind of curious what impact this will have on collection times. Seems like it should be faster overall. |