March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Craig Black

"Craig Black" <cblack@ara.com> wrote in message news:d1cago$192g$1@digitaldaemon.com...
> You don't understand the power of contiguous memory.

I believe I do, and I suggested a way to use such allocations, didn't I?

Regards, Martin
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Craig Black

In article <d1cdkg$1cp6$1@digitaldaemon.com>, Craig Black says...
>
>> umm. you are kidding, right? There are GC's of every sort floating about - compacting, rarely-copying, never-copying. Conservative vs precise. Mark/sweep vs generational. Incremental vs stop-the-world. Plus everything in between. Each has different trade-offs.
>
> Nope not kidding. I'm a C++ guy that has no personal experience with GC. I only know what I've read about the Java and C# GC's that are compacting. That's why I originally posted this on D.learn and got flamed for it. I'm trying to learn the best approach to GC and so I proposed an idea that I thought would work well. As far as the different GC approaches you listed, I don't know any details about them. I just know basic concepts.
>
>> A specific issue with your proposal is that determining if the last item can be moved has a non-trivial cost. One would have to track all references and either update those references to the last block or live with blocks that can't be moved. In a language that allows pointers it gets expensive to try to move memory around.
>
> If we can, theoretically, design a silver bullet GC, why use pointers anymore? Yes, we would have to move memory and update references accordingly, but this would not be a problem if the right algorithm was implemented. For example, if the GC references indexed a buffer rather than pointing directly to memory, then when the buffer is resized, there would be no need to change references. With the right implementation, I believe that it can be done efficiently.
>
>> Boehm has an interesting paper that I can't find right now that argues that compacting collectors aren't any better than non-compacting collectors. The time spent figuring out how/if to move something can be as expensive as dealing with fragmentation.
>
> Why then do I hear all the time that compacting collectors outperform noncompacting ones? Fragmentation is a worse problem than moving memory IMO. The problem with fragmentation is that it gets worse over time. Moving memory doesn't.

The current D GC tries to reuse collected memory in-place before allocating another page, so it is pretty efficient with memory resources, which IMHO will probably also translate into faster running programs from start to finish, especially in high-load environments, etc. The current GC doesn't really seem to fragment that much, or at least when it does it's likely over contiguous blocks in the same area of memory - which will likely be used the next time any reasonable sized (<= 2K) object(s) are allocated.

On the other hand, another advantage of the current design is that auto-collection is attempted only when needed to fulfill a new request for the size of the object being allocated. This should make it suitable for interactive apps (rather than some other type of 'stop-and-move-the-world' algorithm).

The price paid with the current D GC is speed of allocation of many small objects in a tight loop. I'm wondering out loud here if maybe the best all-around GC strategy going forward would be to somehow optimize the current algorithm (or a type-aware one much like it) to allocate smaller objects faster? Ironically, the best way to do that may be to find some way to optimize the collection part of the cycle.

Your idea in the OP would seem to nail the allocation speed problem, at least until the first full collection is needed. But as previously mentioned we'd almost certainly have to pay for it somewhere, like in longer GC related pauses.

A different set of eyes (yours) going through and profiling the current GC code may well be worth it.

> -Craig
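To make the allocation policy described above concrete, here is a minimal sketch in D. It is not the actual Phobos/DMD GC code; the ToyHeap type, its fields, and the stubbed-out collect() are all invented for illustration.

// A toy, single-size-class heap illustrating the policy described above:
// reuse collected memory in place first, collect only when an allocation
// request cannot be satisfied, and grow the heap as a last resort.
struct ToyHeap
{
    void*[] freeList;          // blocks reclaimed by earlier collections
    size_t  blockSize = 2048;

    void* allocate()
    {
        // 1. Reuse collected memory in place before doing anything else.
        if (freeList.length)
        {
            void* p = freeList[$ - 1];
            freeList.length = freeList.length - 1;
            return p;
        }
        // 2. Collect only when the request cannot be satisfied.
        collect();
        if (freeList.length)
            return allocate();
        // 3. Last resort: ask the runtime/OS for another page.
        return (new ubyte[blockSize]).ptr;
    }

    void collect()
    {
        // Stub: a real collector would mark reachable blocks here and push
        // the unreachable ones back onto freeList for reuse.
    }
}

void main()
{
    ToyHeap heap;
    auto p = heap.allocate();  // empty free list and no-op collect, so this
                               // falls through to a fresh page allocation
    assert(p !is null);
}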
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Craig Black

In article <d1a3oo$218h$1@digitaldaemon.com>, Craig Black says...
>
> Objects are organized into buffers according to their size. So if an object is 10 words long, it will go into the 10-word buffer. If an object is 5 words long, it goes into the 5-word buffer, etc. Thus all objects in each buffer are of equal size.
>
> The buffers must be resizable. When an object is instantiated from a buffer that is full, the buffer will be enlarged to facilitate more objects. When the buffer is resized the GC must move all the objects. When an object is freed from memory, the gap in the buffer is filled with the last item in the buffer. This requires the GC to move that block of memory. Thus there is no memory fragmentation within each buffer.

Perhaps an Object Pool is what you're looking for? Just google around for some examples, and beware that they'll likely be in Java. Drop me a line if you need an example implementation... I might be able to throw one together in D.

Granted, it won't fix all the fragmentation problems with D's GC. IMO, that would be best left to a later rendition that's capable of relocating chunks of memory (copying/generational GC)... which is what you're proposing. :)

- EricAnderton at yahoo
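For readers who haven't seen one, a rough sketch of the kind of object pool being suggested might look like the D code below. This is not the implementation pragma offered to write; Pool and Particle are invented names, and a real pool would also reset object state on release.

// Freed objects are kept on a free list and handed back on the next acquire,
// so a hot allocate/free loop never creates garbage for the collector.
class Pool(T)
{
    private T[] free;  // objects returned to the pool, ready for reuse

    T acquire()
    {
        if (free.length)
        {
            T obj = free[$ - 1];
            free.length = free.length - 1;
            return obj;
        }
        return new T;  // pool empty: fall back to a normal allocation
    }

    void release(T obj)
    {
        free ~= obj;   // no GC work; the object just waits for reuse
    }
}

class Particle { float x, y; }

void main()
{
    auto pool = new Pool!(Particle);
    auto p = pool.acquire();  // allocated with new the first time
    pool.release(p);
    auto q = pool.acquire();  // same object handed back, no new allocation
    assert(p is q);
}

The design point is that release() hands the object straight back for reuse instead of leaving garbage behind, so the tight-loop allocation cost discussed in this thread is avoided entirely for pooled types.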
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Dave

Dave wrote:
> In article <d1cdkg$1cp6$1@digitaldaemon.com>, Craig Black says...
>
>>>umm. you are kidding, right? There are GC's of every sort floating about - compacting, rarely-copying, never-copying. Conservative vs precise. Mark/sweep vs generational. Incremental vs stop-the-world. Plus everything in between. Each has different trade-offs.
>>
>>Nope not kidding. I'm a C++ guy that has no personal experience with GC. I only know what I've read about the Java and C# GC's that are compacting. That's why I originally posted this on D.learn and got flamed for it. I'm trying to learn the best approach to GC and so I proposed an idea that I thought would work well. As far as the different GC approaches you listed, I don't know any details about them. I just know basic concepts.
>>
>>
>>>A specific issue with your proposal is that determining if the last item can be moved has a non-trivial cost. One would have to track all references and either update those references to the last block or live with blocks that can't be moved. In a language that allows pointers it gets expensive to try to move memory around.
>>
>>If we can, theoretically, design a silver bullet GC, why use pointers anymore? Yes, we would have to move memory and update references accordingly, but this would not be a problem if the right algorithm was implemented. For example, if the GC references indexed a buffer rather than pointing directly to memory, then when the buffer is resized, there would be no need to change references. With the right implementation, I believe that it can be done efficiently.
>>
>>
>>>Boehm has an interesting paper that I can't find right now that argues that compacting collectors aren't any better than non-compacting collectors. The time spent figuring out how/if to move something can be as expensive as dealing with fragmentation.
>>
>>Why then do I hear all the time that compacting collectors outperform noncompacting ones? Fragmentation is a worse problem than moving memory IMO. The problem with fragmentation is that it gets worse over time. Moving memory doesn't.
>>
>
>
> The current D GC tries to reuse collected memory in-place before allocating
> another page, so it is pretty efficient with memory resources, which IMHO will
> probably also translate into faster running programs from start to finish,
> especially in high-load environments, etc. The current GC doesn't really seem to
> fragment that much, or at least when it does it's likely over contiguous blocks
> in the same area of memory - which will likely be used the next time any
> reasonable sized (<= 2K) object(s) are allocated.
>
> On the other hand, another advantage of the current design is that
> auto-collection is attempted only when needed to fulfill a new request for the
> size of the object being allocated. This should make it suitable for interactive
> apps. (rather than some other type of 'stop-and-move-the-world' algorithm).
>
> The price paid with the current D GC is speed of allocation of many small
> objects in a tight loop. I'm wondering out loud here if maybe the best
> all-around-gc strategy going forward would be to somehow optimize the current
> algorithm (or a type-aware one much like it) to allocate smaller objects faster?
> Ironically, the best way to do that may be to find some way to optimize the
> collection part of the cycle.
>
> Your idea in the OP would seem to nail the allocation speed problem, at least
> until the first full collection is needed. But as previously mentioned we'd
> almost certainly have to pay for it somewhere, like in longer GC related pauses.
>
> A different set of eyes (yours) going through and profiling the current GC code
> may well be worth it.
>
>
>>-Craig
>>
>>
>>
If you are allocating a lot of items (which are ostensibly discarded), you probably want an algorithm which touches only the live objects during collection, like a copying collector (which is also compacting).
An allocation (see the paper in my other post in this topic) should be just a pointer decrement, even if from different buckets as the OP mentioned.
-David
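As an illustration of the "allocation is just a pointer bump" point above, here is a toy sketch in D. The BumpRegion type, the downward-growing layout, and the numbers are assumptions for the example, not a description of any existing collector.

// A toy bump allocator: the free space is one contiguous region, so an
// allocation is a single pointer decrement plus a bounds check.  A copying
// collector pays for this later, when it copies the live objects into a
// fresh region to make the free space contiguous again.
struct BumpRegion
{
    ubyte[] space;  // the region currently being allocated from
    size_t  top;    // next free byte, counted down from the end

    void* allocate(size_t size)
    {
        if (size == 0 || size > top)
            return null;   // region exhausted: a real GC would collect here
        top -= size;       // the entire fast path: one pointer decrement
        return &space[top];
    }
}

void main()
{
    auto region = BumpRegion(new ubyte[4096], 4096);
    auto a = region.allocate(32);
    auto b = region.allocate(32);
    assert(a !is null && b !is null && b < a);  // the region fills downward
}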
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Martin M. Pedersen

> I believe I do, and I suggested a way to use such allocations, didn't I?
Sorry, perhaps I misunderstood what you were saying.
-Craig
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to David Medlock

"David Medlock" <amedlock@nospam.org> wrote in message news:d1cpaa$1p90$1@digitaldaemon.com...
> Dave wrote:
>> In article <d1cdkg$1cp6$1@digitaldaemon.com>, Craig Black says...
>>
>>>> umm. you are kidding, right? There are GC's of every sort floating about - compacting, rarely-copying, never-copying. Conservative vs precise. Mark/sweep vs generational. Incremental vs stop-the-world. Plus everything in between. Each has different trade-offs.
>>>
>>> Nope not kidding. I'm a C++ guy that has no personal experience with GC. I only know what I've read about the Java and C# GC's that are compacting. That's why I originally posted this on D.learn and got flamed for it. I'm trying to learn the best approach to GC and so I proposed an idea that I thought would work well. As far as the different GC approaches you listed, I don't know any details about them. I just know basic concepts.
>>>
>>>> A specific issue with your proposal is that determining if the last item can be moved has a non-trivial cost. One would have to track all references and either update those references to the last block or live with blocks that can't be moved. In a language that allows pointers it gets expensive to try to move memory around.
>>>
>>> If we can, theoretically, design a silver bullet GC, why use pointers anymore? Yes, we would have to move memory and update references accordingly, but this would not be a problem if the right algorithm was implemented. For example, if the GC references indexed a buffer rather than pointing directly to memory, then when the buffer is resized, there would be no need to change references. With the right implementation, I believe that it can be done efficiently.
>>>
>>>> Boehm has an interesting paper that I can't find right now that argues that compacting collectors aren't any better than non-compacting collectors. The time spent figuring out how/if to move something can be as expensive as dealing with fragmentation.
>>>
>>> Why then do I hear all the time that compacting collectors outperform noncompacting ones? Fragmentation is a worse problem than moving memory IMO. The problem with fragmentation is that it gets worse over time. Moving memory doesn't.
>>
>> The current D GC tries to reuse collected memory in-place before allocating another page, so it is pretty efficient with memory resources, which IMHO will probably also translate into faster running programs from start to finish, especially in high-load environments, etc. The current GC doesn't really seem to fragment that much, or at least when it does it's likely over contiguous blocks in the same area of memory - which will likely be used the next time any reasonable sized (<= 2K) object(s) are allocated.
>>
>> On the other hand, another advantage of the current design is that auto-collection is attempted only when needed to fulfill a new request for the size of the object being allocated. This should make it suitable for interactive apps (rather than some other type of 'stop-and-move-the-world' algorithm).
>>
>> The price paid with the current D GC is speed of allocation of many small objects in a tight loop. I'm wondering out loud here if maybe the best all-around GC strategy going forward would be to somehow optimize the current algorithm (or a type-aware one much like it) to allocate smaller objects faster? Ironically, the best way to do that may be to find some way to optimize the collection part of the cycle.
>>
>> Your idea in the OP would seem to nail the allocation speed problem, at least until the first full collection is needed. But as previously mentioned we'd almost certainly have to pay for it somewhere, like in longer GC related pauses.
>>
>> A different set of eyes (yours) going through and profiling the current GC code may well be worth it.
>>
>>> -Craig
>
> If you are allocating a lot of items (which are ostensibly discarded), you probably want an algorithm which touches only the live objects during collection, like a copying collector (which is also compacting).
>
> An allocation (see the paper in my other post in this topic) should be just a pointer decrement, even if from different buckets as the OP mentioned.
>
> -David

Moving to the next pointer is basically what it does now - unless the 'bucket' is empty. Then it tries to 'backfill' to add to the bucket. If that isn't enough, then it collects. Finally, it requests more memory from the system if needed. Step 3 is the only time the world stops, but it's also by far the largest block of code.

AFAIK, a copying collector always needs to allocate twice as much from the system as is required for the "active" heap (so it can swap 'To Space' and 'From Space'). The ultimate in trading space for time. More to your point, the really fast allocation of a copying collector could mean less overall memory use because things simply take less time; objects are more apt to 'live' for less time (and therefore be collected faster) and programs finish faster.

My point was that since the current GC is pretty resource efficient, that will probably translate into better overall performance in heavy-use environments. And the way that collections are handled would at least seem to minimize pause times for lighter-use interactive environments. No doubt slow allocation can be an issue.

I don't have a real clear sense of whether a copying collector would be better overall or not, but since the current GC seems to work OK and is already developed, I'm wondering if trying to optimize that somehow would be the quickest way to a better all-around GC?

- Dave
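To illustrate the "twice as much memory" trade-off mentioned above, here is a toy semispace layout in D. Semispace and collectStub are invented names, and the sketch omits the actual copying and reference fix-up that a real collector performs.

// Why a copying collector reserves twice the active heap: allocation bumps
// a pointer through 'from space', and each collection copies the live data
// into the idle 'to space' and swaps the two halves.
struct Semispace
{
    ubyte[] fromSpace;  // currently being allocated into
    ubyte[] toSpace;    // idle between collections: the 2x reservation
    size_t  used;

    this(size_t halfSize)
    {
        fromSpace = new ubyte[halfSize];
        toSpace   = new ubyte[halfSize];
    }

    void* allocate(size_t size)
    {
        if (size == 0 || used + size > fromSpace.length)
            return null;              // a real GC would collect here
        auto p = &fromSpace[used];    // pointer bump: the fast path
        used += size;
        return p;
    }

    void collectStub(size_t bytesStillLive)
    {
        // A real collector would copy the live objects into toSpace and fix
        // up references while doing so; only the swap of roles is shown here.
        auto tmp  = fromSpace;
        fromSpace = toSpace;
        toSpace   = tmp;
        used      = bytesStillLive;   // live data now packed at the start
    }
}

void main()
{
    auto heap = Semispace(1024);      // 2 KB reserved for a 1 KB active heap
    assert(heap.allocate(64) !is null);
    heap.collectStub(0);              // nothing live: the next allocation
    assert(heap.allocate(64) !is null);  // reuses the swapped-in half
}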
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Dave

> The current D GC tries to reuse collected memory in-place before allocating another page, so it is pretty efficient with memory resources, which IMHO will probably also translate into faster running programs from start to finish, especially in high-load environments, etc. The current GC doesn't really seem to fragment that much, or at least when it does it's likely over contiguous blocks in the same area of memory - which will likely be used the next time any reasonable sized (<= 2K) object(s) are allocated.

Let me get this straight. The GC allocates units until it runs out of space. When it runs out of space, it makes a pass (sweep) to collect units that have no reference. Is this correct?

If this is so, why is a sweep even required? When a reference is set to zero, couldn't D just record that the unit is lost right then? Why can't this be done deterministically? Wouldn't this be faster?

> On the other hand, another advantage of the current design is that auto-collection is attempted only when needed to fulfill a new request for the size of the object being allocated. This should make it suitable for interactive apps (rather than some other type of 'stop-and-move-the-world' algorithm).
>
> The price paid with the current D GC is speed of allocation of many small objects in a tight loop. I'm wondering out loud here if maybe the best all-around GC strategy going forward would be to somehow optimize the current algorithm (or a type-aware one much like it) to allocate smaller objects faster? Ironically, the best way to do that may be to find some way to optimize the collection part of the cycle.
>
> Your idea in the OP would seem to nail the allocation speed problem, at least until the first full collection is needed. But as previously mentioned we'd almost certainly have to pay for it somewhere, like in longer GC related pauses.

I cannot deny that my approach could cause GC pauses when the buffers resize, but this would happen very infrequently.

> A different set of eyes (yours) going through and profiling the current GC code may well be worth it.

I would like to help however I can, but my understanding of D's GC is elementary. I don't currently understand why there even needs to be a "sweep".

-Craig
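For readers wondering what the sweep actually does, a schematic mark & sweep pass might look like the D sketch below. The Node type with a single next field is a made-up simplification; a real collector traces arbitrary pointer fields, the stack, and registers.

// Without per-object reference counts, the GC cannot know at assignment time
// that the last reference to something went away.  Instead it periodically
// traces what is reachable (mark) and reclaims everything else (sweep).
class Node
{
    Node next;      // the only reference field in this toy object graph
    bool marked;
}

void markAndSweep(Node[] roots, Node[] heap)
{
    // Mark phase: follow references from the roots; everything reached is live.
    foreach (root; roots)
        for (Node n = root; n !is null && !n.marked; n = n.next)
            n.marked = true;

    // Sweep phase: anything never reached is garbage and can be reclaimed.
    foreach (n; heap)
    {
        if (!n.marked)
        {
            // A real collector would recycle n's storage here.
        }
        n.marked = false;   // reset the mark bits for the next collection
    }
}

void main()
{
    auto a = new Node;
    auto b = new Node;
    a.next = new Node;                  // reachable via a
    markAndSweep([a], [a, a.next, b]);  // b is unreachable and gets swept
}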
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Craig Black

In article <d1csh0$1t53$1@digitaldaemon.com>, Craig Black says...
>
> Let me get this straight. The GC allocates units until it runs out of space. When it runs out of space, it makes a pass (sweep) to collect units that have no reference. Is this correct?
>
> If this is so, why is a sweep even required? When a reference is set to zero, couldn't D just record that the unit is lost right then? Why can't this be done deterministically? Wouldn't this be faster?

Possibly. But D doesn't currently do reference counting. The references are really just pointers.

> I would like to help however I can, but my understanding of D's GC is elementary. I don't currently understand why there even needs to be a "sweep".

See above.

Sean
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to Sean Kelly

"Sean Kelly" <sean@f4.ca> wrote in message news:d1cu7s$1v6p$1@digitaldaemon.com...
> In article <d1csh0$1t53$1@digitaldaemon.com>, Craig Black says...
>>
>> Let me get this straight. The GC allocates units until it runs out of space. When it runs out of space, it makes a pass (sweep) to collect units that have no reference. Is this correct?
>>
>> If this is so, why is a sweep even required? When a reference is set to zero, couldn't D just record that the unit is lost right then? Why can't this be done deterministically? Wouldn't this be faster?
>
> Possibly. But D doesn't currently do reference counting. The references are really just pointers.

Gotcha. Well, if you want to improve performance, it might help to use reference counting, and completely remove the sweep. I don't see any drawback to this approach, except that an extra integer must be included in each object.

-Craig
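A minimal sketch of the scheme being suggested here, written as a hypothetical D wrapper type (RefCounted and Payload are invented names, and the reclamation step is only a comment): every copy of a reference bumps one extra per-object integer, and storage is reclaimed the moment the count reaches zero, with no sweep pass.

// Copying the wrapper bumps the count; destroying a copy drops it, and the
// object is reclaimed immediately when the count hits zero.
struct RefCounted(T)
{
    private T*   obj;
    private int* count;

    static RefCounted make()
    {
        RefCounted r;
        r.obj   = new T;
        r.count = new int;
        *r.count = 1;           // the extra integer per object
        return r;
    }

    this(this)                   // copying a reference bumps the count
    {
        if (count) ++*count;
    }

    ~this()                      // dropping a reference may free immediately
    {
        if (count && --*count == 0)
        {
            // destroy *obj here; no separate sweep pass is required
            obj   = null;
            count = null;
        }
    }
}

struct Payload { int value; }

void main()
{
    auto a = RefCounted!(Payload).make();   // count == 1
    {
        auto b = a;                         // count == 2
    }                                       // b leaves scope: count back to 1
}                                           // a leaves scope: count 0, reclaimed at once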
March 17, 2005 Re: Fast Memory Allocation Strategy
Posted in reply to pragma

> Perhaps an Object Pool is what you're looking for? Just google around for some examples, and beware that they'll likely be in Java. Drop me a line if you need an example implementation... I might be able to throw one together in D.
>
> Granted, it won't fix all the fragmentation problems with D's GC. IMO, that would be best left to a later rendition that's capable of relocating chunks of memory (copying/generational GC)... which is what you're proposing. :)

Yes, similar to a copying GC, except that the objects are separated into buffers depending on the size of the object. This makes compacting within a buffer easy, and it is performed each time an object is deallocated.
-Craig
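Here is a sketch, in D, of the buffer scheme as described: one resizable buffer per object size, handles that are indices into the buffer rather than raw pointers (so objects can move), and the last object moved down to fill the hole on every free. SizeClassBuffer and its growth policy are invented for illustration.

// Each buffer holds objects of exactly one size and stays densely packed:
// freeing a slot moves the buffer's last object into the gap.
struct SizeClassBuffer
{
    size_t  objectSize;   // every slot in this buffer is exactly this big
    ubyte[] data;         // contiguous storage for all live objects
    size_t  count;        // number of live objects

    size_t allocate()
    {
        if ((count + 1) * objectSize > data.length)
            data.length = (data.length ? data.length * 2 : objectSize * 8);
        return count++;   // the handle is the index of the new slot
    }

    void free(size_t handle)
    {
        --count;
        if (handle != count)  // fill the gap with the last live object
            data[handle * objectSize .. (handle + 1) * objectSize] =
                data[count * objectSize .. (count + 1) * objectSize];
        // NOTE: whatever referred to the old last slot must now use 'handle'
        // instead, which is the reference-tracking cost raised earlier.
    }
}

void main()
{
    auto buf = SizeClassBuffer(16);  // a buffer for 16-byte objects
    auto a = buf.allocate();
    auto b = buf.allocate();
    buf.free(a);                     // object b is moved down into slot a
    assert(buf.count == 1);
}

Whether index-based handles are cheap enough in practice is exactly the bookkeeping question debated earlier in the thread; the sketch only shows that the per-buffer compaction itself is a single block move.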