September 10, 2010
On Friday 10 September 2010 00:50:32 bearophile wrote:
> Jonathan M Davis:
> > Aren't they _always_ on the heap?
> 
> void main() {
>     int[10] a;
>     int[] b = a[];
> }
> 
> Bye,
> bearophile

Ah, good point. When you have a slice of a static array, as opposed to a dynamic array allocated with new, then it's on the stack. Since I pretty much never use static arrays, I forgot about that.

- Jonathan M Davis
September 10, 2010
On 09/10/2010 10:17 AM, Jonathan M Davis wrote:
> On Friday 10 September 2010 00:50:32 bearophile wrote:
>> Jonathan M Davis:
>>> Aren't they _always_ on the heap?
>>
>> void main() {
>>      int[10] a;
>>      int[] b = a[];
>> }
>>
>> Bye,
>> bearophile
>
> Ah, good point. When you have a slice of a static array as opposed to a dynamic
> array allocated with new, then it's on the stack. Since I pretty much never use
> static arrays, I forgot about that.
>
> - Jonathan M Davis

int i;
int[] heh = (&i)[0..1];
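
The two snippets above can be checked with a small sketch. It assumes D2's druntime, where `core.memory.GC.addrOf` returns null for a pointer that does not point into GC-managed memory; the helper name `isGcManaged` is made up for illustration.

```d
import core.memory : GC;

// Hypothetical helper: true if p points into memory owned by the GC.
bool isGcManaged(const(void)* p)
{
    return GC.addrOf(p) !is null;
}

void main()
{
    int[10] a;               // static array: storage lives in main's stack frame
    int[] b = a[];           // slice over that stack storage, no allocation
    int[] c = new int[10];   // dynamic array: storage allocated on the GC heap

    int i;
    int[] heh = (&i)[0 .. 1]; // one-element slice over a stack variable

    assert(b.ptr == a.ptr);      // b refers to a's storage
    assert(!isGcManaged(b.ptr)); // stack memory is not GC-managed
    assert(isGcManaged(c.ptr));  // memory from new is
    assert(heh.ptr == &i);
}
```

So a slice is just a pointer and a length; whether the data is on the heap depends on what it points at.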
September 10, 2010
Let's see if I got this right. The GC asks for some memory from the OS, and keeps it in a pool. Then when we have to allocate an array, we take some memory from the GC pool. And when we no longer need the array, the memory gets put back into the pool to be reused. So does that mean the GC doesn't make any pauses, unless it requires more memory from the OS?

Jonathan M Davis Wrote:

> On Thursday 09 September 2010 20:17:23 Andrej Mitrovic wrote:
> > Related: Do stack variables get freed on exit or do they just get marked as unused by the GC? Because I'm not seeing any memory increase over time. I guess I have to read more about how allocation works. :p
> > 
> > Jonathan M Davis Wrote:
> > 
> > > _every time_ that you use hit or hot in main(), you're calling hit() and hot() and creating them all over again
> 
> A variable on the stack has nothing to do with the GC unless it's in a delegate (since then the delegate must have its stack in a separate area on the heap). Now, dynamic arrays live on the heap, even if their references don't, so allocating a bunch of those will obviously require more memory. However, in this case, he's done with the array as soon as he's used it, so the GC (which if I understand correctly is called every time that new() is - or at least has the opportunity to run every time that new() is called) can reclaim it right away. There's a decent chance that he only ever allocates one array's worth of memory from the OS. Certainly, he wouldn't end up allocating new memory from the OS every time that he calls hit() or hot().
> 
> Every time that new is called, the GC will allocate the memory that it's asked to from its heap. At least some of the time that new is called, the GC will check to see if any of its heap memory can be recovered, and then recover it (so it's deallocated from the program's perspective but not returned to the OS). If the GC needs more memory than it has free on its heap, it will allocate more from the OS. Ideally, the GC would also look at how much memory it has in its heap vs how much the program is currently using and then return some of it to the OS if it has too much free, but as I understand it, it doesn't currently do that. So, once your program uses a certain amount of memory, it won't ever use any less until it terminates. Presumably, that will be fixed at some point though.
> 
> - Jonathan M Davis

September 10, 2010
Andrej Mitrovic:
> So does that mean the GC doesn't make any pauses, unless it requires more memory from the OS?

When you ask the GC for memory, it may perform a collection, so it does some work even if no new memory is requested from the OS.

Bye,
bearophile
September 11, 2010
On Friday 10 September 2010 09:44:10 bearophile wrote:
> Andrej Mitrovic:
> > So does that mean the GC doesn't make any pauses, unless it requires more memory from the OS?
> 
> When you ask the GC for memory, it may perform a collection, so it does some work even if no new memory is requested from the OS.
> 
> Bye,
> bearophile

Yeah. That's actually generally where GCs end up causing performance problems. It's not the fact that they have to grab more memory from the OS or give it back but rather the fact that they take time to figure out what they can put back in their own memory pool. Even worse, in most GCs, it's completely nondeterministic when that could happen, so it could end up slowing down your program at critical moments. For most apps, that doesn't really matter, and I suspect that D's is currently somewhat more deterministic than most since it only runs the GC code when new is called as opposed to having its own separate thread, but it's a common criticism of GCs. The other would have to do with their memory consumption, which stems from the fact that they maintain a memory pool and are essentially guaranteed to be holding more memory than you would if you freed memory immediately when you were done with it. Now, that doesn't necessarily mean that they're less efficient - that depends on the GC and can be hotly debated - but it does mean that your program will require more memory using a GC than not.
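
To see the pool for yourself, here is a rough sketch. It assumes a druntime recent enough to provide `core.memory.GC.stats()` (newer than the compilers available when this thread was written); `churn` is a made-up name. No output is shown because the numbers depend on the runtime and platform.

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical workload: allocate short-lived arrays that become
// garbage as soon as each loop iteration ends.
void churn()
{
    foreach (i; 0 .. 1_000)
    {
        auto tmp = new int[](10_000);
        tmp[0] = i;
    }
}

void main()
{
    churn();
    GC.collect(); // explicitly request a collection cycle

    auto s = GC.stats();
    // usedSize: bytes in live allocations.
    // freeSize: bytes the GC keeps in its own pools for reuse.
    writefln("used: %s bytes, free in pool: %s bytes", s.usedSize, s.freeSize);
}
```

The free figure is memory the program still holds from the OS's point of view, even though nothing in the program is using it.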

- Jonathan M Davis
September 11, 2010
This is why I'm happy to see some people (Leandro in particular) are
already working on different GC designs for D. :)

So to evade the GC's pauses as much as possible, one would stick with using structs or preallocate all needed data before a critical section? I'll have to get more into that eventually, one of my future goals (future as in years+ from now) is to make a realtime musical app (a sequencer).

On Sat, Sep 11, 2010 at 2:15 AM, Jonathan M Davis <jmdavisprog@gmail.com> wrote:
> Yeah. That's actually generally where GCs end up causing performance problems.
> It's not the fact that they have to grab more memory from the OS or give it back
> but rather the fact that it takes the time to figure out what it can put back in
> its own memory pool. Even worse, in most GCs, it's completely nondeterministic when
> that could happen, so it could end up slowing down your program at critical
> moments. For most apps, that doesn't really matter, and I suspect that D's is
> currently somewhat more deterministic than most since it only runs the GC code
> when new is called as opposed to having its own separate thread, but it's a
> common criticism of GCs. The other would have to do with their memory
> consumption which stems from the fact that they maintain a memory pool and are
> essentially guaranteed to be holding more memory than you would if you freed
> memory immediately when you were done with it. Now, that doesn't necessarily
> mean that they're less efficient - that depends on the GC and can be hotly debated
> - but it does mean that your program will require more memory using a GC than
> not.
>
> - Jonathan M Davis
>
September 11, 2010
On Friday 10 September 2010 17:36:06 Andrej Mitrovic wrote:
> This is why I'm happy to see some people (Leandro in particular) are
> already working on different GC designs for D. :)
> 
> So to evade the GC's pauses as much as possible, one would stick with using structs or preallocate all needed data before a critical section? I'll have to get more into that eventually, one of my future goals (future as in years+ from now) is to make a realtime musical app (a sequencer).

I think that realistically, most apps can do realtime just fine with full GC use (certainly with a good GC). You lack _guaranteed_ realtime performance, but you're almost certainly going to get it.

I'm not really sure how you'd avoid the GC running other than avoid using the GC. So, if you heavily use structs and very few classes or dynamic arrays, then there won't be many opportunities for the GC to run a collection cycle. However, once the GC lives in its own thread (which I think that it's bound to do eventually), it could run at any time. Low use of the GC heap means that it will have less to do, so any pauses that it has will likely be shorter (assuming that it can't do what it does in O(1), but I doubt that GCs usually can, if ever), and depending on how it decides when it should do a collection cycle, it may not run the cycle as often, and so you'd get fewer pauses. But you still may get them.

The reality of the matter is that in any program that uses a GC, you're at risk of the GC collection cycle running at some point whether you want it to or not. But with a good GC, odds are that it won't be a problem.

Now, with D's current GC, if you never call any function that allocates or frees from the GC heap, then it's not going to run a collection cycle. So, if you have a critical section of code that _must_ be realtime, and you don't do anything that could allocate or free from the GC heap in that section, then no GC collection cycle will run. That doesn't necessarily mean restricting yourself to structs - classes will work just fine - you just can't allocate any in that section. However, once the GC is more advanced and runs in its own thread (as I assume it will eventually), such a guarantee wouldn't hold anymore (since it could run at any time). However, the fact that you don't allocate or free from the GC heap in a critical section should still reduce the odds of a GC collection cycle being done because it won't need to figure out whether it has enough memory and potentially run a cycle to recover memory.
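
A minimal sketch of that pattern, with made-up names (`renderBlock`, the buffer size, the gain value): allocate once up front, then only touch existing memory inside the critical loop. It uses the `@nogc` attribute, which was added to the language after this thread was written and makes the compiler reject GC allocations inside the function.

```d
import std.stdio : writeln;

// Hypothetical realtime routine: fills an existing buffer in place.
// @nogc means any GC allocation in here is a compile-time error.
void renderBlock(float[] buffer, float gain) @nogc nothrow
{
    foreach (i, ref sample; buffer)
        sample = gain * cast(float) i;
}

void main()
{
    // Allocate once, before the critical section starts.
    auto buffer = new float[](512);

    // Critical section: no allocation, so nothing here asks the GC
    // for memory and gives it a reason to run a collection.
    foreach (block; 0 .. 1_000)
        renderBlock(buffer, 0.5f);

    writeln(buffer[2]);
}
```

Note that the buffer could just as well be a member of a class instance; what matters is that it was allocated before the loop.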

Overall, the key to minimizing the impact of the GC (other than having a good GC) is to minimize how much you do with the GC heap. But generally-speaking, you can't guarantee that a GC collection cycle isn't going to run unless you don't have a GC.

- Jonathan M Davis
September 11, 2010
What about gc.disable() and gc.enable() ? If I'm sure that I won't allocate anything within a section of code and I have to guarantee realtime performance, then I could disable the gc temporarily. Although this is not exactly what it states in the section on memory management:

"Call std.gc.disable() before the smooth code is run, and
std.gc.enable() afterwards. This will cause the GC to favor allocating
more memory instead of running a collection pass."

Is the gc disabled after the call to gc.disable(), or just relaxed? If it's not disabled, then I'm not sure why it's named like that.

On Sat, Sep 11, 2010 at 4:28 AM, Jonathan M Davis <jmdavisprog@gmail.com> wrote:
> snip
September 11, 2010
This page might need to be updated soon:

http://www.digitalmars.com/d/2.0/memory.html

It refers to custom allocators, overloading new and delete, and using scope for stack allocation.

On Sat, Sep 11, 2010 at 4:40 AM, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:
> What about gc.disable() and gc.enable() ? If I'm sure that I won't allocate anything within a section of code and I have to guarantee realtime performance, then I could disable the gc temporarily. Although this is not exactly what it states in the section on memory management:
>
> "Call std.gc.disable() before the smooth code is run, and
> std.gc.enable() afterwards. This will cause the GC to favor allocating
> more memory instead of running a collection pass."
>
> Is the gc disabled after the call to gc.disable(), or just relaxed? If it's not disabled, then I'm not sure why it's named like that.
>
> On Sat, Sep 11, 2010 at 4:28 AM, Jonathan M Davis <jmdavisprog@gmail.com> wrote:
>> snip
>
September 11, 2010
On Friday 10 September 2010 19:40:10 Andrej Mitrovic wrote:
> What about gc.disable() and gc.enable() ? If I'm sure that I won't allocate anything within a section of code and I have to guarantee realtime performance, then I could disable the gc temporarily. Although this is not exactly what it states in the section on memory management:
> 
> "Call std.gc.disable() before the smooth code is run, and
> std.gc.enable() afterwards. This will cause the GC to favor allocating
> more memory instead of running a collection pass."
> 
> Is the gc disabled after the call to gc.disable(), or just relaxed? If it's not disabled, then I'm not sure why it's named like that.
> 
> On Sat, Sep 11, 2010 at 4:28 AM, Jonathan M Davis <jmdavisprog@gmail.com> wrote:
> > snip

I'm really not all that well informed about the ins and outs of D's GC and trying to minimize its negative effects. The programs that I write care far more about stuff other than realtime performance for me to go to the effort of trying to avoid the GC (especially since that can seriously complicate a program). However, from the sound of it, std.gc.disable() is arguably poorly named.

It obviously doesn't disable the GC completely. new isn't going to start using the manual heap or stop working. It sounds like it doesn't even necessarily disable garbage collection from occurring. My _guess_ would be that what it does is make it so that the GC will allocate memory from the OS if it doesn't have enough free memory in its pool, and if that fails, run a garbage collection cycle to recover memory (since it's either that or throw an OutOfMemoryError - or whatever the exception is called exactly - which would kill the program), but I don't know. Certainly, I believe that normally the GC will run a cycle if it's low on free memory in its pool, and then get memory from the OS only if it has to, and if you have called std.gc.disable(), it's not going to be as quick to run a garbage collection cycle. But obviously the documentation does not make it clear enough what _would_ make it run a garbage collection cycle if std.gc.disable() has been called.

In any case, if you're looking to avoid GC collection cycles, it sounds like std.gc.disable() and std.gc.enable() will help. But remember that that will increase the odds that it will have to allocate more memory from the OS, which isn't cheap either. It's almost certainly cheap_er_, but it still wouldn't be cheap. Also, with D's current GC, that means that your program will use more memory overall. Not only will you not be using perfectly useable memory from the GC heap (since it won't have been collected yet), but it will allocate more memory to the heap, and the GC's current implementation never gives that memory back to the OS. So, overall memory usage will increase.

So, odds are that the best bet would be to avoid allocations in critical sections and call std.gc.disable() before entering them and std.gc.enable() when you exit them. But you probably should get input from one of the D GC gurus if you really want to know the absolute best way to reduce the impact of the GC in critical sections.
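
As a sketch of that combination, assuming D2's druntime, where these calls are `GC.disable()` and `GC.enable()` in `core.memory` rather than the `std.gc` functions the memory page quotes. The druntime documentation says collections may still happen when the implementation deems it necessary, such as in an out of memory condition, so this is a strong hint rather than a hard guarantee. `criticalSection` is a made-up name.

```d
import core.memory : GC;

// Hypothetical critical section: works only on memory it is handed.
void criticalSection(int[] scratch)
{
    foreach (i, ref x; scratch)
        x = cast(int) i * 2;
}

void main()
{
    // Preallocate everything the critical section needs.
    auto scratch = new int[](1024);

    {
        GC.disable();             // suppress automatic collections
        scope (exit) GC.enable(); // restored on every exit path, even on a throw

        criticalSection(scratch);
    }

    assert(scratch[10] == 20);
}
```

Calls to disable and enable are counted, so each `GC.disable()` needs its own matching `GC.enable()`; the `scope (exit)` keeps them paired.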

However, I would point out, as I said before, that on today's systems, odds are that you will get properly realtime performance in spite of the GC and any collection cycles that it runs. D's GC being more primitive may not do as good a job with that as others - like Java's or .NET's - but I'm not sure that you want to complicate your program worrying about the GC unless you profile it appropriately and find out that you need to. Certainly, as D matures, it should become less of an issue.

- Jonathan M Davis