Thread overview
Heap fragmentation - can it be helped?

Feb 14, 2006  Nick
Feb 14, 2006  nick
Feb 14, 2006  Sean Kelly
Feb 14, 2006  Sean Kelly
Feb 16, 2006  Nick
February 14, 2006
I'm currently working on a project where I read in huge amounts of data sequentially, resulting in a lot of temporary memory allocations. To avoid gobbling up the entire system memory, I thought I should delete all the data manually whenever it was no longer needed, but to my surprise it didn't help one bit. I think this is caused by heap fragmentation: the GC fails to find a large enough contiguous free block, even when enough total memory is available.

Here's an example of what I mean.

# import std.stdio;
# import std.gc;
#
# void poolSize()
# {
#   GCStats gc;
#   getStats(gc);
#   writefln("Pool size: ", gc.poolsize);
# }
#
# void main()
# {
#   int[][] allocs;
#
#   // Make some allocations: two big arrays with a small one in between
#
#   allocs ~= new int[60000];
#
#   allocs ~= new int[20]; // Comment out this line to get the second result below
#
#   allocs ~= new int[60000];
#
#   poolSize();
#
#   // Delete all the arrays
#   foreach(int[] i; allocs)
#     delete i;
#
#   int[] a = new int[100000]; // needs 400,000 contiguous bytes
#
#   poolSize();
#
# }

The output is:
Pool size: 589824
Pool size: 1048576

which means the GC could not find a large enough contiguous free block for the last array (100,000 ints is 400,000 bytes) and had to grow the total memory pool from 576 KiB to 1 MiB. If I remove the middle allocation, I get
Pool size: 589824
Pool size: 589824

and everything works as it should.

What happens in the first case? Is it a failure of the DMD heap algorithm, or is it unavoidable? (I don't think it should be.)

Nick


February 14, 2006
There is a concept of "regions" in memory allocation which may help; a sketch follows below. Also, try reading this paper ("Reconsidering Custom Memory Allocation" by Berger, Zorn, and McKinley, OOPSLA 2002): http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf

As far as specific D solutions - I don't know.
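Roughly, a region grabs one big block up front and hands out pieces by bumping an offset; everything is freed in one shot at the end, so the per-item allocations never touch (or fragment) the general heap. A minimal sketch of the idea in D (the Region type and function names are invented for illustration, not a real library):

# import std.c.stdlib; // malloc/free
#
# // A toy region: carve pieces out of one big block by bumping an
# // offset, then release the whole block at once.
# // (Ignores alignment for brevity.)
# struct Region
# {
#   byte* base;
#   size_t used;
#   size_t capacity;
# }
#
# Region regionCreate(size_t bytes)
# {
#   Region r;
#   r.base = cast(byte*) malloc(bytes);
#   r.capacity = bytes;
#   r.used = 0;
#   return r;
# }
#
# void* regionAlloc(inout Region r, size_t bytes)
# {
#   if (r.used + bytes > r.capacity)
#     return null; // full; a real region would chain on another block
#   void* p = r.base + r.used;
#   r.used += bytes;
#   return p;
# }
#
# void regionRelease(inout Region r) // frees everything in one call
# {
#   free(r.base);
#   r.base = null;
#   r.used = r.capacity = 0;
# }

A sequential read loop could take its temporaries from regionAlloc and call regionRelease after each pass, so the GC heap never sees the allocation churn at all.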


February 14, 2006
Nick wrote:
> 
> What happens in the first case? Is it a failure of the DMD heap algorithm, or is
> it unavoidable? (I don't think it should be.)

Heap fragmentation is definitely avoidable.  In fact, I read a research paper that considers heap fragmentation a "solved problem."  It's mostly a matter of the algorithm used, and the popular approach taken by Doug Lea's allocator was found to be one of the best strategies.  For reference, see:

http://www.cs.utexas.edu/users/oops/papers.html

Papers:

"Non-Compacting Memory Allocation and Real-Time Garbage Collection"
"The Memory Fragmentation Problem: Solved?"


Sean
February 14, 2006
Sean Kelly wrote:
> http://www.cs.utexas.edu/users/oops/papers.html
> 
> Papers:
> 
> "Non-Compacting Memory Allocation and Real-Time Garbage Collection"
> "The Memory Fragmentation Problem: Solved?"

Oops, the link to the second paper on that site is broken.  Try here:

http://citeseer.ist.psu.edu/johnstone97memory.html


Sean
February 16, 2006
Thanks for the hints, guys. But I found out that my memory problems were caused by a completely unrelated part of the program, so the DMD GC algorithm is mostly in the clear (though I still think it should perform better in the example I gave).

Nick
