Thread overview
Memory issues. GC not giving back memory to OS?
Apr 21, 2020  Cristian Becerescu
Apr 21, 2020  Jonathan M Davis
Apr 22, 2020  Arafel
Apr 22, 2020  welkam
Apr 22, 2020  ikod
April 21, 2020
Hi!

A little bit of context first:

I was using DPP and I noticed huge amounts of RAM being used.
So I used valgrind massif and found out that 98% of the process’ memory (~6GB) was allocated for arrays / Appender with mmap.

I then performed a simple test where I incrementally appended 2^30 integers (4GB) to a dynamic array (memory measurements are the same for Appender).
-> Memory used (peak; increasing towards the end of execution): ~7GB
-> capacity == 1.107 * size (at the end of the program)

This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the peak memory consumption was 7GB. Apparently, the GC can correctly collect the memory when manually calling collect() at the end of appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is never given back to the system. At least this is our intuition after making those observations.

I have created a gist with the test code and results (thanks Edi for augmenting the test code to profile the GC): https://gist.github.com/cbecerescu/e6606a8530c56ae06c52e5b1cd32b31f
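In essence, the test is just repeated appending, roughly like this (simplified sketch; the gist has the full, instrumented version with GC stats):

import std.stdio : writefln;
import core.memory : GC;

void main()
{
    int[] arr;
    foreach (i; 0 .. (1 << 30))      // append 2^30 ints, ~4GB of payload
        arr ~= i;

    writefln("length = %s, capacity = %s", arr.length, arr.capacity);

    GC.collect();                    // reclaims the intermediate blocks left behind while growing,
                                     // but those pages are not necessarily returned to the OS
}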

Just some notes:
- if reserving 2^30 elements for the array (or Appender) beforehand, memory peaks are at 4GB
- C++'s std::vector, without reservation, never gets beyond 4GB and has size == capacity at the end
April 21, 2020
On Tuesday, April 21, 2020 12:31:28 PM MDT Cristian Becerescu via Digitalmars-d wrote:
> This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the peak memory consumption was 7GB. Apparently, the GC can correctly collect the memory when manually calling collect() at the end of appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is never given back to the system. At least this is our intuition after making those observations.

It is my understanding that under normal circumstances, the GC will never return memory to the OS until the program terminates but rather will just keep it around to reuse when more memory needs to be allocated. However, the documentation for core.memory's GC.minimize says that it will return free memory to the OS. So, if you need memory to be returned to the OS while the program is running, you'll probably need to use that.
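For example (a minimal sketch, assuming you just want to hand back whatever is free after a collection):

import core.memory : GC;

void main()
{
    auto buf = new int[](1 << 28);   // a large allocation on the GC heap
    buf = null;                      // drop the last reference

    GC.collect();                    // the block is now free inside the GC
    GC.minimize();                   // ask the GC to return free pools to the OS where it can
}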

- Jonathan M Davis



April 21, 2020
On 4/21/20 2:31 PM, Cristian Becerescu wrote:

> I then performed a simple test where I incrementally appended 2^30 integers (4GB) to a dynamic array (memory measurements are the same for Appender).
> -> Memory used (peak; increasing towards the end of execution): ~7GB
> -> capacity == 1.107 * size (at the end of the program)
> 
> This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the peak memory consumption was 7GB. Apparently, the GC can correctly collect the memory when manually calling collect() at the end of appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is never given back to the system. At least this is our intuition after making those observations.

The GC doesn't automatically give memory back to the OS. And it really can't, in general. There's a GC.minimize function, but it will only release the free memory that can actually be handed back. That depends heavily on the GC implementation and on the mechanism the OS provides for acquiring memory.

So for example, if all the "free" memory is in the middle of the OS-provided memory segment, then it can't give it back.

> 
> I have created a gist with the test code and results (thanks Edi for augmenting the test code to profile the GC): https://gist.github.com/cbecerescu/e6606a8530c56ae06c52e5b1cd32b31f
> 
> Just some notes:
> - if reserving 2^30 elements for the array (or Appender) beforehand, memory peaks are at 4GB

Right, because it will never reallocate, it just grows within the original memory block. This is what I'd recommend for something like this.

If you don't reserve, then as it grows, it needs a bigger and bigger segment.

And it's not always going to reuse memory that you already used on your way up. Why? Because it can't get a contiguous segment that is free and fits the new requirement. It does try extending in-place if it can, but once it can't, that memory is not usable because the segment is too small to fit your massive data.
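A minimal sketch of the reserve approach (my own illustration, not the code from the gist):

import std.array : appender;

void main()
{
    enum n = 1 << 30;

    int[] arr;
    arr.reserve(n);                  // one up-front allocation; appends grow within that block
    foreach (i; 0 .. n)
        arr ~= i;

    // The same idea with Appender:
    auto app = appender!(int[])();
    app.reserve(n);
}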

But I'd say that the stats you are printing are a bit puzzling. Why does it all of a sudden allow you to collect at the end when it didn't before? It does seem like your output doesn't match your example code. But there are a number of reasons why the GC may not do what you are expecting, including possible bugs in the GC.

> - C++'s std::vector, without reservation, never gets beyond 4GB and has size == capacity at the end

C++ frees the original memory immediately when growing, so it's going to be more memory efficient. You are never going to match the space efficiency of manual memory management with a GC.

-Steve
April 22, 2020
On Tuesday, 21 April 2020 at 18:31:28 UTC, Cristian Becerescu wrote:
> Hi!
>
> A little bit of context first:
>
> I was using DPP and I noticed huge amounts of RAM being used.
> So I used valgrind massif and found out that 98% of the process’ memory (~6GB) was allocated for arrays / Appender with mmap.
>
> I then performed a simple test where I incrementally appended 2^30 integers (4GB) to a dynamic array (memory measurements are the same for Appender).
> -> Memory used (peak; increasing towards the end of execution): ~7GB
> -> capacity == 1.107 * size (at the end of the program)
>
> This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the peak memory consumption was 7GB. Apparently, the GC can correctly collect the memory when manually calling collect() at the end of appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is never given back to the system. At least this is our intuition after making those observations.
>
> I have created a gist with the test code and results (thanks Edi for augmenting the test code to profile the GC): https://gist.github.com/cbecerescu/e6606a8530c56ae06c52e5b1cd32b31f
>
> Just some notes:
> - if reserving 2^30 elements for the array (or Appender) beforehand, memory peaks are at 4GB
> - C++'s std::vector, without reservation, never gets beyond 4GB and has size == capacity at the end

IMHO this happens because each time a larger contiguous memory region is requested, the runtime has to allocate (or reallocate) a larger piece of memory (at higher addresses), copy the old content, and then release the old piece. But the old piece can't be returned to the OS, since the heap area can only be released from the top.
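A small sketch (my own illustration, not from the gist) that prints each time the array's backing block actually moves while growing:

import std.stdio : writefln;

void main()
{
    int[] arr;
    auto prev = arr.ptr;
    foreach (i; 0 .. 50_000_000)
    {
        arr ~= i;
        if (arr.ptr !is prev)        // the block moved: the data was copied to a new, larger block
        {
            writefln("reallocated at length %s, capacity %s", arr.length, arr.capacity);
            prev = arr.ptr;
        }
    }
}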
April 22, 2020
On Tuesday, 21 April 2020 at 20:29:37 UTC, Steven Schveighoffer wrote:
> 
> C++ frees the original memory immediately when growing, so it's going to be more memory efficient. You are never going to match the space efficiency of manual memory management with a GC.
>

How much of Phobos is betterC compatible?

I encountered the same issues with the GC a couple of years ago and abandoned our plans to migrate from C++ to D for one of our core products. (I'm not encouraging anyone to do the same; do your own analysis and make your own decision.)

To see recent posts about chasing Rust with @live while all this existing baggage remains... Hmm... Don't know what to say... This might excite a PL theorist/researcher, but not a programmer who can't get his app to work in the most basic form...

Walter, memory efficiency first please, arcane safety later.

--
If you don't have anything nice to say, don't say anything at all.
April 22, 2020
On 21/4/20 22:23, Jonathan M Davis wrote:
> On Tuesday, April 21, 2020 12:31:28 PM MDT Cristian Becerescu via
> Digitalmars-d wrote:
>> This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the
>> peak memory consumption was 7GB. Apparently, the GC can correctly
>> collect the memory when manually calling collect() at the end of
>> appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is
>> never given back to the system. At least this is our intuition
>> after making those observations.
> 
> It is my understanding that under normal circumstances, the GC will never
> return memory to the OS until the program terminates but rather will just
> keep it around to reuse when more memory needs to be allocated. However, the
> documentation for core.memory's GC.minimize says that it will return free
> memory to the OS. So, if you need memory to be returned to the OS while the
> program is running, you'll probably need to use that.
> 
> - Jonathan M Davis
> 
> 
> 

I had a similar issue some time ago, and found that the memory wouldn't be returned to the OS even after the GC had freed it. I had to call malloc_trim [1] manually; this seems to be a libc / OS issue (I'm exclusively using Linux, so I don't know whether this also affects Windows or Mac).
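Something along these lines (a sketch only; malloc_trim is glibc-specific, so this assumes a Linux/glibc build):

extern (C) int malloc_trim(size_t pad);

void main()
{
    import core.memory : GC;

    GC.collect();       // let the GC free whatever it can first
    malloc_trim(0);     // then ask glibc to return free heap pages to the OS
}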

Could this be also happening here?

A.

[1]: http://man7.org/linux/man-pages/man3/malloc_trim.3.html
April 22, 2020
On Wednesday, 22 April 2020 at 07:25:34 UTC, Arun Chandrasekaran wrote:
> Walter, memory efficiency first please, arcane safety later.

You can do everything in D that you can do in C++ when it comes to memory management. Also, a good system that tracks pointers can be used to turn GC allocations into malloc/free pairs, and some allocations can be turned into stack allocations (LLVM does some of that). Safety features can be used as performance features with some additional work.
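For example, nothing stops you from managing memory manually, just like in C++ (a minimal sketch, no GC involved):

import core.stdc.stdlib : malloc, free;

@nogc nothrow void main()
{
    auto p = cast(int*) malloc(1024 * int.sizeof);
    if (p is null) return;
    scope (exit) free(p);            // deterministic release, as in C/C++

    foreach (i; 0 .. 1024)
        p[i] = i;
}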
April 22, 2020
On Wednesday, 22 April 2020 at 13:13:29 UTC, welkam wrote:
> On Wednesday, 22 April 2020 at 07:25:34 UTC, Arun Chandrasekaran wrote:
>> Walter, memory efficiency first please, arcane safety later.
>
> You can do everything in D that you can do in C++ when it comes to memory management.

We can do the same with Java as well, use JNI, manual memory management, etc. But will we?

So when I say "we can't" it doesn't mean technically we can't. It is just that the alternatives are better than what's being offered in D.