November 07, 2012
Manually freeing up memory
Hello all,

I'm doing some work with a fairly large dataset.  For various reasons it's 
convenient to import it first as simply an array of data points which is then 
used to generate other data structures (actually, technically it's an array of 
data points plus a couple of associative arrays, which ideally would instead be 
sets; but I think that's a minor detail).

Once the various data structures are in place, it's possible to discard the 
initial array data.  It would be very desirable to free up the memory allocated, 
as it's a very large amount.  However, I can't work out how to do this.

I've tried calling destroy() on the input data, with and without a subsequent 
GC.collect(), but the program's memory usage still remains at its peak level. 
This is a shame, because that peak memory usage only needs to last for a short 
part of the program's total runtime, and it seems only polite to other computer 
users to give back the excess memory.

Can anyone advise?  I would rather not disable the GC entirely as there's lots 
of Phobos I want to be able to use -- but I'd really like it if I could indicate 
categorically to the GC, "these objects and arrays need to be deleted and the 
memory freed _now_".

Thanks and best wishes,

      -- Joe
November 07, 2012
Re: Manually freeing up memory
Joseph Rushton Wakeling:

> Can anyone advise?  I would rather not disable the GC entirely 
> as there's lots of Phobos I want to be able to use -- but I'd 
> really like it if I could indicate categorically to the GC, 
> "these objects and arrays need to be deleted and the memory 
> freed _now_".

One solution is to allocate the original array on the C heap. 
Another solution is to allocate it normally from the GC heap and 
then use GC.free().

Maybe a third option is to use a memory-mapped file for the first 
array.

Bye,
bearophile
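A minimal sketch of the first two options, assuming the raw input is just an array of size_t values (the function names and the summing step are hypothetical stand-ins for the real loading and conversion code). GC.free is documented to ignore pointers into the interior of a block, and for large arrays the runtime may keep bookkeeping ahead of the elements, so the sketch passes the pointer through GC.addrOf first; treat that detail as an assumption about the runtime's array layout rather than a confirmed diagnosis.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Option 1: keep the temporary array on the C heap, outside the GC.
/// Hypothetical helper; n must be greater than zero.
size_t sumViaCHeap(size_t n)
{
    auto p = cast(size_t*) malloc(n * size_t.sizeof);
    if (p is null) assert(0, "malloc failed");
    scope (exit) free(p);            // released when the function returns

    size_t[] raw = p[0 .. n];        // slice over the malloc'd block
    foreach (i, ref x; raw) x = i;   // stand-in for loading the data

    size_t total = 0;
    foreach (x; raw) total += x;     // stand-in for building derived data
    return total;
}

/// Option 2: allocate from the GC heap, then release the block explicitly.
size_t sumViaGCFree(size_t n)
{
    auto raw = new size_t[](n);
    foreach (i, ref x; raw) x = i;

    size_t total = 0;
    foreach (x; raw) total += x;

    // No other slice may still point into raw at this point.
    GC.free(GC.addrOf(raw.ptr));     // addrOf yields the base of the block
    raw = null;
    return total;
}

void main()
{
    assert(sumViaCHeap(1_000) == 499_500);
    assert(sumViaGCFree(1_000) == 499_500);
}
```

Either way, the derived data structures must hold copies and not slices of the temporary array, otherwise they are left pointing at freed memory.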
November 07, 2012
Re: Manually freeing up memory
On 11/07/2012 03:17 PM, bearophile wrote:
> One solution is to allocate the original array on the C heap. Another solution
> is to allocate it normally from the GC heap and then use GC.free().

Well, what I've got is something like this:

      auto raw = rawInput();      /* loads data and outputs a struct containing
                                     the array of data */
      auto data = rawToData(raw); // converts the raw input to data structure
      GC.free(raw.links.ptr);     // _should_ free up the allocated memory?

... but despite the GC.free(), memory usage stays at peak level for the rest of 
the runtime of the function.

I also tried preceding the free() with destroy(raw) or destroy(raw.links), 
to no avail.

> Maybe a third option is to use a memory-mapped file for the first array.

That's an interesting thought, which I'll look into.  Another thought was to 
dump the data into an SQL DB and read/sample from there as necessary, but IIRC 
the SQL support available for D is somewhat limited right now ... ?
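For the memory-mapped file option, a rough sketch using Phobos' std.mmfile might look like the following. It assumes the raw data is stored on disk as a flat sequence of size_t values; the file name and the summing step are hypothetical. The mapping lives outside the GC heap and is released when the MmFile object is destroyed.

```d
import std.file : remove, write;
import std.mmfile : MmFile;

/// Maps the file read-only, reduces its contents, and unmaps it again.
size_t sumMapped(string path)
{
    auto mmf = new MmFile(path);              // read-only mapping of the file
    scope (exit) destroy(mmf);                // unmaps and closes the file

    // View the mapped bytes as size_t values (file length must be a
    // multiple of size_t.sizeof).
    auto raw = cast(const(size_t)[]) mmf[];

    size_t total = 0;
    foreach (x; raw) total += x;              // stand-in for the real conversion
    return total;
}

void main()
{
    enum path = "raw_links.bin";              // hypothetical file name
    size_t[] sample = [1, 2, 3, 4];
    write(path, sample);                      // create a small test file
    scope (exit) remove(path);

    assert(sumMapped(path) == 10);
}
```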
November 07, 2012
Re: Manually freeing up memory
Joseph Rushton Wakeling:

> ... but despite the GC.free(), memory usage stays at peak level 
> for the rest of the runtime of the function.

GC.free() usually works. Some memory allocators never give memory back 
to the OS until the process ends, even though that memory is free for 
the process to reuse in other ways (this is what often happens in 
Python on Windows).

If I am right, then if you try to allocate memory from the same 
program after GC.free() the total memory used by that process 
will not increase.

Bye,
bearophile
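One way to check this claim is to compare the size of the GC heap after a first large allocation with its size after freeing that block and allocating the same amount again. The sketch below assumes a druntime recent enough to provide GC.stats(); the helper names are hypothetical, and the numbers it prints will depend on compiler and platform.

```d
import core.memory : GC;

/// Total size of the GC heap (used plus free), as reported by GC.stats.
size_t heapSize()
{
    auto s = GC.stats();
    return s.usedSize + s.freeSize;
}

/// Allocates `bytes`, frees the block, allocates the same amount again,
/// and returns the heap size after the first and second allocation.
size_t[2] allocateTwice(size_t bytes)
{
    auto first = new ubyte[](bytes);
    first[] = 1;                            // touch the pages
    size_t afterFirst = heapSize();

    GC.free(GC.addrOf(first.ptr));          // free wants the block's base address
    first = null;

    auto second = new ubyte[](bytes);       // may reuse the freed space
    second[] = 2;
    size_t afterSecond = heapSize();

    size_t[2] result = [afterFirst, afterSecond];
    return result;
}

void main()
{
    import std.stdio : writefln;

    auto sizes = allocateTwice(64 * 1024 * 1024);
    writefln("heap after first allocation:  %s bytes", sizes[0]);
    writefln("heap after second allocation: %s bytes", sizes[1]);
}
```

If the two figures are about the same, the freed block was reused by the process even though the OS still counts it against the program.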
November 07, 2012
Re: Manually freeing up memory
On Wed, Nov 07, 2012 at 06:12:52PM +0100, bearophile wrote:
> Joseph Rushton Wakeling:
> 
> >... but despite the GC.free(), memory usage stays at peak level
> >for the rest of the runtime of the function.
> 
> GC.free() usually works. Some memory allocators never give memory
> back to the OS until the process ends, even though that memory is
> free for the process to reuse in other ways (this is what often
> happens in Python on Windows).
[...]

I think on Posix systems, malloc/free does not return freed memory back
to the OS, it just gets reused by the process later on.

If you want to return memory back to the OS, you could call sbrk()...
but that is highly *NOT* recommended unless you know exactly what you're
doing, and you know the innards of your C library (*and* D runtime) like
the back of your hand. But it *is* the "hardcore" way of doing it. :-)

An easier workaround might be to fork() a child process that constructs
whatever data structures you need, transmits them to the main process
somehow, and then exits. If I understand it correctly, the large memory
allocations will be confined to the child process, and that memory is
returned to the OS once the child exits. (Note that you have to use fork(), not
threads, because threads share memory in the same process so you end up
with the same problem.)


T

-- 
Question authority. Don't ask why, just do it.
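A rough, POSIX-only sketch of the fork() idea: the child allocates the large temporary array, reduces it to a compact result, and writes that result down a pipe before exiting. Here the "result" is just a sum, a hypothetical stand-in for the real derived data structures, which would need some serialization format of their own.

```d
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : _exit, close, fork, pipe, read, write;

/// Runs the memory-hungry step in a child process and returns its result.
ulong buildInChild(size_t n)
{
    int[2] fds;
    if (pipe(fds) != 0) assert(0, "pipe failed");

    auto pid = fork();
    if (pid < 0) assert(0, "fork failed");

    if (pid == 0)
    {
        // Child: the large array only ever exists in this process.
        close(fds[0]);
        auto raw = new ulong[](n);
        foreach (i, ref x; raw) x = i;      // stand-in for loading the data

        ulong total = 0;
        foreach (x; raw) total += x;        // stand-in for the conversion

        write(fds[1], &total, total.sizeof);
        close(fds[1]);
        _exit(0);                           // child's memory returns to the OS
    }

    // Parent: only receives the compact result.
    close(fds[1]);
    ulong result;
    auto got = read(fds[0], &result, result.sizeof);
    close(fds[0]);

    int status;
    waitpid(pid, &status, 0);               // reap the child
    if (got != result.sizeof) assert(0, "short read from child");
    return result;
}

void main()
{
    assert(buildInChild(1_000) == 499_500);
}
```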
November 07, 2012
Re: Manually freeing up memory
On 11/07/2012 06:53 PM, H. S. Teoh wrote:
> I think on Posix systems, malloc/free does not return freed memory back
> to the OS, it just gets reused by the process later on.

I have to say that in this program, it looks like the memory usage keeps 
increasing even after the free(), even though theoretically the amount it's 
possible to free up would dwarf any subsequent memory requirements.

Calling GC.minimize() seems to return only a little memory to the OS, 
depending on which compiler is used, but nowhere near the amount it's 
theoretically possible to hand back.

> An easier workaround might be to fork() a child process that constructs
> whatever data structures you need, transmits them to the main process
> somehow, and then exits. If I understand it correctly, the large memory
> allocations will be confined to the child process, and that memory is
> returned to the OS once the child exits. (Note that you have to use fork(), not
> threads, because threads share memory in the same process so you end up
> with the same problem.)

Nice thought!  I'll have a look at doing this.
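For reference, the core.memory call that asks the collector to hand unused pages back to the OS is GC.minimize(). A small sketch of measuring its effect follows, assuming a druntime that provides GC.stats(); the helper name is hypothetical, and how much memory actually goes back depends on whether whole pools are free.

```d
import core.memory : GC;

/// Frees a large GC block, calls GC.minimize(), and returns the collector's
/// reported free size before and after the minimize call.
size_t[2] freeThenMinimize(size_t bytes)
{
    auto raw = new ubyte[](bytes);
    raw[] = 1;                              // touch the pages
    GC.free(GC.addrOf(raw.ptr));            // release the block to the GC
    raw = null;

    size_t before = GC.stats().freeSize;    // free memory still held by the GC
    GC.minimize();                          // ask the GC to return free pools
    size_t after = GC.stats().freeSize;

    size_t[2] result = [before, after];
    return result;
}

void main()
{
    import std.stdio : writefln;

    auto r = freeThenMinimize(64 * 1024 * 1024);
    writefln("GC free size before minimize: %s bytes", r[0]);
    writefln("GC free size after minimize:  %s bytes", r[1]);
}
```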
November 08, 2012
Re: Manually freeing up memory
Am Wed, 07 Nov 2012 19:56:35 +0100
schrieb Joseph Rushton Wakeling <joseph.wakeling@webdrake.net>:

> On 11/07/2012 06:53 PM, H. S. Teoh wrote:
> > I think on Posix systems, malloc/free does not return freed memory back
> > to the OS, it just gets reused by the process later on.
> 
> I have to say that in this program, it looks like the memory usage keeps 
> increasing even after the free(), even though theoretically the amount it's 
> possible to free up would dwarf any subsequent memory requirements.

Could it be that you still hold a reference to the raw memory
in your data structures ? A slice would be a typical candidate:
s.name = raw[a .. b];
You probably checked that already...

-- 
Marco
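To illustrate the concern: a slice shares the memory of the array it was taken from, so storing one in a long-lived structure keeps the whole original block reachable, while a .dup copy does not. The struct and helper names below are hypothetical.

```d
struct Item
{
    int[] name;
}

/// True if `part` starts inside the memory occupied by `whole`.
bool pointsInto(const(int)[] part, const(int)[] whole)
{
    return part.ptr >= whole.ptr && part.ptr < whole.ptr + whole.length;
}

void main()
{
    auto raw = new int[](1_000);
    foreach (i, ref x; raw) x = cast(int) i;

    Item aliased, copied;
    aliased.name = raw[10 .. 20];        // shares raw's memory block
    copied.name  = raw[10 .. 20].dup;    // owns a fresh 10-element block

    assert(pointsInto(aliased.name, raw));
    assert(!pointsInto(copied.name, raw));

    // Writing through raw is visible through the slice, not the copy.
    raw[10] = -1;
    assert(aliased.name[0] == -1);
    assert(copied.name[0] == 10);
}
```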
November 08, 2012
Re: Manually freeing up memory
On 11/08/2012 05:50 AM, Marco Leise wrote:
> Could it be that you still hold a reference to the raw memory
> in your data structures ? A slice would be a typical candidate:
> s.name = raw[a .. b];
> You probably checked that already...

I don't _think_ so, although there is a point where data is passed to another 
struct something like this:

  foreach(link; raw.links)   // raw is struct, links is array
      data.add(link.expand); // each entry in links is a Tuple!(size_t, size_t)

where add() takes as input a pair of size_t's.  I assumed the values here would 
be copied.  I've tried tweaking it to take out the link.expand and it makes no 
difference.
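A small check of that assumption, with a hypothetical Data struct standing in for the real one: expanding a Tuple!(size_t, size_t) into two size_t parameters passes plain values, so nothing stored by add() refers back to the links array.

```d
import std.typecons : Tuple;

alias Link = Tuple!(size_t, size_t);

/// Hypothetical stand-in for the real data structure.
struct Data
{
    size_t[] heads;
    size_t[] tails;

    void add(size_t head, size_t tail)
    {
        heads ~= head;                   // values are copied into Data's arrays
        tails ~= tail;
    }
}

/// Copies every link into a fresh Data, as in the loop above.
Data copyLinks(Link[] links)
{
    Data data;
    foreach (link; links)
        data.add(link.expand);           // expands to two size_t arguments
    return data;
}

void main()
{
    Link[] links = [Link(1, 2), Link(3, 4)];
    auto data = copyLinks(links);

    links[0][0] = 99;                    // change the original afterwards
    assert(data.heads[0] == 1);          // the stored copy is unaffected
    assert(data.tails[1] == 4);
}
```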
November 10, 2012
Re: Manually freeing up memory
On Thursday, 8 November 2012 at 04:51:00 UTC, Marco Leise wrote:
> Could it be that you still hold a reference to the raw memory
> in your data structures ? A slice would be a typical candidate:

Good point. I find that with GC'd memory, you have to diligently 
keep track of where your references live and when they go away, 
to ensure that no stray reference keeps memory alive by 
mistake.

I find that apps built with GC languages like Java tend to suffer 
from severe memory leak issues, perhaps due to persistently 
referenced memory that the programmer is unaware of.

I come from a C++ background, so I am painfully aware of why I 
cannot lower my guard just because there's a GC kicking about. In 
fact, I find myself much more concerned than ever, because I'm 
never certain when the GC will kick in, or whether it will do the job 
correctly, and so forth.


--rt