Jump to page: 1 2
Thread overview
[Issue 2105] New: Manual Memory Management for Associative Arrays
May 14, 2008
d-bugmail
May 17, 2008
d-bugmail
May 17, 2008
d-bugmail
May 19, 2008
d-bugmail
May 19, 2008
d-bugmail
May 19, 2008
d-bugmail
May 19, 2008
d-bugmail
Jul 03, 2008
d-bugmail
Sep 07, 2008
d-bugmail
Sep 07, 2008
d-bugmail
Sep 11, 2008
d-bugmail
Dec 25, 2008
d-bugmail
Jan 02, 2009
d-bugmail
Jan 13, 2009
d-bugmail
Aug 27, 2011
David Simcha
May 14, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105

           Summary: Manual Memory Management for Associative Arrays
           Product: D
           Version: 2.013
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: dsimcha@yahoo.com


Due to the nature of conservative garbage collection, very large dynamic and associative arrays are often not freed properly by the Phobos garbage collector.  This is due to the high probability of a bit pattern on the stack, when interpreted as a pointer, pointing to heap space occupied by these structures.  In the case of dynamic arrays, this can be worked around by simply deleting the very large array manually:

uint[] foo=new uint[100_000_000];
//stuff
delete foo;

However, if foo is an associative array, this cannot be done.

uint[string] foo;
//Put a large amount of data into array.
delete foo;  //Compile time error.

Furthermore, looping through such an array, such as:

foreach(i, bar; foo) {
    foo.remove(i);
}

fails to prevent memory leaks.  What is needed is a method of manually deleting associative arrays that assumes that does not rely on the garbage collector.


-- 

May 17, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105





------- Comment #1 from mmoncure@gmail.com  2008-05-17 10:45 -------
You can declare the array 'scope'...does this help?


-- 

May 17, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105





------- Comment #2 from dsimcha@yahoo.com  2008-05-17 12:25 -------
Good idea, but surprisingly, no, declaring the array as scope does not solve the problem for either the dynamic or the associative array.  I honestly have no idea why.


-- 

May 19, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105





------- Comment #3 from mmoncure@gmail.com  2008-05-19 07:52 -------
(In reply to comment #2)
> Good idea, but surprisingly, no, declaring the array as scope does not solve the problem for either the dynamic or the associative array.  I honestly have no idea why.

Is it possible to get a short program demonstrating the issue?


-- 

May 19, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105


dsimcha@yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|2.013                       |2.014




------- Comment #4 from dsimcha@yahoo.com  2008-05-19 11:20 -------
There really isn't much to it, but here's basically the canonical example:

import std.gc, std.stdio;

void main () {
    uint count;
    while(true) {
        test();
        fullCollect();
        writefln(++count);
    }
    return;
}

void test() {
    scope uint[uint] foo;
    for(uint i=0; i < 10_000_000; i++) {
        foo[i]=i;
    }
    return;
}

Basically, the only reference to foo goes out of scope on every cycle through the main loop, and should obviously be freed.  However, with each iteration, this program will use progressively more memory, even when fullCollect() is called explicitly.  This also happens with very large dynamic arrays, but this is less of a problem, since, at least for the simple cases, it's easy enough to just free them manually using the delete keyword.

Also note that I have tested this on 2.014 and it still happens, so I changed the version number.


-- 

May 19, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105





------- Comment #5 from mmoncure@gmail.com  2008-05-19 13:47 -------
(In reply to comment #4)
> There really isn't much to it, but here's basically the canonical example:
> 
> import std.gc, std.stdio;
> 
> void main () {
>     uint count;
>     while(true) {
>         test();
>         fullCollect();
>         writefln(++count);
>     }
>     return;
> }
> 
> void test() {
>     scope uint[uint] foo;
>     for(uint i=0; i < 10_000_000; i++) {
>         foo[i]=i;
>     }
>     return;
> }
> 
> Basically, the only reference to foo goes out of scope on every cycle through the main loop, and should obviously be freed.  However, with each iteration, this program will use progressively more memory, even when fullCollect() is called explicitly.  This also happens with very large dynamic arrays, but this is less of a problem, since, at least for the simple cases, it's easy enough to just free them manually using the delete keyword.
> 
> Also note that I have tested this on 2.014 and it still happens, so I changed the version number.

I tested the above program with both gdc v0.25 and dmd v1.028 on linux.  In both cases the memory increased until the first insert loop completed and then reached steady state (I dropped the iterations down a bit to make it more obvious).  When I injected a sleep(1) after the gc collect, memory usage drop back to baseline was noticeable in top with a low refresh interval.

Maybe this is a 2.0 bug??

merlin



-- 

May 19, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105





------- Comment #6 from dsimcha@yahoo.com  2008-05-19 16:06 -------
If by dropping the iterations down, you mean changing the loop counter in test(), which controls the array size, from 10_000_000 to a smaller number, this misses the point of this bug.  It only happens on very large arrays.  From the informal testing I've done, it seems that size in bytes, not size in elements, is what matters.  This also makes sense in terms of the tentative diagnosis of this as a conservative GC issue.  If I drop the array size down a few notches to 5_000_000, the memory size reaches steady state as you describe, using GDC 2.014.

Also, I tried this code on GDC 0.24 (32-bit) with an array size of 20_000_000 and it seems to work there, even without the scope keyword on the Linux cluster I'm working on now.  However, I had tried a similar version of this with a GDC and a dynamic array instead of an associative on my Windows box a while back and array sizes over about 13_000_000 broke it.  Don't know why this inconsistency exists.


-- 

July 03, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105


davidl@126.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|DMD                         |Phobos




------- Comment #7 from davidl@126.com  2008-07-02 22:11 -------
It's a phobos related issue.
and also following func possibly need a fix.
in _aaRehash func :
                len = prime_list[i];
                newb.b = new aaA*[len];

                foreach (e; aa.b)
                {
                    if (e)
                        _aaRehash_x(e);
                }

                newb.nodes = aa.nodes;
            }

            *paa.a = newb; // here the code doesn't delete the old BB struct.
where it should delete the old BB struct


-- 

September 07, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105





------- Comment #8 from dsimcha@yahoo.com  2008-09-07 18:22 -------
Here's a possible fix.  A one-line patch that just deletes the old array of pointers to AA structs when rehashing is needed in aaA.d.  I will attach it. Once that is added, the following test case works without memory consumption going up at all on successive iterations, even with the GC disabled.  I am not fully confident that this patch doesn't break other stuff, since I don't know exactly what the Phobos unit tests cover and don't cover, but I did run the Phobos unit tests with this patch in place and they all passed.

IMHO, especially while GC is partially conservative, it is essential that builtin types be usable without GC as much as possible.  This certainly seems feasible in the case of AAs.

import std.gc, std.stdio;

void main () {
    std.gc.disable;
    while(true) {
        test;
        printGCStats;
    }
}

void test() {
    uint[uint] foo;
    for(uint i=0; i < 20_000_000; i++) {
        foo[i]=i;
    }
    deleteAA(foo);
}

void printGCStats() {
    std.gc.GCStats stats;
    std.gc.getStats(stats);
    foreach(s; stats.tupleof)
        write(s, "\t");
    writeln();
}

//These structs are copied and pasted from aaA.d.
struct aaA
{
    aaA *left;
    aaA *right;
    hash_t hash;
    /* key   */
    /* value */
}

struct BB
{
    aaA*[] b;
    size_t nodes;       // total number of aaA nodes
}

struct AA
{
    BB* a;
}

//Could be included in the Phobos runtime
//and called when a delete AA; line is
//encountered.  Might also be called
//automatically if AA is allocated as scope.

void deleteAA(T, U)(T[U] input) {
    auto aaPtr = cast(BB*) cast(void*) input;
    deleteBB(aaPtr);
}

void deleteBB(BB* aaPtr) {

    void rdelAA(aaA* input) {
        if(input.left !is null)
            rdelAA(input.left);
        if(input.right !is null)
            rdelAA(input.right);
        delete input;
    }

    foreach(node; aaPtr.b)
        if(node !is null)
            rdelAA(node);
    delete aaPtr.b;
    delete aaPtr;
}


-- 

September 07, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2105





------- Comment #9 from dsimcha@yahoo.com  2008-09-07 18:26 -------
Created an attachment (id=274)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=274&action=view)
Fix for aaA.d


-- 

« First   ‹ Prev
1 2