[Issue 1561] New: AA's create many false references for garbage collector
October 09, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1561

           Summary: AA's create many false references for garbage collector
           Product: D
           Version: 1.022
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Severity: major
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: wbaxter@gmail.com


A program that uses a lot of AAs will leak memory.
It looks like the reason is that the aaA structs, which contain the hash
value, are allocated as void[size], so the garbage collector always interprets
the hash value as a pointer.

Tango's version of aaA.d does it slightly differently: it checks the sizes and sets the GC NO_SCAN bit if the key and value types can't hold a pointer:

        // Not found, create new elem
        //printf("create new one\n");
        size_t size = aaA.sizeof + keysize + valuesize;
        uint   bits = keysize   < (void*).sizeof &&
                      keysize   > (void).sizeof  &&
                      valuesize < (void*).sizeof &&
                      valuesize > (void).sizeof  ? BlkAttr.NO_SCAN : 0;
        e = cast(aaA *) gc_calloc(size, bits);


A test case for the leak using AAs with Phobos is below.  I'm not sure what this does with Tango, but I think it will probably still fail on a 32-bit architecture, since int.sizeof falls in the range of sizes that still gets scanned.
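
To make the size arithmetic behind that guess concrete, here is a minimal sketch of the condition from the quoted Tango snippet. The function name wouldSkipScan and its ptrsize parameter are hypothetical; they are not part of Tango or Phobos. The pointer size is passed in explicitly so that both 32-bit and 64-bit targets can be evaluated on one machine, and the sketch is written for a current D compiler rather than D 1.x.

```d
import std.stdio : writefln;

// Hypothetical helper (not a Tango or Phobos function): mirrors the size
// test in the quoted snippet, with the pointer size as a parameter.
bool wouldSkipScan(size_t keysize, size_t valuesize, size_t ptrsize)
{
    enum size_t voidSize = 1; // (void).sizeof is 1 in D
    return keysize   < ptrsize && keysize   > voidSize &&
           valuesize < ptrsize && valuesize > voidSize;
}

void main()
{
    // int[int]: key and value are both 4 bytes.
    writefln("32-bit target: NO_SCAN = %s", wouldSkipScan(4, 4, 4));
    writefln("64-bit target: NO_SCAN = %s", wouldSkipScan(4, 4, 8));
}
```

With 4-byte pointers the test `keysize < ptrsize` is false for an int key, so the first call returns false (the block would be scanned); with 8-byte pointers the second call returns true. This only restates the quoted condition and says nothing about whether the scanned block actually causes a leak.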

It seems like a better approach would be to a) use keyti's TypeInfo to decide whether the key is a pointer type, and b) arrange for _aaGet to be called with the value type's TypeInfo too, instead of just a size, and use both to decide whether the allocated memory should have the NO_SCAN bit set.
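
A rough sketch of what such a TypeInfo-based decision could look like is below. It is written against current D (druntime's core.memory and TypeInfo.flags), not the D 1.x runtime discussed here, and the helper name attrFor is hypothetical. It assumes that bit 0 of TypeInfo.flags is set when a type may contain pointers the GC needs to scan. It only covers the key and value types; it does not model any pointers stored in the AA node header itself, which would also have to be taken into account before marking a whole node NO_SCAN.

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical helper: pick GC block attributes from the key and value
// TypeInfo instead of from their sizes.
uint attrFor(const TypeInfo keyti, const TypeInfo valueti)
{
    // Assumption: bit 0 of flags means "may contain pointers to scan".
    bool mayPoint = ((keyti.flags | valueti.flags) & 1) != 0;
    if (mayPoint)
        return 0;               // leave the block scannable
    return GC.BlkAttr.NO_SCAN;  // neither type can hold a pointer
}

void main()
{
    writefln("int key, int value  -> attr %s", attrFor(typeid(int), typeid(int)));
    writefln("int key, int* value -> attr %s", attrFor(typeid(int), typeid(int*)));
}
```

The point of deciding from TypeInfo rather than size is that an int and a pointer have the same size on a 32-bit target, so a size comparison alone cannot tell them apart.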

It could be that all of the above diagnostic guesswork is wrong.  But I am certain that the program below leaks memory, whatever the actual reason.

---------------< test case >-----------------------
import std.stdio;
import std.gc;

// Just an ordinary AA with a lot of values.
// neither keys nor values look like pointers.
class BigAA
{
    int[int] aa;
    this() {
        for(int i=0;i<1000;i++) {
            aa[i] = i;
        }
    }
}

void main()
{
    int nloops = 10_000;
    auto b = new BigAA[100];

    for(int i=0; i<nloops; ++i)
    {
        // Create some AAs (overwriting old ones)
        foreach(ref v; b) { v = new BigAA; }

        // See how we're doing
        std.gc.GCStats stats;
        std.gc.fullCollect();
        std.gc.getStats(stats);
        writefln("Loop %-5s - poolsize=%-10s   %s Mbytes  (%s KB)",
                 i, stats.poolsize,
                 stats.usedsize/1024/1024,
                 stats.usedsize/1024);

    }
}


-- 

October 10, 2007

------- Comment #1 from kamm-removethis@incasoftware.de  2007-10-10 13:14 -------
I compiled and ran the program against Tango using dmd 1.020. Since Tango doesn't seem to have functionality equivalent to GCStats, I removed that part and monitored the memory usage with top.

I terminated the process after it had run for eight minutes; it consumed a small, constant amount of memory throughout.

With dmd 1.021 and Phobos, I see the memory usage growing over time.


-- 

October 11, 2007

------- Comment #2 from wbaxter@gmail.com  2007-10-10 20:04 -------
Thanks for trying it out with Tango.  But if Tango doesn't show the problem, then that almost certainly means my diagnosis is incorrect.


--