January 09, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=820

           Summary: gc should scan only pointer types for pointers
           Product: D
           Version: 1.00
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: wbaxter@gmail.com


From Oskar Linde: http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=46407

And me: http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=46462

The gc currently scans all data for anything that looks like a pointer into GC'ed memory.  For programs handling large amounts of random-looking data, such as arrays of chars or floating point values, this means that a lot of memory will never get freed.  The result is that allocated memory grows and grows, and gc mark-and-sweep cycles take longer and longer, until the program slows to a crawl and finally fails outright from lack of memory.

This is a serious problem with the GC as currently implemented.   It pretty much prevents any use of the GC in number crunching code, among other things.
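
To get a feel for how often plain data can be mistaken for a pointer, here is a rough sketch in present-day D (D2 Phobos, not the D 1.00 library this report targets). It does not touch the real collector at all: it just counts how many random 32-bit words happen to fall inside an address range standing in for the GC heap. The function name countFalsePointers, the heap base address, the heap size and the seed are all made up for illustration.

```d
import std.random : Random, uniform;
import std.stdio : writefln;

/// Counts how many of `words` random 32-bit values land inside the
/// address range [heapBase, heapBase + heapSize). All parameters are
/// hypothetical; this only models a conservative scan, it is not the GC.
size_t countFalsePointers(uint heapBase, uint heapSize, size_t words, uint seed)
{
    auto rng = Random(seed);
    size_t hits = 0;
    foreach (i; 0 .. words)
    {
        immutable uint w = uniform!uint(rng);
        // Unsigned subtraction wraps around, so this single comparison
        // checks heapBase <= w < heapBase + heapSize.
        if (w - heapBase < heapSize)
            ++hits;
    }
    return hits;
}

void main()
{
    enum uint heapBase = 0x1000_0000;       // assumed start of the GC heap
    enum uint heapSize = 20 * 1024 * 1024;  // ~20 MB of GC-managed memory
    enum size_t words  = 5_000_000;         // same element count as `data` in the listing

    immutable hits = countFalsePointers(heapBase, heapSize, words, 42);
    writefln("%s of %s random words look like heap pointers", hits, words);
}
```

With a 20 MB range in a 32-bit address space, about 0.5% of uniformly random words are expected to hit (20 MB / 4 GB), which for 5,000,000 words is on the order of tens of thousands. In a collector that scans all data, each such word can keep whichever block it happens to point into alive.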

Oskar's listing:
----
import std.random;

void main() {
         // The real memory use, ~20 mb
         uint[] data;
         data.length = 5_000_000;
         foreach(inout x; data)
                 x = rand();
         while(1) {
                 // simulate reading a few kb of data
                 uint[] incoming;
                 incoming.length = 1000 + rand() % 5000;
                 foreach(inout x; incoming)
                         x = rand();
                 // do something with the data...
         }
}
----

My modification to make it a little more real-world:
-----

import std.math;
import std.random;
import std.stdio;

void main() {
    // The real memory use, ~40 mb
    double[] data;
    data.length = 5_000_000;
    foreach(i, inout x; data) {
        x = sin(cast(double)i/data.length);
        //x = 1;
    }
    int count = 0;
    int gcount = 0;
    while(1) {
        // simulate reading a few kb of data
        double[] incoming;
        incoming.length = 1000 + rand() % 5000;
        foreach(i, inout x; incoming) {
            x = sin(cast(double)i/incoming.length);
            //x = 5;
        }
        // do something with the data...

        // print status message every so often
        count += incoming.length;
        if (count > 1_000_000) {
            count = 0;
            gcount++;
            writefln("%s processed", gcount);
        }
    }
}
// If you comment out the 'sin' lines and use the commented-out lines that
// set the values to constants instead, the program does indeed hover around 40MB.
// Otherwise memory usage grows to hundreds of MB.
-----


-- 

January 28, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=820


bugzilla@digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Comment #1 from bugzilla@digitalmars.com  2007-01-27 18:55 -------
Fixed in DMD 1.001


--