Jump to page: 1 2
Thread overview
[Issue 1382] New: memory allocated for arrays in CTFE functions during compilation is not released
Jul 27, 2007
d-bugmail
Apr 29, 2008
d-bugmail
Jul 22, 2008
d-bugmail
Dec 02, 2008
d-bugmail
Aug 03, 2009
Don
Sep 01, 2009
Don
Dec 06, 2010
Rob Jacques
Sep 02, 2011
Don
Jan 20, 2012
Jesse Phillips
Jan 21, 2012
Don
Jan 23, 2012
Leandro Lucarella
Jan 23, 2012
Walter Bright
Jan 24, 2012
Leandro Lucarella
Jan 24, 2012
Walter Bright
Jan 24, 2012
Leandro Lucarella
Jan 24, 2012
Don
July 27, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1382

           Summary: memory allocated for arrays in CTFE functions during
                    compilation is not released
           Product: D
           Version: 1.019
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: kamm-removethis@incasoftware.de


This problem is encountered when large arrays are manipulated during CTFE. Symtoms are dmd allocating a lot of memory, eating up all swap and finally terminating with an out of memory error.

The core of the problem can be seen in this code
--
char[] make_empty_string(int n)
{
  char[] result;

  for(int i = 0; i < n; ++i)
    result ~= " ";
  return result;
}

const char[] largestring = make_empty_string(100000);
void main() {}
---

This snippet will require 5 gb of memory to compile instead of the mere 100 kb the final string will require. It is caused by the intermediate strings stored in result during the iteration never being discarded. The problem is not limited to concatenation, modifications of a single element of the array will also cause the whole array to be duplicated.

While this particular piece of code can be rewritten to consume less memory, that's not generally possible. An example where reduction is not possible is splitting a string into substrings.

It does come up in practice: Someone wanted to generate a function to get the Unicode general catergory for a dchar from the textfiles from the Unicode Character Database and ran into this issue. I wanted to parse the D BNF to generate a parser at compile time and had dmd exit with an out of memory error.

It seems CTFE needs a compile time garbage collector.


-- 

April 29, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1382


clugdbug@yahoo.com.au changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |blocker




------- Comment #1 from clugdbug@yahoo.com.au  2008-04-29 05:30 -------
I'm raising the severity of this to blocker, since it makes CTFE metaprogramming libraries unusable in practice. Fortunately development of such libraries is still possible, although both BCS and I are experiencing some horrific compilation times, even after inserting workarounds where possible. We are having to reduce the complexity of our test code.


-- 

July 22, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1382





------- Comment #2 from kamm-removethis@incasoftware.de  2008-07-22 00:52 -------
Has DMD stopped using boehm-gc? If I try this very example on LLVMDC (which
does use boehm-gc), memory usage never exceeds a certain level (< 1 MB).

So maybe re-enabling the garbage collector for DMD will fix all CTFE related memory issues?


-- 

December 02, 2008
http://d.puremagic.com/issues/show_bug.cgi?id=1382





------- Comment #3 from kamm-removethis@incasoftware.de  2008-12-02 12:31 -------
We've had some success with reenabling boehm-gc: http://www.dsource.org/projects/ldc/ticket/49 .

"Another test with USE_BOEHM_GC=0, REDIRECT_MALLOC=GC_malloc and IGNORE_FREE seemed to yield good results, with no segfaults and collecting CTFE memory properly."


-- 

August 03, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=1382


Don <clugdbug@yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clugdbug@yahoo.com.au




--- Comment #4 from Don <clugdbug@yahoo.com.au>  2009-08-03 05:18:25 PDT ---
I don't think Boehm gc is the answer. Note that this is very closely related to bug#1330. I think the CTFE implementation of arrays needs (a) reference semantics and (b) reference counting. Here's an example of a terrible case, which allocates several Gb of RAM:

int junk(int n)
{
  int[] result = new int[10000];

  for(int i = 0; i < n; ++i) {
    result[0]= i;
  }
  return 0;
}

const int bad = junk(100000);
void main() {}

This particular case could be solved by adding a reference-based system for
storing array values, instead of doing copy-on-write -- and that's required for
bug #1330 anyway.
Once that's in place, the array values could be allocated in a controlled
manner (eg, retain a list of all allocated CTFE arrays). A dedicated precise GC
can then be simple and fast, since it only needs to check for array references
in the current function, and they can only be in the local variables which are
arrays or structs.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
September 01, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=1382


Don <clugdbug@yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|blocker                     |critical




--- Comment #5 from Don <clugdbug@yahoo.com.au>  2009-08-31 23:59:27 PDT ---
Reducing severity back to critical, since the voting system takes care of the importance.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
June 04, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=1382


bearophile_hugs@eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs@eml.cc


--- Comment #6 from bearophile_hugs@eml.cc 2010-06-04 15:19:11 PDT ---
A partially artificial test case. Faster and better versions are quite possible, but I expect dmd to be able to run this quickly.


import std.stdio: writeln;

ubyte[1 << NPOW] setBits(int NPOW)() {
    nothrow pure uint setBits8(uint n) {
        uint result;
        foreach (i; 0 .. 8)
            if (n & (1 << i))
                result++;
        return result;
    }

    nothrow pure uint setBits16(uint n) {
        enum uint FIRST_UBYTE =  0b0000_0000_1111_1111;
        enum uint SECOND_UBYTE = 0b1111_1111_0000_0000;
        return setBits8(n & FIRST_UBYTE) + setBits8((n & SECOND_UBYTE) >> 8);
    }

    typeof(return) result;
    foreach (i; 1 .. result.length)
        result[i] = cast(typeof(result[0]))setBits16(i);
    return result;
}

enum nbits = setBits!16(); // currently 12 is about the max

void main() {
    writeln(nbits);
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=1382


Rob Jacques <sandford@jhu.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandford@jhu.edu


--- Comment #7 from Rob Jacques <sandford@jhu.edu> 2010-12-06 12:09:04 PST ---
I just came across this bug while  working on improving std.variant: the combination of templates + ctfe + unittests resulted in out of memory errors. I've also traced down another issue (I don't know if it should be filed separately or not):

It appears that _any_ access of an array variable allocates ram, resulting in
drastically slower compile times (+55 seconds) and excess memory usage (30+ mb
in this case using DMD 2.050)

string ctfeTest() {
    char[] result;
    result.length = ushort.max;
    char c;
    for(size_t i = 0; i < result.length; i++) {} // Allocates
    for(size_t i = 0; i < ushort.max; i++) {}    // Doesn't allocate

    for(size_t i = 0; i < ushort.max; i++) {     // Allocates
        c = result[i];
    }
    for(size_t i = 0; i < ushort.max; i++) {     // Doesn't allocate
        c = cast(ubyte)('A' + i%26);
    }
    return cast(string)result;
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
September 02, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=1382



--- Comment #8 from Don <clugdbug@yahoo.com.au> 2011-09-02 00:21:14 PDT ---
(In reply to comment #7)
> It appears that _any_ access of an array variable allocates ram, resulting in
> drastically slower compile times (+55 seconds) and excess memory usage (30+ mb
> in this case using DMD 2.050)

This was fixed in 2.054. There were several cases where reading or writing a
single array element could cause the entire array to be copied! These cases
have now been fixed, giving an order of magnitude improvement in memory use and
compilation time. (The original test case (concatenation) hasn't changed; it's
simply caused by absence of a compile-time gc).
This bug is now a far less serious problem than bug 6498.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
January 20, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=1382


Jesse Phillips <Jesse.K.Phillips+D@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Jesse.K.Phillips+D@gmail.co
                   |                            |m
   Target Milestone|---                         |2.059


-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
« First   ‹ Prev
1 2