Thread overview
compile-time function evaluation consuming too much memory
Jun 28, 2007
Christian Kamm
Jun 29, 2007
Thomas Kuehne
Jun 29, 2007
Christian Kamm
June 28, 2007
I have been experimenting with using ctfe and static foreach to generate code from strings. While for small strings all is fine, larger strings make DMD's memory usage go over the top. It came up when splitting a string into an array of strings - something I'd think is relatively common. Simple example that illustrates the problem:

char[] make_empty_string(int n)
{
  char[] result;

  for(int i = 0; i < n; ++i)
    result ~= " ";
  return result;
}

const char[] largestring1 = make_empty_string(10000);
const char[] largestring2 = make_empty_string(10000);
const char[] largestring3 = make_empty_string(10000);
...

Every call to make_empty_string(10000) will consume 50 MB of memory which it seems will not be released until dmd terminates. That happens because every intermediate value in 'result' is stored.

My question is if
- this is expected and actually intentional as ctfe merely extends constant
folding or if
- this is a bug and dmd could just free the old memory on array
appends/assigns/... (would this ever be a problem?) or if
- D needs a compile-time garbage collector. ;)

Cheers,
Christian

June 29, 2007
Christian Kamm schrieb am 2007-06-28:
> I have been experimenting with using ctfe and static foreach to generate code from strings. While for small strings all is fine, larger strings make DMD's memory usage go over the top. It came up when splitting a string into an array of strings - something I'd think is relatively common. Simple example that illustrates the problem:
>
> char[] make_empty_string(int n)
> {
>   char[] result;
>
>   for(int i = 0; i < n; ++i)
>     result ~= " ";
>   return result;
> }
>
> const char[] largestring1 = make_empty_string(10000);
> const char[] largestring2 = make_empty_string(10000);
> const char[] largestring3 = make_empty_string(10000);
> ...
>
> Every call to make_empty_string(10000) will consume 50 MB of memory which it seems will not be released until dmd terminates. That happens because every intermediate value in 'result' is stored.
>
> My question is if
> - this is expected and actually intentional as ctfe merely extends constant
> folding or if
> - this is a bug and dmd could just free the old memory on array
> appends/assigns/... (would this ever be a problem?) or if
> - D needs a compile-time garbage collector. ;)

The memory performance of ctfe can definatly be improved, however your sample code too. Yes I know the above was just a sample code to demonstrate the issue but quite frequently that issue can be evaded by using another algoritm:

# char[] make_empty_string(int n){
#    char[] result;
#
#    if(n){
#       result = " ";
#       while(result.length * 2 <= n){
#          result ~= result;
#       }
#       if(result.length < n){
#          result ~= result[0 .. n - $];
#       }
#    }
#
#    return result;
# }

Thomas


June 29, 2007
> The memory performance of ctfe can definatly be improved,

Do you think this warrants being entered as a bug?

> Yes I know the above was just a sample code to
> demonstrate the issue but quite frequently that issue can be evaded by
> using another algoritm:

Yes, I use the same for tuples. First I was splitting strings directly into tuples in order to statically foreach over them. However, storing the tuples seemed to take a lot more memory than an array of strings, so nowadays I just have

const char[][] splitstring = split(str, ' ');
foreach(i, unused; MakeIterTuple!(splitstring.length))
{ ... splitstring[i] ... }

where MakeIterTuple is

template MakeIterTuple(uint size)
{
  static if(size == 0)
    alias Tuple!() MakeIterTuple;
  else static if(size % 2 == 0)
    alias Tuple!(MakeIterTuple!(size/2), MakeIterTuple!(size/2))
      MakeIterTuple;
  else
    alias Tuple!(void, MakeIterTuple!((size-1)/2),
      MakeIterTuple!((size-1)/2)) MakeIterTuple;
}

And that works well. What's currently my greatest memory hog is the split function above. It's basically implemented like this

char[][] split(char[] str, char by)
{
  char[][] result;
  size_t c = 0;
  int pos;

  while((pos = find(str[c..$], by)) != -1)
  {
    result ~= str[c..c+pos];
    c += pos+1;
  }

  result ~= str[c..$];

  return result;
}

and I can't seem to find a way to make it more memory efficient. I've thought about finding the positions and lengths of the strings to be taken and then using a mixin to create one large array literal, but can't get around building the string somewhere.

Cheers,
Christian