compile-time function evaluation consuming too much memory

Jun 28, 2007

Jun 29, 2007

I have been experimenting with using ctfe and static foreach to generate code from strings. While for small strings all is fine, larger strings make DMD's memory usage go over the top. It came up when splitting a string into an array of strings - something I'd think is relatively common. Simple example that illustrates the problem: char[] make_empty_string(int n) { char[] result; for(int i = 0; i < n; ++i) result ~= " "; return result; } const char[] largestring1 = make_empty_string(10000); const char[] largestring2 = make_empty_string(10000); const char[] largestring3 = make_empty_string(10000); ... Every call to make_empty_string(10000) will consume 50 MB of memory which it seems will not be released until dmd terminates. That happens because every intermediate value in 'result' is stored. My question is if - this is expected and actually intentional as ctfe merely extends constant folding or if - this is a bug and dmd could just free the old memory on array appends/assigns/... (would this ever be a problem?) or if - D needs a compile-time garbage collector. ;) Cheers, Christian

June 29, 2007

Re: compile-time function evaluation consuming too much memory

Posted by Thomas Kuehne
in reply to Christian Kamm

Permalink

Thomas Kuehne

Posted in reply to Christian Kamm

Attachments:

pgp.sig

Permalink

Christian Kamm schrieb am 2007-06-28:
> I have been experimenting with using ctfe and static foreach to generate code from strings. While for small strings all is fine, larger strings make DMD's memory usage go over the top. It came up when splitting a string into an array of strings - something I'd think is relatively common. Simple example that illustrates the problem:
>
> char[] make_empty_string(int n)
> {
>   char[] result;
>
>   for(int i = 0; i < n; ++i)
>     result ~= " ";
>   return result;
> }
>
> const char[] largestring1 = make_empty_string(10000);
> const char[] largestring2 = make_empty_string(10000);
> const char[] largestring3 = make_empty_string(10000);
> ...
>
> Every call to make_empty_string(10000) will consume 50 MB of memory which it seems will not be released until dmd terminates. That happens because every intermediate value in 'result' is stored.
>
> My question is if
> - this is expected and actually intentional as ctfe merely extends constant
> folding or if
> - this is a bug and dmd could just free the old memory on array
> appends/assigns/... (would this ever be a problem?) or if
> - D needs a compile-time garbage collector. ;)

The memory performance of ctfe can definatly be improved, however your sample code too. Yes I know the above was just a sample code to demonstrate the issue but quite frequently that issue can be evaded by using another algoritm:

# char[] make_empty_string(int n){
#    char[] result;
#
#    if(n){
#       result = " ";
#       while(result.length * 2 <= n){
#          result ~= result;
#       }
#       if(result.length < n){
#          result ~= result[0 .. n - $];
#       }
#    }
#
#    return result;
# }

Thomas

> The memory performance of ctfe can definatly be improved, Do you think this warrants being entered as a bug? > Yes I know the above was just a sample code to > demonstrate the issue but quite frequently that issue can be evaded by > using another algoritm: Yes, I use the same for tuples. First I was splitting strings directly into tuples in order to statically foreach over them. However, storing the tuples seemed to take a lot more memory than an array of strings, so nowadays I just have const char[][] splitstring = split(str, ' '); foreach(i, unused; MakeIterTuple!(splitstring.length)) { ... splitstring[i] ... } where MakeIterTuple is template MakeIterTuple(uint size) { static if(size == 0) alias Tuple!() MakeIterTuple; else static if(size % 2 == 0) alias Tuple!(MakeIterTuple!(size/2), MakeIterTuple!(size/2)) MakeIterTuple; else alias Tuple!(void, MakeIterTuple!((size-1)/2), MakeIterTuple!((size-1)/2)) MakeIterTuple; } And that works well. What's currently my greatest memory hog is the split function above. It's basically implemented like this char[][] split(char[] str, char by) { char[][] result; size_t c = 0; int pos; while((pos = find(str[c..$], by)) != -1) { result ~= str[c..c+pos]; c += pos+1; } result ~= str[c..$]; return result; } and I can't seem to find a way to make it more memory efficient. I've thought about finding the positions and lengths of the strings to be taken and then using a mixin to create one large array literal, but can't get around building the string somewhere. Cheers, Christian

Forums