April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to Joseph Wakeling

Joseph Wakeling:

> I was very happy to see that D _does_ have a 'reserve' function for arrays, which I had been missing compared to C++ (it's not mentioned in the array docs).

'reserve' is not a dynamic array method, it's a free function in the Phobos default module that also contains object; see the docs page:
http://www.digitalmars.com/d/2.0/phobos/object.html

It's not mentioned in the array docs because 'reserve' and the related functions are new; they come from a recent change in how dynamic arrays are implemented in D.

> Still, I don't think that pre-reserving the memory per se is the influencing factor on the differences in performance.

You are right. The difference in performance comes mostly from the way D arrays are designed. I agree with you that dynamic array append in D is slow in absolute terms. Comparing the performance of D code with C++ is a good thing to do; it gives a baseline for comparison.

D dynamic arrays are more flexible than C++'s vector: they can be sliced, such slicing is O(1), and the slices are seen by the language just like other arrays. So you pay some performance for that increased flexibility. The idea here is that the built-in data types must be as flexible as possible, even if their performance is not so high, so they can be used for many different purposes. The D standard library will then offer specialized data structures that are faster thanks to being more specialized and less flexible.

In D dynamic arrays, some of the performance price is also paid for the automatic memory management, for a GC that's not a precise GC (for example, if your array has some empty items at the end past its true length, the GC must ignore them). I think D dynamic arrays have lower append performance because they don't keep the capacity in a handy place (the new design has a capacity, but it's not stored beside the length and pointer).

I don't know how D array append works in multithreaded situations. If you don't require the slicing flexibility, you can use the D language to implement a C++-style Vector in D. If you implement a struct that contains capacity, length and a pointer to the data, don't let the GC scan the data's memory, and overload ~= so that it doubles the size of the memory with a realloc once in a while, you will probably get performance similar to C++'s (not equal, because the performance profile of the GC in allocating memory chunks is a little more complex than that of C's malloc). See the bottom of this post.

> D also uses about 20% more memory than the C++

I don't know why; I presume it's caused by the GC bookkeeping -- it must be careful to avoid memory fragmentation. Currently the D GC is not so refined.

> Using an Appender class and put() (from std.array) is even slower,
> despite the std.array docs recommending this over ~. :-(

That was useful and designed for the previous design of the dynamic arrays. Maybe it can be removed from Phobos now...

> It's disappointing because I'm loving so much of D's syntax, but I can write far more efficient code (with less explicit memory management) in C++ ...

As you know, all data structures are the result of compromises; they balance performance across their different uses, and there is no perfect data structure. In practice D arrays are not designed for efficient "en masse" appending. A D dynamic array is seen locally as a struct of two words, so it's cheap to copy and pass around. Why that struct doesn't also contain the capacity is a long story. If you like to write benchmarks you will also find performance differences between, for example, a C++ hash_map and D's built-in associative arrays.

Besides the design (which, for example, can make it hard to implement append-efficient dynamic arrays in D), the D language is also still rather new; so far the focus has been on its design, not on tuning the performance of its implementation. C++ implementations, by contrast, have been performance-tuned for many years :-)

-------------------

Below are three implementations, in D and C++. The Vector(T) is bare-bones: it's missing most features and it's unsafe. In the D2 version I don't use the GC, and the memory is deallocated when arr goes out of scope.

Timings, T=double, N=5_000_000, NLOOPS=100, seconds:
  D1:  38.20
  D2:   8.32
  C++:  6.65

You can see the append performance in D2 is a matter of data structure implementation; it's not a language problem :-) With LDC (once we'll have a D2 version of it) the performance of D2 can probably match the C++. DMD maybe loses a little here because it's not so good at inlining, or maybe because the C++ vector is better than this D2 code.

-------------------

// First D version
import std.stdio: printf;

void main() {
    alias double T;
    enum int N = 5_000_000;
    enum int NLOOPS = 100;

    T[] arr;
    foreach (i; 0 .. NLOOPS) {
        arr.length = 0;
        arr.assumeSafeAppend;
        printf("Array capacity = %d\n", arr.capacity);
        foreach (uint j; 0 .. N)
            arr ~= j;
        printf("At iteration %u, x has %u elements.\n", i, arr.length);
    }
}

------------------

// Second D version
import std.stdio: printf;
import std.c.stdlib: malloc, realloc, free;

struct Vector(T) {
    enum double FAST_GROW = 2;
    enum double SLOW_GROW = 1.3;
    enum int STARTING_SIZE = 4;
    enum int BIG_MEM = (1 << 27) / T.sizeof;

    T* ptr;
    int length, capacity;

    ~this() {
        free(ptr);
        ptr = null;
        length = 0;
        capacity = 0;
    }

    void opOpAssign(string Op:"~=")(T item) {
        if (length >= capacity) {
            if (capacity == 0)
                _grow_capacity(STARTING_SIZE);
            else if (capacity < BIG_MEM)
                _grow_capacity(cast(int)(capacity * FAST_GROW));
            else
                _grow_capacity(cast(int)(capacity * SLOW_GROW));
        }
        ptr[length] = item;
        length++;
    }

    void _grow_capacity(int new_capacity) {
        ptr = cast(T*)realloc(ptr, new_capacity * T.sizeof);
        assert(ptr);
        this.capacity = new_capacity;
    }
}

void main() {
    alias double T;
    enum int N = 5_000_000;
    enum int NLOOPS = 100;

    Vector!T arr;
    foreach (i; 0 .. NLOOPS) {
        arr.length = 0;
        printf("Array capacity = %d\n", arr.capacity);
        foreach (uint j; 0 .. N)
            arr ~= j;
        printf("At iteration %u, x has %u elements.\n", i, arr.length);
    }
}

------------------

// C++ version
#include "stdio.h"
#include <vector>

int main() {
    typedef double T;
    const unsigned int N = 5000000;
    const unsigned int NLOOPS = 100;

    std::vector<T> arr;
    for (unsigned int i = 0; i < NLOOPS; i++) {
        arr.resize(0);
        printf("Array capacity = %d\n", arr.capacity());
        for (unsigned int j = 0; j < N; j++)
            arr.push_back(j);
        printf("At iteration %u, x has %u elements.\n", i, arr.size());
    }
}

Bye,
bearophile
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to bearophile

On 04/11/2010 07:42 PM, bearophile wrote:
> D dynamic arrays are more flexible than C++ vector, they can be sliced, such slicing is O(1), and the slices are seen by the language just like other arrays. So you pay the price of some performance for such increased flexibility. The idea here is that the built-in data types must be as flexible as possible even if their performance is not so high, so they can be used for many different purposes. Then D standard library will have specialized data structures that are faster thanks to being more specialized and less flexible. In D dynamic arrays some of the performance price is also paid for the automatic memory management, for the GC that's not a precise GC (for example if your array has some empty items at the end past its true length, the GC must ignore them).
>
OTish: It would be nice if the data-structures-to-be are designed such that they can be easily dropped in wherever dynamic arrays are used. At least the list-like ones.
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to bearophile

A third, uglier D version; this one is a bit faster than the C++ version. With a few experiments I've seen that the second D version was slower than the C++ code because dmd missed the inlining of the opOpAssign, and because it contained the if (capacity < BIG_MEM) test too. LDC can surely inline the method in a situation like that. But the BIG_MEM test was something I added to reduce memory usage for very large vectors; I don't know if the C++ implementation has something similar.

// Third D version
import std.stdio: printf;
import std.c.stdlib: malloc, realloc, free;

struct Vector(T) {
    enum double FAST_GROW = 2;
    enum double SLOW_GROW = 1.3;
    enum int STARTING_SIZE = 4;
    enum int BIG_MEM = (1 << 27) / T.sizeof;

    T* ptr;
    int length, capacity;

    ~this() {
        free(ptr);
        ptr = null;
        length = 0;
        capacity = 0;
    }
}

void main() {
    alias double T;
    enum int N = 5_000_000;
    enum int NLOOPS = 100;

    Vector!T arr;
    foreach (i; 0 .. NLOOPS) {
        arr.length = 0;
        printf("Array capacity = %d\n", arr.capacity);
        foreach (uint j; 0 .. N) {
            if (arr.length >= arr.capacity) {
                int new_capacity = arr.capacity ?
                                   cast(int)(arr.capacity * arr.FAST_GROW) :
                                   arr.STARTING_SIZE;
                arr.ptr = cast(T*)realloc(arr.ptr, new_capacity * T.sizeof);
                assert(arr.ptr);
                arr.capacity = new_capacity;
            }
            arr.ptr[arr.length] = j;
            arr.length++;
        }
        printf("At iteration %u, x has %u elements.\n", i, arr.length);
    }
}

Bye,
bearophile
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to Ellery Newcomer

Ellery Newcomer:
> OTish: It would be nice if the data-structures-to-be are designed such that they can be easily dropped in wherever dynamic arrays are used. At least the list-like ones.
Andrei will probably do what you are asking for here.
But those list-like data structures probably can't be defined using the normal array literals :-) D2 operator overloading is flexible enough for many purposes, but data-structure literals are a weak spot of D. For example, you need to put the digits in a string to initialize a large BigInt. It's not easy to invent a way to let literals be defined through code in a C-like language.
Bye,
bearophile
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to bearophile

> But those list-like data structures probably can't be defined using the normal array literals :-)
I was wrong, this is not so bad looking:
struct List(T) {
    // Node of std.range.SListRange is not static!
    static struct Node {
        T data;
        Node* next;
        this(T d, Node* p=null) {
            data = d;
            next = p;
        }
    }

    Node* lh;

    this(T[] arr) {
        foreach_reverse (el; arr)
            lh = new Node(el, lh);
    }
}

void main() {
    List!int items = [1, 2, 3];
}
Bye,
bearophile
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to bearophile

On 04/11/2010 09:36 PM, bearophile wrote:
>> But those list-like data structures probably can't be defined using the normal array literals :-)
>
> I was wrong, this is not so bad looking:
>
> struct List(T) {
> // Node of std.range.SListRange is not static!
> static struct Node {
> T data;
> Node* next;
> this(T d, Node* p=null) {
> data = d;
> next = p;
> }
> }
>
> Node* lh;
>
> this(T[] arr) {
> foreach_reverse (el; arr)
> lh = new Node(el, lh);
> }
> }
>
> void main() {
> List!int items = [1, 2, 3];
> }
>
> Bye,
> bearophile
It won't work for classes, though.
I'd think there'd be more trouble with other parts of the language though, like slices. foreach with index is another thing I care about particularly, but it should be trivial to implement. I can't recall what else is still special-cased with array literals.
And I agree about integer literals. What would be nice is a way to override how the literals are interpreted. Maybe some sort of macroish facility that pattern matches and manipulates the AST at compile time...
<dreaming>
macro @(<Decl>: @(<Type> is X)
@(<Identifier>)
=
@(<IntegerLiteral>, lit), decl){
string str = "`" ~ lit.getText() ~ "`";
lit = @( str2X( $(<Exp>, str) ));
return decl;
}
Ick, that turned into a mess fast.
</dreaming>
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to Joseph Wakeling

On Sun, 11 Apr 2010 12:50:11 -0400, Joseph Wakeling <joseph.wakeling@gmail.com> wrote:

> I was very happy to see that D _does_ have a 'reserve' function for arrays,
> which I had been missing compared to C++ (it's not mentioned in the array
> docs).

It's new. It probably should be mentioned in the spec, but it's a druntime thing.

> Still, I don't think that pre-reserving the memory per se is the influencing
> factor on the differences in performance.

No, and like bearophile said, the D array is a compromise between performance and flexibility. There are amazing ways to use D arrays that you could never do with C++ vectors.

> Note that in C++ the memory is not preassigned either. The difference
> between the performance of these pieces of code is striking -- on my
> machine the D example takes about 70--75 seconds to run, whereas the
> C++ example speeds through it in 10s or less.

The C++ example is reallocating memory, freeing memory it is no longer using. It also manually handles the memory management, allocating larger and larger arrays in some algorithmically determined fashion (for example, multiplying the length by some constant factor). This gives it an edge in performance because it does not have to do any costly lookup to determine if it can append in place, plus the realloc of the memory is probably cheaper than the GC realloc of D.

If you want to compare apples to apples (well, probably more like red apples to green apples), you need to do these things in a struct for D. I had thought the D appender class would do the trick, but as you stated below, it's even slower. This needs to be remedied.

> D also uses about 20% more memory than the C++ even though the C++ code
> declares a higher capacity for the vector (above 8 million) than D does
> for the array (a little over 5 million).

D does not assume you stopped caring about the memory being pointed to when it had to realloc. Therefore, it leaves it around in case you are still using it. You can also expect more overhead in the GC because it tends to hang on to memory in case it wants to use it again, or because it hasn't collected it yet. This is true of most GC-based languages.

For example, you can do this in D:

int[] arr = new int[50];
int* arrelem = &arr[5];
for (int i = 0; i < 10000; i++)
    arr ~= i;
*arrelem = 12345; // valid.

You can't do the same thing with C++ vectors: when they reallocate, the memory they used to own could be freed. This invalidates all pointers and iterators into the vector, but the language doesn't prevent you from having such dangling pointers. Using one of them can result in memory corruption, one of the worst bugs. D tries to eliminate as many memory corruption problems as possible.

> I don't know if it was naive to assume that D's dynamic arrays would be
> equivalent to vectors in C++. But it's clear that the D array appender
> ~= is much slower than the C++ vector push_back().

But it is much safer, and supports safe slicing. Compromises.

> Using an Appender class and put() (from std.array) is even slower,
> despite the std.array docs recommending this over ~. :-(

This must be fixed; the appender should be blazingly fast at appending (almost as fast as C++), with the drawback that the overhead is higher.

> It's disappointing because I'm loving so much of D's syntax, but I can
> write far more efficient code (with less explicit memory management)
> in C++ ...

You haven't done much with it yet. When you start discovering how much D takes care of, you will be amazed :) Array appending isn't the fastest operation, but it is safe, and safety can be worth infinitely more than speed sometimes. The thing about D is it *can* be fast and unsafe, just as fast and unsafe as C, but that's not the default.

-Steve
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to Ellery Newcomer

Something that I have forgotten, for the original poster: the D append is not templated code, so it produces no template bloat the way the C++ <vector> does. This is another compromise.

-----------

Ellery Newcomer:

> It won't work for classes, though.

You have to write:

class List(T) {
    static struct Node {
        T data;
        Node* next;
        this(T d, Node* p=null) {
            data = d;
            next = p;
        }
    }

    Node* lh;

    this(T[] arr) {
        foreach_reverse (el; arr)
            lh = new Node(el, lh);
    }
}

void main() {
    auto items = new List!int([1, 2, 3]);
}

> Ick, that turned into a mess fast.

I'd like to be a computer scientist able to create a DMeta, similar to PyMeta. A DMeta can be much cleaner than that "mess". See the original OMeta:
http://tinlizzie.org/ometa/
and its Python version:
http://washort.twistedmatrix.com/
https://launchpad.net/pymeta

Bye,
bearophile
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to bearophile

Thanks to all again for the discussion, examples and explanations. :-)

One note -- I wouldn't want anyone to think I'm bashing D or complaining. I've been interested in the language for some time and this seemed an opportune time to start experimenting properly. It's fun, I'm learning a lot, and I'm genuinely touched by the amount of effort put in by everyone on this list to teach and share examples. I'm also fully aware that D is still growing, and that I need to be patient in some cases ... :-)

bearophile wrote:
> D dynamic arrays are more flexible than C++ vector, they can be sliced, such slicing is O(1), and the slices are seen by the language just like other arrays. So you pay the price of some performance for such increased flexibility. The idea here is that the built-in data types must be as flexible as possible even if their performance is not so high, so they can be used for many different purposes.

No complaint there. :-)

> Then D standard library will have specialized data structures that are faster thanks to being more specialized and less flexible.

In my case -- I'm turning into 'Mr C++' again -- that's probably often what I need. If I look at the major benefits I found in moving from C to C++, the first was memory management that was as automatic as I required. For example, C++ vectors are great because they do away with having to put in malloc/realloc/free statements and let you treat dynamic arrays pretty much as 'just another variable'. Within my own needs I've not yet found a case where the kind of smart GC functionality discussed on this thread seemed necessary, but of course I've never had it available to use before ... :-)

> In D dynamic arrays some of the performance price is also paid for the automatic memory management, for the GC that's not a precise GC (for example if your array has some empty items at the end past its true length, the GC must ignore them).

An idea was floating in my head about whether it is, or could be, possible to turn off GC safety features in a scope where they are unnecessary -- rather like a more general version of the 'assumeSafeAppend' function ...

> With LDC (once we'll have a D2 version of it) the performance of D2 can probably be the same as the C++. DMD maybe loses a little here because it's not so good at inlining, or maybe because the C++ vector is better than this D2 code.

I thought dev effort was now focusing back on GDC ... ? :-P

I have actually not made much use of the -inline flag, because in the code I wrote (maybe not best suited to inlining...) it made the program generally run slower ...

Steven Schveighoffer wrote:
> The C++ example is reallocating memory, freeing memory it is no longer using. It also manually handles the memory management, allocating larger and larger arrays in some algorithmically determined fashion (for example, multiplying the length by some constant factor). This gives it an edge in performance because it does not have to do any costly lookup to determine if it can append in place, plus the realloc of the memory probably is cheaper than the GC realloc of D.

Right. In fact you get precisely 24 allocs/deallocs, each doubling the memory reserve to give a total capacity of 2^23 -- and then that memory is there and can be used for the rest of the 100 iterations of the outer loop. The shock for me was finding that D wasn't treating the memory like this but was preserving each loop's memory (as you say, for good reason).

> D does not assume you stopped caring about the memory being pointed to when it had to realloc. [...] You can't do the same thing with C++ vectors, when they reallocate, the memory they used to own could be freed. This invalidates all pointers and iterators into the vector, but the language doesn't prevent you from having such dangling pointers.

I have a vague memory of trying to do something exactly like your example when I was working with C++ for the first time, and getting bitten on the arse by exactly the problem you describe. I wish I could remember where. I know that I found another (and possibly better) solution to do what I wanted, but it would be nice to see if a D-ish solution would give me something good.

> This must be fixed, the appender should be blazingly fast at appending (almost as fast as C++), with the drawback that the overhead is higher.

Overhead = memory cost? I'm not so bothered as long as the memory stays within constant, predictable bounds. It was the memory explosion that scared me. And I suspect I'd pay a small performance cost (though it would have to be small) for the kind of safety and flexibility the arrays have.

> You haven't done much with it yet. When you start discovering how much D takes care of, you will be amazed :)

I know. :-)

My needs are in some ways quite narrow -- numerical simulations in interdisciplinary physics -- hence the C background, and hence the premium on performance. They're also not very big programs -- simple enough for me to generally keep a personal overview of the memory management, even though with C++ that's usually all taken care of automatically (no new or delete statements if I can avoid it). What I'm fairly confident about is that, given not too much time, D will become a _far_ preferable language for that kind of development.

> The thing about D is it *can* be fast and unsafe, just as fast and unsafe as C, but that's not the default.

That's apparent -- I mean, given that D wraps the whole C standard library, I could basically write C code in D if I wanted, no? But of course it would have all the notational complexities of C, which is what I'd like to escape from ... :-P
April 12, 2010 Re: Memory leak with dynamic array
Posted in reply to Joseph Wakeling

On Mon, 12 Apr 2010 12:03:38 -0400, Joseph Wakeling <joseph.wakeling@gmail.com> wrote:

> I thought dev effort was now focusing back on GDC ... ? :-P

AFAIK, gdc hasn't been actively developed for a few years. ldc, on the other hand, has regular releases. I think ldc may be the future of D compilers, but I currently use dmd since I'm using D2.

> Steven Schveighoffer wrote:
>> The C++ example is reallocating memory, freeing memory it is no longer
>> using. It also manually handles the memory management, allocating larger
>> and larger arrays in some algorithmically determined fashion (for example,
>> multiplying the length by some constant factor). This gives it an edge in
>> performance because it does not have to do any costly lookup to determine
>> if it can append in place, plus the realloc of the memory probably is
>> cheaper than the GC realloc of D.
>
> Right. In fact you get precisely 24 allocs/deallocs, each doubling the
> memory reserve to give a total capacity of 2^23 -- and then that memory is
> there and can be used for the rest of the 100 iterations of the outer loop.
> The shock for me was finding that D wasn't treating the memory like this
> but was preserving each loop's memory (as you say, for good reason).

Yes, you get around this by preallocating.

>> D does not assume you stopped caring about the memory being pointed to
>> when it had to realloc. [...] You can't do the same thing with C++
>> vectors, when they reallocate, the memory they used to own could be
>> freed. This invalidates all pointers and iterators into the vector,
>> but the language doesn't prevent you from having such dangling pointers.
>
> I have a vague memory of trying to do something exactly like your example
> when I was working with C++ for the first time, and getting bitten on the
> arse by exactly the problem you describe. I wish I could remember where.
> I know that I found another (and possibly better) solution to do what I
> wanted, but it would be nice to see if a D-ish solution would give me
> something good.

It's often these types of performance discrepancies that critics point to (not that you are a critic), but it's the cost of having a more comprehensive language. Your appetite for the sheer performance of a language will sour once you get bitten by a few of these nasty bugs.

But D fosters a completely different way of thinking about solving problems. One problem with C++'s vector is that it is a value type -- you must pass a reference in order to avoid copying an entire vector. D's arrays, however, are a hybrid between reference and value type. Often, once you set data in a vector/array, you never change it again. D provides ways to enforce this (i.e. immutable) and also allows you to pass around "slices" of your array with zero overhead (no copying). This results in some extremely high-performance code which wouldn't be easy, or maybe even possible, with C++.

Take for instance a split function. In C++, I'd expect split(string x) to return a vector<string>. However, vector<string> makes a copy of each part of the string it has split out. D, by contrast, can return references to the original data (slices), which incur no overhead; the only extra space allocated is the array that holds the string references. All this is also completely safe! You could then even modify the original string in place (assuming you were not using immutable strings), or append to any one of the strings in the array, safely.

>> This must be fixed, the appender should be blazingly fast at appending
>> (almost as fast as C++), with the drawback that the overhead is higher.
>
> Overhead = memory cost? I'm not so bothered as long as the memory stays
> within constant, predictable bounds. It was the memory explosion that
> scared me. And I suspect I'd pay a small performance cost (though it
> would have to be small) for the kind of safety and flexibility the arrays
> have.

Overhead = bigger initialization cost, memory footprint. It's not important if you are building a large array (which is what appender should be for), but the cost would add up if you had lots of little appenders that you didn't append much to. The point is, the built-in array optimizes performance for operations besides append, but allows appending as a convenience. Appender should optimize appending, sacrificing performance in other areas. It all depends on your particular application whether you should use appender or built-in arrays (or something entirely different/custom).

>> You haven't done much with it yet. When you start discovering how much D
>> takes care of, you will be amazed :)
>
> I know. :-)
>
> My needs are in some ways quite narrow -- numerical simulations in
> interdisciplinary physics -- hence the C background, and hence the premium
> on performance. They're also not very big programs -- simple enough for me
> to generally keep a personal overview on the memory management, even though
> with C++ that's usually all taken care of automatically (no new or delete
> statements if I can avoid it).

There are many in the community who use D for numerical work. It's definitely not as mature as it could be, but it's getting better. Don is adding a lot of cool stuff, including a built-in exponent operator and arbitrary-precision numbers.

>> The thing about D is it *can* be fast and unsafe, just as fast and unsafe
>> as C, but that's not the default.
>
> That's apparent -- I mean, given that D wraps the whole C standard library,
> I could basically write C code in D if I wanted, no?

Yes, but that's not what I meant ;) I mean, you can write your own types, like the Appender (or what the appender *should* be), that optimize the behavior of code to meet any needs. And you can do it with a much better syntax than C. D's template system and its ability to make user types seem like built-ins are, I think, unparalleled in C-like languages.

-Steve
Copyright © 1999-2021 by the D Language Foundation