Intermediate arrays

Jul 07, 2012

bearophile

Jul 07, 2012

Jonathan M Davis

Jul 07, 2012

bearophile

Jul 07, 2012

Jonathan M Davis

July 07, 2012

Intermediate arrays

Posted by bearophile

Permalink

bearophile

Permalink

When I write D code from scratch often I write it in a high level style. This helps me focus on the algorithm and the problem, solve it efficiently, and avoid many mistakes. If I translate code from a high level language to D, the resulting code is similar.

Later I sometimes optimize the D code, using many different strategies, including replacing calls to std.algorithm functions with more specialized code, replacing associative arrays with dynamic arrays, packing better the memory in several ways, and so on. Most times I don't need inline assembly, I usually gain 20X-50X-200X speed using normal D code in few hours of work.

One important strategy to increase the efficiency of the code is to remove all or most heap allocations from the critical parts of the code. The resulting code is usually longer and quite less nice looking.

I've seen that one quite significant source of performance improvement comes from replacing dynamic arrays with fixed size ones, despite the presence of some array appends.

- - - - - - - - - - - - -

So my first suggestion is to add to std.array one simple array data structure, to be used to replace some usages of dynamic arrays (this suggestion requires no changes in the D language).

Its memory is allocated in-place, so internally it uses a fixed-size array. But it also keeps inside a length, that denotes how many array items you are actually using. So it's like a dynamic array allocated in-place with a maximum length. In this array you can't insert more than 200 ints:


LimitedInplaceArray!(int, 200) data;
data.length = 10;
data[0 .. 10] = 1;
data.sort();
data ~= 2;
data[0 .. 11].sort();
data.length += 1000; // assert error (no enforce)


The slice is seen as normal dynamic array. Like with fixed-size arrays you need to be careful with slices, because their memory is sometimes in the stack frame. Increasing the length of one of those slices seems a bad idea (despite unfortunately the D type system doesn't forbid you to do that).

With a very simple data structure like that (but implemented manually), I have sometimes doubled the performance of some routines.

Internally the fixed-size array is initialized with "void" followed by something like std.array.minimallyInitializedArray.

- - - - - - - - - - - - -

A bit more flexible data structure requires a change in the D language.

In this Issue I've discussed the desire for C99-like Variable Length Arrays (I have received not enough feedback about it):
http://d.puremagic.com/issues/show_bug.cgi?id=5348

With alloca-style stack allocation capabilities you are able to create a data structure very similar to the precedent one, that is limited in size still, but where the fixed max length is now a run-time value (like the precedent data structure this one isn't really able to grow, it pre-allocates a fixed size on the stack, and that's it):


int n = 200; // run-time value
VariableLimitedInplaceArray!int data = variableLimitedInplaceArray!int(n);
data.length = 10;
data[0 .. 10] = 1;
data ~= 2;
data[0 .. 11].sort();
data.length += 1000; // assert error (no enforce)


LimitedInplaceArray is usable as member of structs/classes too, while VariableLimitedInplaceArray can only be allocated on the stack.

Bye,
bearophile

On Saturday, July 07, 2012 21:14:58 bearophile wrote: > So my first suggestion is to add to std.array one simple array data structure, to be used to replace some usages of dynamic arrays (this suggestion requires no changes in the D language). > > Its memory is allocated in-place, so internally it uses a fixed-size array. But it also keeps inside a length, that denotes how many array items you are actually using. So it's like a dynamic array allocated in-place with a maximum length. In this array you can't insert more than 200 ints: It sounds to me like you want std.container.Array with a custom allocator which uses the stack. - Jonathan M Davis

Jonathan M Davis: > It sounds to me like you want std.container.Array with a custom allocator > which uses the stack. My most presents two related requests. I don't think std.container.Array can implement the second one, that requires a change in the D language. Regarding the first suggestion, is std.container.Array able to give me a run-time assert error when I go past the memory allocated on the stack? Bye, bearophile

On Saturday, July 07, 2012 21:39:43 bearophile wrote: > Jonathan M Davis: > > It sounds to me like you want std.container.Array with a custom > > allocator > > which uses the stack. > > My most presents two related requests. I don't think std.container.Array can implement the second one, that requires a change in the D language. > > Regarding the first suggestion, is std.container.Array able to give me a run-time assert error when I go past the memory allocated on the stack? Custom allocators haven't been implemented and added to std.container yet. I don't know what they will and won't be able to do. A stack allocator was present in a previous proposal, so I would expect there to be one in the final implementation, but I don't know any details about how it will work. - Jonathan M Davis

Forums