August 29, 2007 Compile time loop unrolling
Has anyone done this before?
It's pretty similar to what Don's stuff does, and maybe Don is even doing this in part of Blade somewhere, but anyway it's a little different from the type of thing he's got on his web page.
Here the basic idea is to optimize templated small vector classes.
Say you've got a struct Vector(N) type. A lot of the operations look like:
    values_[0] op other.values_[0];
    values_[1] op other.values_[1];
    ...
    values_[N-1] op other.values_[N-1];
//----------------------------------------------------------------------------
import std.metastrings;

// Create a string that unrolls the given expression N times, replacing
// idx in the expression with the iteration number each time
string unroll(int N, int i = 0)(string expr, char idx = 'z') {
    static if (i < N) {
        char[] subs_expr;
        foreach (c; expr) {
            if (c == idx) {
                subs_expr ~= ToString!(i);
            } else {
                subs_expr ~= c;
            }
        }
        return subs_expr ~ "\n" ~ unroll!(N, i+1)(expr, idx);
    } else {
        // Recursion ends here once all N copies have been emitted
        return "";
    }
}
Then to use it to implement opAddAssign you write code like:
alias unroll!(N) unroll_;
void opAddAssign(ref vector_type _rhs) {
    const string expr = "values_[z] += _rhs[z];";
    //pragma(msg,unroll_(expr)); // handy for debug
    mixin( unroll_(expr) );
}
Seems to work pretty well despite the braindead strategy of "replace every 'z' with the loop number".
I suspect this would improve performance significantly when using DMD, since DMD can't inline functions that contain loops.
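For anyone trying this on a current D2 compiler, here is a minimal sketch of the same idea. The assumptions are mine, not from the code above: as far as I know std.metastrings is no longer in Phobos, so the helper is written as an ordinary function evaluated through CTFE using std.conv.to; the Vector struct, its int elements, and opOpAssign (the D2 spelling of opAddAssign) are only illustrative.

```d
import std.conv : to;

// Assumed D2 port of unroll: emits expr N times, replacing every
// occurrence of idx with the iteration number 0 .. N-1.
string unroll(int N)(string expr, char idx = 'z')
{
    string result;
    foreach (i; 0 .. N)
    {
        foreach (c; expr)
        {
            if (c == idx)
                result ~= to!string(i);
            else
                result ~= c;
        }
        result ~= "\n";
    }
    return result;
}

// The generated text can be checked at compile time.
static assert(unroll!2("a[z] += b[z];") == "a[0] += b[0];\na[1] += b[1];\n");

// Hypothetical small vector type, just enough to host the mixin.
struct Vector(int N)
{
    int[N] values_;

    ref Vector opOpAssign(string op)(const ref Vector rhs) if (op == "+")
    {
        // Expands to N straight-line statements, no loop in the body
        mixin(unroll!N("values_[z] += rhs.values_[z];"));
        return this;
    }
}

void main()
{
    Vector!3 a, b;
    a.values_ = [1, 2, 3];
    b.values_ = [10, 20, 30];
    a += b;
    assert(a.values_ == [11, 22, 33]);
}
```

The same caveat applies as with the original: every 'z' in the expression is replaced, so the expression string must not use that character for anything else.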
With D2.0 and a "static foreach(i;N)" type of construct you could probably do this by just saying:
    static foreach(i;N) {
        values_[i] += _rhs.values_[i];
    }
I wish that were coming to D1.0.
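For reference, later D2 compilers do have a `static foreach` statement (I believe from DMD 2.076 on), with the range written as `0 .. N`. A minimal sketch of the add-assign case under that assumption; the Vector struct and its int elements are made up for illustration:

```d
// Hypothetical small vector type; names are illustrative only.
struct Vector(int N)
{
    int[N] values_;

    ref Vector opOpAssign(string op)(const ref Vector rhs) if (op == "+")
    {
        // Expanded at compile time into N separate statements
        static foreach (i; 0 .. N)
        {
            values_[i] += rhs.values_[i];
        }
        return this;
    }
}

void main()
{
    Vector!4 a, b;
    a.values_ = [1, 2, 3, 4];
    b.values_ = [4, 3, 2, 1];
    a += b;
    assert(a.values_ == [5, 5, 5, 5]);
}
```

No string mixin is needed here, so the character substitution trick goes away entirely.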
--bb
August 29, 2007 Re: Compile time loop unrolling
Posted in reply to Bill Baxter
I've done loop unrolling in a few places using Tuples and foreach.
template Tuple(T...) { alias T Tuple; }

// Range!(n) is the tuple (0, 1, ..., n-1)
template Range(uint n)
{
    static if( n == 0 )
        alias Tuple!() Range;
    else
        alias Tuple!(Range!(n-1), n-1) Range;
}

void copy_four(int[] src, int[] dst)
{
    // foreach over a tuple is expanded once per element at compile time
    foreach( i,_ ; Range!(4) )
        dst[i] = src[i];
}
Which *should* unroll the loop. Note that I haven't checked the assembly to make sure of this, but since it works when you have tuples inside the loop, I'd assume that it would have to :)
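A minimal runnable sketch of the tuple approach for a current D2 compiler, assuming the newer `alias Name = ...` syntax. The `static assert` inside the loop is my addition: it only compiles if `i` is a compile-time constant, which is a cheap way to confirm the body is expanded per element without reading the assembly. The name copyFour and the test values are illustrative.

```d
template Tuple(T...) { alias Tuple = T; }

// Range!n is the compile-time sequence (0, 1, ..., n-1)
template Range(uint n)
{
    static if (n == 0)
        alias Range = Tuple!();
    else
        alias Range = Tuple!(Range!(n - 1), n - 1);
}

void copyFour(const int[] src, int[] dst)
{
    foreach (i; Range!4)
    {
        // Compiles only because i is known at compile time
        static assert(i < 4);
        dst[i] = src[i];
    }
}

void main()
{
    int[] src = [1, 2, 3, 4];
    auto dst = new int[4];
    copyFour(src, dst);
    assert(dst == [1, 2, 3, 4]);
}
```

Whether the optimizer would have unrolled an ordinary loop anyway is a separate question; this only shows that the front end expands the body once per tuple element.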
-- Daniel
Copyright © 1999-2021 by the D Language Foundation