dmd optimizer now converted to D!

On 7/4/2018 10:22 AM, H. S. Teoh wrote:
> Actually, what will make dmd produce better code IMO is: (1) a more aggressive metric for the inliner (currently it gives up too easily, at the slightest increase in code complexity), and (2) implement loop unrolling.

It's already doing some loop unrolling (added recently):

https://github.com/dlang/dmd/blob/master/src/dmd/backend/gloop.d#L3763

There's still room for improvement there; this is a first stab at it.

July 05, 2018

Re: dmd optimizer now converted to D!

Posted by Ivan Kazmenko
in reply to H. S. Teoh

On Wednesday, 4 July 2018 at 17:22:22 UTC, H. S. Teoh wrote:
> ... dmd *is* capable of things like strength reduction and code lifting, but as Walter himself has said, it does *not* implement loop unrolling.

Ow!  I always thought it did loop unrolling in some cases, I was just never lucky when I checked.  And now you and Walter say its implementation started only recently.

Good to know the actual state of things.  Manual loop unrolling did help me a couple of times with C++ and D.

-----

By the way, what's a relatively painless way to manually unroll a loop in D?  As a simple example, consider:

    for (int i = 0; i < 4 * n; i++)
        a[i] += i;

With C[++], I simply did this:

    for (int j = 0; j < 4 * n; j += 4) {
#define doit(i) a[i] += i
        doit(j + 0);
        doit(j + 1);
        doit(j + 2);
        doit(j + 3);
    }

This looks long, but on the positive side, it does not actually alter the expression: however complex and obscure the "a[i] += i" would be in a real example, it can remain untouched.

With D, I used mixins, and they were cumbersome.  Now that we have static foreach, it's just this:

    for (int i = 0; i < 4 * n; i += 4)
        static foreach (k; 0..4)
            a[i + k] += i + k;

This looks very nice to me, but still not ideal: a static-foreach argument cannot encapsulate a runtime variable, so we have to repeat "i + k" twice.  This can get cumbersome for a more complex example.  Is there any better way?  To prevent introducing bugs when micro-optimizing, I'd like the loop body to remain as unchanged as it can be.

Ivan Kazmenko.

On Thursday, 5 July 2018 at 12:50:18 UTC, Ivan Kazmenko wrote:
> With D, I used mixins, and they were cumbersome.  Now that we have static foreach, it's just this:
>
>     for (int i = 0; i < 4 * n; i += 4)
>         static foreach (k; 0..4)
>             a[i + k] += i + k;
>
> This looks very nice to me, but still not ideal: a static-foreach argument cannot encapsulate a runtime variable, so we have to repeat "i + k" twice.  This can get cumbersome for a more complex example.  Is there any better way?  To prevent introducing bugs when micro-optimizing, I'd like the loop body to remain as unchanged as it can be.
>
> Ivan Kazmenko.

FYI: you can introduce scopes with static foreach to declare new variables:

    for (int i = 0; i < 4 * n; i += 4)
    {
        static foreach (k; 0..4)
        {{
            auto idx = i + k;
            a[idx] += idx;
        }}
    }

However, LDC is pretty good at loop unrolling out of the box:

https://godbolt.org/g/4nSWzQ

(even though gdc is written there, it's "ldc" - known typo: https://github.com/mattgodbolt/compiler-explorer/pull/988)
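For anyone who wants to try the double-brace form from this reply, here is a minimal self-contained sketch. The function name `addIndexUnrolled` and the test data are made up for illustration, and it assumes the array length is a multiple of 4, as in the original `4 * n` loop. The outer braces only delimit the `static foreach` body; the inner braces open a real scope, so each of the four pasted copies gets its own `idx`.

```d
import std.stdio : writeln;

// Hypothetical helper: adds each element's index to it, four elements per
// iteration. Assumes a.length is a multiple of 4.
void addIndexUnrolled(int[] a)
{
    assert(a.length % 4 == 0);
    const n = cast(int) a.length;
    for (int i = 0; i < n; i += 4)
    {
        static foreach (k; 0 .. 4)
        {{
            // Without the inner braces, the four copies would all declare
            // `idx` in the same scope and the compiler would reject it.
            const idx = i + k;
            a[idx] += idx;
        }}
    }
}

void main()
{
    auto a = new int[](8);          // zero-initialized
    addIndexUnrolled(a);

    // Reference result from the plain, non-unrolled loop.
    auto expected = new int[](8);
    foreach (i, ref x; expected)
        x += cast(int) i;

    assert(a == expected);
    writeln(a); // prints [0, 1, 2, 3, 4, 5, 6, 7]
}
```

The loop body `a[idx] += idx` keeps the shape of the original `a[i] += i`, which is the property asked for above; whether the extra local costs anything depends on the compiler and flags, so it is worth checking the generated code.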

On Thursday, 5 July 2018 at 12:50:18 UTC, Ivan Kazmenko wrote:
> Is there any better way?  To prevent introducing bugs when micro-optimizing, I'd like the loop body to remain as unchanged as it can be.

    foreach(j, ref piece; cast(int[4][]) a)
    {   auto pieceI = j * 4;
        static foreach(i; 0 .. piece.length) piece[i] = pieceI + i;
    }

Can probably be made even better by designing some template helper.
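A runnable sketch of the same idea, with a few assumptions spelled out: the function name `addIndexByChunks` and the constant `width` are illustrative, the slice length must be a multiple of 4 (otherwise the array cast fails at run time), and the body uses `+=` to match the original `a[i] += i` loop rather than the plain assignment in the snippet above. The explicit `cast(int)` is there because the chunk index `j` is a `size_t`, which does not implicitly convert to `int` on 64-bit targets.

```d
import std.stdio : writeln;

enum width = 4; // unroll factor, assumed for this sketch

// Hypothetical helper: views the slice as fixed-size chunks and updates
// each chunk with a compile-time-unrolled inner loop.
void addIndexByChunks(int[] a)
{
    assert(a.length % width == 0);
    // Reinterprets the same memory as a slice of int[width]; no copy is made.
    foreach (j, ref piece; cast(int[width][]) a)
    {
        const base = cast(int) (j * width); // index of the chunk's first element
        static foreach (k; 0 .. width)
            piece[k] += base + k;
    }
}

void main()
{
    auto a = new int[](8); // zero-initialized
    addIndexByChunks(a);
    writeln(a); // prints [0, 1, 2, 3, 4, 5, 6, 7]
}
```

Because `piece` is a reference into the original array, the updates land in `a` directly.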

On Thursday, 5 July 2018 at 14:05:42 UTC, Seb wrote:
> FYI: you can introduce scopes with static foreach to declare new variables:
>
>     for (int i = 0; i < 4 * n; i += 4)
>     {
>         static foreach (k; 0..4)
>         {{
>             auto idx = i + k;
>             a[idx] += idx;
>         }}
>     }

Thanks!  The two parentheses trick is nice.

Generally, I was reluctant to declare a variable because, well, micro-optimizing means being dissatisfied with compiler optimization.  So the mindset didn't allow me to just go and declare a variable in the innermost loop, in fear that the optimizer might not optimize the allocation away.

On Thursday, 5 July 2018 at 14:30:05 UTC, Dukc wrote:
>     foreach(j, ref piece; cast(int[4][]) a)
>     {   auto pieceI = j * 4;
>         static foreach(i; 0 .. piece.length) piece[i] = pieceI + i;
>     }
>
> Can probably be made even better by designing some template helper.

Thanks!  The cast to an array of int[4]s is just hilarious.
