July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | I tested that; modulus is slower. The compiler is surely converting it to something branchless like:
uint iter_next = (iter + 1) * !(iter + 1 > k);
I take your point, but I think most people know that the equals operators have the lowest precedence.
|
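For readers following along, here is a minimal sketch of the three ways of advancing the index that the thread compares. The function names are hypothetical, and the modulo form (wrapping at `k + 1`) is an assumption about what was benchmarked; only the ternary and the multiply-by-bool expressions appear in the posts.

```d
// Hypothetical helper names; only the ternary and branchless expressions come from the thread.
uint nextTernary(uint iter, uint k)
{
    // Wrap back to 0 once the next index would exceed k.
    return iter + 1 > k ? 0 : iter + 1;
}

uint nextBranchless(uint iter, uint k)
{
    // !(...) yields a bool, which is promoted to 0 or 1 in the multiplication.
    return (iter + 1) * !(iter + 1 > k);
}

uint nextModulo(uint iter, uint k)
{
    // Assumed modulo form: valid indices are 0 .. k inclusive, so wrap at k + 1.
    return (iter + 1) % (k + 1);
}

void main()
{
    enum uint k = 1000;
    // All three forms should agree for every valid index.
    foreach (uint iter; 0 .. k + 1)
    {
        assert(nextTernary(iter, k) == nextBranchless(iter, k));
        assert(nextTernary(iter, k) == nextModulo(iter, k));
    }
}
```

This only checks that the forms compute the same value. Whether the compiler actually turns the ternary into branchless machine code depends on the compiler and flags, and would need to be confirmed by looking at the generated assembly.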
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | In any case, with large values of k the branch prediction will be right almost all of the time, which explains why this form is faster than modulo: modulo is fairly slow, while this is a correctly predicted branch doing an addition (if the compiler doesn't make it branchless). The branchless version gives the same time result as the branched one. Is there a way to force that line not to be optimized, to compare the predicted version? |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | ixid:
> I take your point, but I think most people know that the equals operators have the lowest precedence.
Sorry I meant:
nums[iter_next] = total % (10 ^^ 8);
Instead of:
nums[iter_next] = total % 10^^8;
But I presume a lot of people know that powers have higher precedence :-)
Bye,
bearophile
|
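A small sketch of the precedence point above, using an arbitrary illustrative value for `total` (not one from the original program). It assumes that `^^` binds tighter than `%` in D, which is what both posters agree on.

```d
import std.math; // some compiler versions require this import to use ^^ on integers

void main()
{
    enum ulong total = 123_456_789_012UL; // arbitrary illustrative value

    // ^^ binds tighter than %, so both spellings parse the same way.
    static assert(total % 10 ^^ 8 == total % (10 ^^ 8));
    static assert(total % 10 ^^ 8 == 56_789_012);

    // Forcing the other grouping gives a different result: (2) ^^ 8.
    static assert((total % 10) ^^ 8 == 256);
}
```

The parentheses in `total % (10 ^^ 8)` therefore change nothing for the compiler; they only make the intent obvious to a reader.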
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | Oops! I have a bad habit of thinking of the power operator as a part of the value rather than as an operator. |
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to ixid | ixid:
> In any case, with large values of k the branch prediction will be right almost all of the time, which explains why this form is faster than modulo: modulo is fairly slow, while this is a correctly predicted branch doing an addition (if the compiler doesn't make it branchless).
That seems to be the explanation.
> The branchless version gives the same time result as the branched one. Is there a way to force that line not to be optimized, to compare the predicted version?
I don't fully understand the question. Do you mean annotations like the __builtin_expect of GCC?
Bye,
bearophile
|
July 03, 2012 Re: popFront causing more memory to be used | ||||
---|---|---|---|---|
| ||||
Posted in reply to bearophile | On Tuesday, 3 July 2012 at 17:25:18 UTC, bearophile wrote:
> ixid:
>
>> In any case, with large values of k the branch prediction will be right almost all of the time, which explains why this form is faster than modulo: modulo is fairly slow, while this is a correctly predicted branch doing an addition (if the compiler doesn't make it branchless).
>
> That seems to be the explanation.
>
>
>> The branchless version gives the same time result as the branched one. Is there a way to force that line not to be optimized, to compare the predicted version?
>
> I don't fully understand the question. Do you mean annotations like the __builtin_expect of GCC?
>
> Bye,
> bearophile
If
uint iter_next = iter + 1 > k? 0 : iter + 1;
is getting optimized to
uint iter_next = (iter + 1) * !(iter + 1 > k);
or something like it by the compiler, then it would be nice to be able to test the branched code without having the rest of the program lose its speed optimizations. As I said, for large k the branch will almost always be correctly predicted, which makes me think it would be faster than the branchless version.
|
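One way to compare the variants in isolation is a small timing harness along these lines. This is only a sketch: the names `run`, `stepTernary`, `stepBranchless` and `stepModulo` are hypothetical, the values of `k` and `n` are arbitrary, and the modulo form is assumed. It does not force the compiler to keep a branch, so the generated assembly still has to be inspected to know what is really being timed.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical step functions; the ternary and branchless bodies are from the thread.
uint stepTernary(uint iter, uint k)    { return iter + 1 > k ? 0 : iter + 1; }
uint stepBranchless(uint iter, uint k) { return (iter + 1) * !(iter + 1 > k); }
uint stepModulo(uint iter, uint k)     { return (iter + 1) % (k + 1); } // assumed form

// Applies `step` n times, feeding each result back in, and returns the final
// index so the result of the loop is actually used by the caller.
uint run(alias step)(uint k, size_t n)
{
    uint iter = 0;
    foreach (_; 0 .. n)
        iter = step(iter, k);
    return iter;
}

void main()
{
    enum uint k = 1_000_000;      // arbitrary "large k"
    enum size_t n = 100_000_000;  // arbitrary iteration count

    auto sw = StopWatch(AutoStart.yes);
    const a = run!stepTernary(k, n);
    writefln("ternary:    %s ms (result %s)", sw.peek.total!"msecs", a);

    sw.reset();
    const b = run!stepBranchless(k, n);
    writefln("branchless: %s ms (result %s)", sw.peek.total!"msecs", b);

    sw.reset();
    const c = run!stepModulo(k, n);
    writefln("modulo:     %s ms (result %s)", sw.peek.total!"msecs", c);
}
```

Keeping each variant in its own function lets the rest of the program stay fully optimized while only the step under test is swapped.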