May 21, 2015
On Thursday, 21 May 2015 at 13:24:57 UTC, Steven Schveighoffer wrote:
> Gah, I messed up, used output from old code that wasn't doing what I thought it was. You are right about that.
>
> But my whole point there was that x >> bsr(x) is ALWAYS 1.
>
> 2 >> bsr(2) == 1
> 3 >> bsr(3) == 1
> 4 >> bsr(4) == 1
> 17 >> bsr(17) == 1
>
> So really, your test checks to see if a value is zero or not, not whether it's a power of 2.
>
> BUT, the opposite mechanism would work:
>
> 1 << bsr(x) == x;
>

Ha yes. You'd want to use TZCNT.

Alternatively, with bsf on could do:

x << bsf(x) == 1 << [32|64]

>> 0 >> anything is 0.
>
> Hah, good point :) Even if bsr(0) is undefined it doesn't matter. Didn't think of that.
>
> But that means the opposite solution I mentioned above, doesn't work, still back to square one.
>
> -Steve

Well no, there all needed to make it work :) Still no idea if this is actually faster, but there is one less operation to perform.
May 21, 2015
On 5/21/15 4:46 PM, deadalnix wrote:

> Alternatively, with bsf on could do:
>
> x << bsf(x) == 1 << [32|64]

Hm... bsf does work in your original code, I'm thinking you may have messed up bsf and bsr :)

x >> bsf(x) == 1

Which makes a LOT more sense (I tested and it works).

Don't ask me why I knew bsr vs. bsf...

-Steve
May 22, 2015
This formula measures  a little faster on dmd.  Release build, three tests, find all values for 0..uint.max.

first result uses
if (((x-1)&(x|0x80000000))==0)

second result uses
if ((x & (x - 1) | !x) == 0)


D:\pow2\pow2\pow2\Release>pow2
duration(msec)=10259
duration(msec)=10689

D:\pow2\pow2\pow2\Release>pow2
duration(msec)=10256
duration(msec)=10695

D:\pow2\pow2\pow2\Release>pow2
duration(msec)=10264
duration(msec)=10726
May 22, 2015
On Friday, 22 May 2015 at 05:24:15 UTC, Jay Norwood wrote:

> first result uses
> if (((x-1)&(x|0x80000000))==0)

00F81005  mov         eax,edx
00F81007  lea         ecx,[edx-1]
00F8100A  or          eax,80000000h
00F8100F  test        ecx,eax

Above is what a Microsoft C++ compiler does with the first expression.  It gets inserted in-line in the test.
1 2 3 4 5
Next ›   Last »