0 is not a power of 2

So there's this classic trick: bool isPowerOf2(uint x) { return (x & (x - 1)) == 0; } Pretty neat, but it wrongly returns true for x == 0. So the obvious fix is: bool isPowerOf2(uint x) { return x && (x & (x - 1)) == 0; } But that has branches in it. So I came up with: bool isPowerOf2(uint x) { return (x & (x - 1) | !x) == 0; } which has no branches at least with dmd, see http://goo.gl/TVkCwc. Any ideas for faster code? Thanks, Andrei

On Tuesday, 19 May 2015 at 05:16:48 UTC, Andrei Alexandrescu wrote: > Any ideas for faster code? Unless I'm mistaken, any uint that's a power of 2 will only have a single set bit, so why not use the "popcnt" instruction? http://dlang.org/phobos/core_bitop.html#.popcnt

On Mon, May 18, 2015 at 10:16:47PM -0700, Andrei Alexandrescu via Digitalmars-d wrote: [...] > bool isPowerOf2(uint x) > { > return (x & (x - 1) | !x) == 0; > } [...] Are you sure that's correct? Doesn't that return true for all non-zero numbers? T -- "Uhh, I'm still not here." -- KD, while "away" on ICQ.

On 5/18/15 10:37 PM, H. S. Teoh via Digitalmars-d wrote: > On Mon, May 18, 2015 at 10:16:47PM -0700, Andrei Alexandrescu via Digitalmars-d wrote: > [...] >> bool isPowerOf2(uint x) >> { >> return (x & (x - 1) | !x) == 0; >> } > [...] > > Are you sure that's correct? Doesn't that return true for all non-zero > numbers? Excerpt from std.experimental.allocator.common: package bool isPowerOf2(uint x) { return (x & (x - 1) | !x) == 0; } unittest { assert(!isPowerOf2(0)); assert(isPowerOf2(1)); assert(isPowerOf2(2)); assert(!isPowerOf2(3)); assert(isPowerOf2(4)); assert(!isPowerOf2(5)); assert(!isPowerOf2(6)); assert(!isPowerOf2(7)); assert(isPowerOf2(8)); assert(!isPowerOf2(9)); assert(!isPowerOf2(10)); } Andrei

On 5/18/15 10:40 PM, Brian Schott wrote: > On Tuesday, 19 May 2015 at 05:16:48 UTC, Andrei Alexandrescu wrote: >> Any ideas for faster code? > > Unless I'm mistaken, any uint that's a power of 2 will only have a > single set bit, so why not use the "popcnt" instruction? > http://dlang.org/phobos/core_bitop.html#.popcnt There are complaints it's not that fast, e.g. http://danluu.com/assembly-intrinsics/. -- Andrei

Aren't predictable branches cheap on current architectures? Atila On Tuesday, 19 May 2015 at 05:16:48 UTC, Andrei Alexandrescu wrote: > So there's this classic trick: > > bool isPowerOf2(uint x) > { > return (x & (x - 1)) == 0; > } > > Pretty neat, but it wrongly returns true for x == 0. So the obvious fix is: > > bool isPowerOf2(uint x) > { > return x && (x & (x - 1)) == 0; > } > > But that has branches in it. So I came up with: > > bool isPowerOf2(uint x) > { > return (x & (x - 1) | !x) == 0; > } > > which has no branches at least with dmd, see http://goo.gl/TVkCwc. > > Any ideas for faster code? > > > Thanks, > > Andrei

May 19, 2015

Re: 0 is not a power of 2

Posted by John Colvin
in reply to Andrei Alexandrescu

Permalink

John Colvin

Posted in reply to Andrei Alexandrescu

Permalink

On Tuesday, 19 May 2015 at 05:16:48 UTC, Andrei Alexandrescu wrote:
> So there's this classic trick:
>
> bool isPowerOf2(uint x)
> {
>     return (x & (x - 1)) == 0;
> }
>
> Pretty neat, but it wrongly returns true for x == 0. So the obvious fix is:
>
> bool isPowerOf2(uint x)
> {
>     return x && (x & (x - 1)) == 0;
> }
>
> But that has branches in it. So I came up with:
>
> bool isPowerOf2(uint x)
> {
>     return (x & (x - 1) | !x) == 0;
> }
>
> which has no branches at least with dmd, see http://goo.gl/TVkCwc.
>
> Any ideas for faster code?
>
>
> Thanks,
>
> Andrei

I tested with a few different (modern) backends to see what was generated, they all essentially give you this (gcc 5.1.0 -O3 -march=broadwell):

isPowerOf2:
	xorl	%eax, %eax
	testl	%edi, %edi
	je	.L5
	blsr	%edi, %edi
	testl	%edi, %edi
	sete	%al
.L5:
	ret
isPowerOf2b:
	blsr	%edi, %edx
	xorl	%eax, %eax
	testl	%edi, %edi
	sete	%al
	orl	%eax, %edx
	sete	%al
	ret

but if your not restricting the build to such recent processor, you can't emit BLSR, so you get this instead (gcc 5.1.0 -O3):

isPowerOf2b:
	leal	-1(%rdi), %eax
	xorl	%edx, %edx
	andl	%edi, %eax
	testl	%edi, %edi
	sete	%dl
	orl	%edx, %eax
	sete	%al
	ret

On Tuesday, 19 May 2015 at 05:16:48 UTC, Andrei Alexandrescu wrote: > So there's this classic trick: > > bool isPowerOf2(uint x) > { > return (x & (x - 1)) == 0; > } > > which has no branches at least with dmd, see http://goo.gl/TVkCwc. > > Any ideas for faster code? If you dont mind asm then after you do... tmp = x-1; you could add the borrow/carry flag back onto the tmp, so it'd add back up to zero any time there's an underflow in the (x-1) op. So two extra instructions, (you need a zero for the ADC) no branch.

On Tuesday, 19 May 2015 at 07:56:27 UTC, Atila Neves wrote: > Aren't predictable branches cheap on current architectures? Yes they are, and it seems one would rarely if ever call isPowOf2 with 0 in std.allocator. A good thing to do, is to use a good hardware event profiler like perf, and record the branch misses. Perf is also the right tool to compare the branchless version (I doubt it's significantly better, especially since the felt of the 0 branch is just a constant).

I believe you can also do x & -x == x. I'm not sure if that will be actually faster or slower. You could maybe cut the instructions down a little with an asm{} block. The compiler might not figure out that it can re-use a register for x on the left and x on the right there. You might use popcnt in a version() block too, so you can use the instruction when you've got it.

Forums