April 24, 2016
On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:
> On 04/24/2016 02:57 AM, deadalnix wrote:
>> On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:
>>> Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1)
>>> over it.
>>
>> I'm not sure why do you test against x - 1 when you could test for
>> equality. Not only it looks like it is going to require an extra
>> computation (x - 1) but also it doesn't work for 0.
>
> So if you do (x & -x) == x that returns 1 for x == 0. For many applications you
> want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the
> right hand side is a large number. -- Andrei

This sort of stuff should go on wiki.dlang.org page!
April 24, 2016
On Sunday, 24 April 2016 at 21:45:32 UTC, Walter Bright wrote:
> On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:
>> On 04/24/2016 02:57 AM, deadalnix wrote:
>>> On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:
>>>> Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1)
>>>> over it.
>>>
>>> I'm not sure why do you test against x - 1 when you could test for
>>> equality. Not only it looks like it is going to require an extra
>>> computation (x - 1) but also it doesn't work for 0.
>>
>> So if you do (x & -x) == x that returns 1 for x == 0. For many applications you
>> want to yield 0. So you test for (x & -x) > (x - 1) such that for x == 0 the
>> right hand side is a large number. -- Andrei
>
> This sort of stuff should go on wiki.dlang.org page!

Please no cmp.
Just
bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }
April 24, 2016
On Sunday, 24 April 2016 at 23:00:56 UTC, Temtaime wrote:
> Please no cmp.
> Just
> bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }

You do realise that this will (typically) emit a branch?

 — David
April 25, 2016
On Sunday, 24 April 2016 at 23:17:53 UTC, David Nadlinger wrote:
> On Sunday, 24 April 2016 at 23:00:56 UTC, Temtaime wrote:
>> Please no cmp.
>> Just
>> bool isPowerOf2(uint x) { return x && !(x & (x - 1)); }
>
> You do realise that this will (typically) emit a branch?
>
>  — David

I compiled using dmd -O and looked at the disassembly using objdump -D and there are indeed a couple branches in the disassembly:

    0000000000000000 <_D4test8powerOf2FkZk>:
       0:   50                      push   %rax
       1:   48 89 f9                mov    %rdi,%rcx
       4:   85 c9                   test   %ecx,%ecx
       6:   74 0a                   je     12 <_D4test8powerOf2FkZk+0x12>
       8:   8d 81 ff ff ff ff       lea    -0x1(%rcx),%eax
       e:   85 c1                   test   %eax,%ecx
      10:   74 04                   je     16 <_D4test8powerOf2FkZk+0x16>
      12:   31 c0                   xor    %eax,%eax
      14:   eb 05                   jmp    1b <_D4test8powerOf2FkZk+0x1b>
      16:   b8 01 00 00 00          mov    $0x1,%eax
      1b:   59                      pop    %rcx
      1c:   c3                      retq
      1d:   0f 1f 00                nopl   (%rax)


I modified David's solution a bit to (hopefully) eliminate the branch:
bool powerOf2(uint x){ return !x ^ !(x & (x - 1)); }

    0000000000000000 <_D4test8powerOf2FkZb>:
       0:   50                      push   %rax
       1:   53                      push   %rbx
       2:   48 89 fa                mov    %rdi,%rdx
       5:   85 d2                   test   %edx,%edx
       7:   0f 94 c0                sete   %al
       a:   8d 8a ff ff ff ff       lea    -0x1(%rdx),%ecx
      10:   85 ca                   test   %ecx,%edx
      12:   0f 94 c3                sete   %bl
      15:   32 c3                   xor    %bl,%al
      17:   5b                      pop    %rbx
      18:   59                      pop    %rcx
      19:   c3                      retq
      1a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

The branches are gone but I'm really not sure if this is better or worse.
April 25, 2016
On Monday, 25 April 2016 at 01:17:48 UTC, Xinok wrote:
> ...

Sorry, didn't mean to say David's solution. Too many edits. >_<
April 25, 2016
On 4/24/16 9:17 PM, Xinok wrote:
> I modified David's solution a bit to (hopefully) eliminate the branch:
> bool powerOf2(uint x){ return !x ^ !(x & (x - 1)); }
>
>      0000000000000000 <_D4test8powerOf2FkZb>:
>         0:   50                      push   %rax
>         1:   53                      push   %rbx
>         2:   48 89 fa                mov    %rdi,%rdx
>         5:   85 d2                   test   %edx,%edx
>         7:   0f 94 c0                sete   %al
>         a:   8d 8a ff ff ff ff       lea    -0x1(%rdx),%ecx
>        10:   85 ca                   test   %ecx,%edx
>        12:   0f 94 c3                sete   %bl
>        15:   32 c3                   xor    %bl,%al
>        17:   5b                      pop    %rbx
>        18:   59                      pop    %rcx
>        19:   c3                      retq
>        1a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
>
> The branches are gone but I'm really not sure if this is better or worse.

With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- Andrei
April 25, 2016
On 4/24/16 7:00 PM, Temtaime wrote:
> On Sunday, 24 April 2016 at 21:45:32 UTC, Walter Bright wrote:
>> On 4/24/2016 8:33 AM, Andrei Alexandrescu wrote:
>>> On 04/24/2016 02:57 AM, deadalnix wrote:
>>>> On Saturday, 23 April 2016 at 15:29:08 UTC, Andrei Alexandrescu wrote:
>>>>> Yah, that's the canonical. I forgot why I chose (x & -x) > (x - 1)
>>>>> over it.
>>>>
>>>> I'm not sure why do you test against x - 1 when you could test for
>>>> equality. Not only it looks like it is going to require an extra
>>>> computation (x - 1) but also it doesn't work for 0.
>>>
>>> So if you do (x & -x) == x that returns 1 for x == 0. For many
>>> applications you
>>> want to yield 0. So you test for (x & -x) > (x - 1) such that for x
>>> == 0 the
>>> right hand side is a large number. -- Andrei
>>
>> This sort of stuff should go on wiki.dlang.org page!
>
> Please no cmp.

Why? -- Andrei

April 25, 2016
On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:
>
> With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest code, simplest instructions). -- Andrei

I generalized function isPow2B to work for negative numbers in case of a signed type, while keeping the assembly the same or better. (That also incidentally made it correctly return false for zero.)

>bool isPow2B(T)(T x)
>{
>    return (x & -x) > (x - 1);
>}

bool isPow2F(T)(T x)
{
	return (x & -x) > (x >>> 1);
}

assembly diff:
--        subl    $1, %edi
++        shrl    %edi

April 25, 2016
On 4/25/16 6:42 AM, Solomon E wrote:
> On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:
>>
>> With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest
>> code, simplest instructions). -- Andrei
>
> I generalized function isPow2B to work for negative numbers in case of a
> signed type, while keeping the assembly the same or better. (That also
> incidentally made it correctly return false for zero.)
>
>> bool isPow2B(T)(T x)
>> {
>>    return (x & -x) > (x - 1);
>> }
>
> bool isPow2F(T)(T x)
> {
>      return (x & -x) > (x >>> 1);
> }
>
> assembly diff:
> --        subl    $1, %edi
> ++        shrl    %edi
>

That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei
April 25, 2016
On Monday, 25 April 2016 at 12:56:27 UTC, Andrei Alexandrescu wrote:
> On 4/25/16 6:42 AM, Solomon E wrote:
>> On Monday, 25 April 2016 at 05:35:12 UTC, Andrei Alexandrescu wrote:
>>>
>>> With gdc https://godbolt.org/g/jcU4np isPow2B is the winner (shortest
>>> code, simplest instructions). -- Andrei
>>
>> I generalized function isPow2B to work for negative numbers in case of a
>> signed type, while keeping the assembly the same or better. (That also
>> incidentally made it correctly return false for zero.)
>>
>>> bool isPow2B(T)(T x)
>>> {
>>>    return (x & -x) > (x - 1);
>>> }
>>
>> bool isPow2F(T)(T x)
>> {
>>      return (x & -x) > (x >>> 1);
>> }
>>
>> assembly diff:
>> --        subl    $1, %edi
>> ++        shrl    %edi
>>
>
> That's smaller and faster, thanks. But how do you show it is still correct? -- Andrei

Brute force.

http://dpaste.dzfl.pl/882d7cdc5f74