Thread overview
LLVM codegen improvement, count bits intrinsics
Apr 27, 2019
NaN
Apr 28, 2019
Johan Engelen
Apr 30, 2019
NaN
April 27, 2019
Where do you suggest to LLVM people that codegen could be improved? The bit scan forward and reverse both test for zero and do jumps (when you want zero defined), when they could be doing conditional moves, because both instructions set the zero flag if the input is zero. Basically...

import ldc.intrinsics;
import std.stdio;

alias llvm_bsf = llvm_cttz;

void foo(int a)
{
    a = llvm_bsf(a, false);
    writeln(a);
}

compiles to this...

        test    ebx, ebx
        je      .LBB0_1
        bsf     ebx, ebx
        jmp     .LBB0_3
.LBB0_1:
        mov     ebx, 32
.LBB0_3:

where it could just be

        mov     edi,32
        bsf     ebx,ebx
        cmovz   ebx,edi


April 28, 2019
On Saturday, 27 April 2019 at 20:25:01 UTC, NaN wrote:
> Where do you suggest to LLVM people that codegen could be improved?

On their mailing list or in their bug tracker.

> The bit scan forward and reverse both test for zero and do jumps (when you want zero defined), when they could be doing conditional moves, because both instructions set the zero flag if the input is zero.

Two remarks:
1. Conditional move is not necessarily faster than branching
2. On recent CPUs, `tzcnt` is the better instruction: it has defined output for input 0

-Johan

April 30, 2019
On Sunday, 28 April 2019 at 11:22:57 UTC, Johan Engelen wrote:
> On Saturday, 27 April 2019 at 20:25:01 UTC, NaN wrote:
>>
> Two remarks:
> 1. Conditional move is not necessarily faster than branching
> 2. On recent CPUs, `tzcnt` is the better instruction: it has defined output for input 0

Unfortunately, neither my CPU nor I am very recent.