ROR is now optimized as SHR?
September 14

I'm confused and not sure whether it's a codegen bug, but as you can see at https://godbolt.org/z/PKn4Tnzff, since LDC 1.38 a SHR is generated, whereas 1.37 generated a ROL (and not a ROR either).

Have a nice weekend ;)

September 14

On Saturday, 14 September 2024 at 11:20:26 UTC, user1234 wrote:
> I'm confused and not sure whether it's a codegen bug, but as you can see at https://godbolt.org/z/PKn4Tnzff, since LDC 1.38 a SHR is generated, whereas 1.37 generated a ROL (and not a ROR either).

What can be seen is that the optimized core.bitop.ror does use the ror instruction, but the comparison with the inlined ror can apparently be transformed by the optimizer, and LLVM 18 (LDC v1.38+) chooses a different transformation, movzx+shr, instead of LLVM 17's rol.
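
For readers wondering why a rol can stand in for a ror at all: rotating a 64-bit value right by n bits gives the same result as rotating it left by 64 - n bits. Below is a minimal D sketch of that identity. It assumes, without confirmation from the linked example, that the source compares a ulong with its own rotation by 56 bits to the right; the helper names are hypothetical.

```d
import core.bitop : rol, ror;

// Hypothetical helper: compares x with itself rotated right by 56 bits.
// The rotation count is an assumption, not taken from the linked example.
bool sameAfterRor56(ulong x)
{
    return x == ror(x, 56);
}

// Equivalent formulation: a right rotation by 56 bits of a 64-bit value
// is a left rotation by 64 - 56 = 8 bits.
bool sameAfterRol8(ulong x)
{
    return x == rol(x, 8);
}

void main()
{
    foreach (ulong x; [0UL, 1UL, 0x0101_0101_0101_0101UL,
                       0x0123_4567_89AB_CDEFUL, ulong.max])
    {
        // The two rotations agree bit for bit...
        assert(ror(x, 56) == rol(x, 8));
        // ...so both comparisons must give the same answer.
        assert(sameAfterRor56(x) == sameAfterRol8(x));
    }
}
```

So an optimizer that picks rol over ror for such a comparison has not changed the result by that choice alone; whether a further rewrite into movzx+shr preserves it is a separate question.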

September 14

On Saturday, 14 September 2024 at 11:53:14 UTC, kinke wrote:
> On Saturday, 14 September 2024 at 11:20:26 UTC, user1234 wrote:
>> I'm confused and not sure whether it's a codegen bug, but as you can see at https://godbolt.org/z/PKn4Tnzff, since LDC 1.38 a SHR is generated, whereas 1.37 generated a ROL (and not a ROR either).
>
> What can be seen is that the optimized core.bitop.ror does use the ror instruction, but the comparison with the inlined ror can apparently be transformed by the optimizer, and LLVM 18 (LDC v1.38+) chooses a different transformation, movzx+shr, instead of LLVM 17's rol.

I understand that it's not LDC's fault; sorry, but that was so weird that I needed to know. The transformation performed by LLVM is just wrong. I guess it's caused by a combination of aggressive optimization and folding. It should just generate

mov   rax, rdi
rol   rdi, 8
cmp   rax, rdi
sete  al
ret

Actually, if you use the asm that LLVM generates to write a naked asm function, you just get wrong results.
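
One way to check any candidate lowering of this comparison can be sketched as follows, assuming the source computes x == rol(x, 8) as in the listing above. Rotating by one byte leaves a 64-bit value unchanged exactly when all eight of its bytes are equal, which gives an oracle that does not depend on any rotate instruction. The function names and sample values are hypothetical.

```d
import core.bitop : rol;

// Reference semantics, assumed from the expected assembly: x == rol(x, 8).
bool unchangedByRol8(ulong x)
{
    return x == rol(x, 8);
}

// Independent oracle (hypothetical helper): a rotation by 8 bits moves every
// byte one position, so x is unchanged only if all eight bytes are equal.
bool allBytesEqual(ulong x)
{
    immutable ubyte low = cast(ubyte) x;
    foreach (i; 1 .. 8)
    {
        if (cast(ubyte)(x >> (8 * i)) != low)
            return false;
    }
    return true;
}

void main()
{
    immutable ulong[] samples = [
        0UL, 1UL, 0xFFUL, 0x0101_0101_0101_0101UL,
        0x0101_0101_0101_0100UL, 0x0123_4567_89AB_CDEFUL, ulong.max,
    ];
    // Any correct lowering must agree with the oracle on every input.
    foreach (x; samples)
        assert(unchangedByRol8(x) == allBytesEqual(x));
}
```

Swapping the body of unchangedByRol8 for a naked asm version of whatever the compiler emitted would turn this into a small differential test.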

Thanks for your time.

September 14

On Saturday, 14 September 2024 at 15:18:43 UTC, user1234 wrote:
> On Saturday, 14 September 2024 at 11:53:14 UTC, kinke wrote:
>> On Saturday, 14 September 2024 at 11:20:26 UTC, user1234 wrote:
>>> I'm confused and not sure whether it's a codegen bug, but as you can see at https://godbolt.org/z/PKn4Tnzff, since LDC 1.38 a SHR is generated, whereas 1.37 generated a ROL (and not a ROR either).
>>
>> What can be seen is that the optimized core.bitop.ror does use the ror instruction, but the comparison with the inlined ror can apparently be transformed by the optimizer, and LLVM 18 (LDC v1.38+) chooses a different transformation, movzx+shr, instead of LLVM 17's rol.
>
> I understand that it's not LDC's fault; sorry, but that was so weird that I needed to know. The transformation performed by LLVM is just wrong. I guess it's caused by a combination of aggressive optimization and folding. It should just generate
>
> mov   rax, rdi
> rol   rdi, 8
> cmp   rax, rdi
> sete  al
> ret
>
> Actually, if you use the asm that LLVM generates to write a naked asm function, you just get wrong results.
>
> Thanks for your time.

Looks like a bad bug.
https://github.com/ldc-developers/ldc/issues/4753

September 15

On 14 Sep 2024, at 23:16, Johan via digitalmars-d-ldc wrote:
> Looks like a bad bug. https://github.com/ldc-developers/ldc/issues/4753

Ah, just saw this now; I filed an upstream bug too: https://github.com/llvm/llvm-project/issues/108722

—David