[Issue 21041] core.bitop.byteswap(ushort) should use ROL/ROR instead of XCHG

https://issues.dlang.org/show_bug.cgi?id=21041

safety0ff.bugz <safety0ff.bugz@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |performance

--

July 12, 2020

Posted by Bruce Carneal

https://issues.dlang.org/show_bug.cgi?id=21041

Bruce Carneal <bcarneal11@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bcarneal11@gmail.com

--- Comment #1 from Bruce Carneal <bcarneal11@gmail.com> ---
I didn't find a 'byteswap' in the core.bitop documentation.  There is a bswap but only for uints and ulongs AFAICT.  Regardless, here's a byteswap implementation for discussion:

auto byteswap(ushort x) { return cast(ushort)(x >> 8 | x << 8); }

For the above code ldc at -O or above generates:
  movl %edi, %eax
  rolw $8, %ax
  retq

With ldc you can also get the above sequence using core.bitop.rol!8 explicitly.

Current dmd -O emits 7 instructions to accomplish the rolw in the code body. The code emitted by dmd -O for the explicit call to core.bitop.rol is even worse, which is strange.

So, yes, there's room here for DMD code gen improvement but ldc is right there.

--
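
For reference, a minimal sketch of the two forms mentioned in comment #1, side by side: the shift-and-or expression and the compile-time-count overload of core.bitop.rol. The function names byteswapShift and byteswapRol are hypothetical, chosen only for this illustration; what a given compiler emits for either one is not something the sketch asserts.

```d
import core.bitop : rol;

// Shift-and-or form, as posted in comment #1.
ushort byteswapShift(ushort x) { return cast(ushort)(x >> 8 | x << 8); }

// The same operation written as an explicit 8-bit rotate of a 16-bit value.
ushort byteswapRol(ushort x) { return rol!8(x); }

void main()
{
    assert(byteswapShift(0x1234) == 0x3412);
    assert(byteswapRol(0x1234) == 0x3412);

    // The two forms agree for every 16-bit input.
    foreach (i; 0 .. 0x1_0000)
    {
        auto x = cast(ushort) i;
        assert(byteswapShift(x) == byteswapRol(x));
    }
}
```

Rotating a 16-bit value by 8 in either direction gives the same result, which is why the issue title can say ROL/ROR interchangeably.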

https://issues.dlang.org/show_bug.cgi?id=21041

--- Comment #2 from safety0ff.bugz <safety0ff.bugz@gmail.com> ---
(In reply to Bruce Carneal from comment #1)
> I didn't find a 'byteswap' in the core.bitop documentation.  There is a bswap but only for uints and ulongs AFAICT.

The intrinsic in question was added in the master branch here:
https://github.com/dlang/dmd/pull/11388

Also the 64 bit version is to be added here:
https://github.com/dlang/dmd/pull/11408

> For the above code ldc at -O or above generates:
>   movl %edi, %eax
>   rolw $8, %ax
>   retq

I'd expect that since C/C++ clang emit that.

--

https://issues.dlang.org/show_bug.cgi?id=21041

--- Comment #3 from safety0ff.bugz <safety0ff.bugz@gmail.com> ---
(In reply to Bruce Carneal from comment #1)
> Current dmd -O emits 7 instructions to accomplish the rolw in the code body.

D converts many operations on narrow types to int, which DMD's backend then fails to optimize away when it is possible/advantageous.

--
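
The promotion described in comment #3 can be seen at the language level with a small sketch. The helper names swapNoCast and swapWithCast are hypothetical and exist only for this illustration. The sketch shows only that the shift operands are widened to int and why the cast back to ushort is needed; it says nothing about what any backend does with the widened expression.

```d
// Hypothetical helpers, named only for this illustration.
// Without the cast, the expression keeps its promoted int type.
int swapNoCast(ushort x) { return x >> 8 | x << 8; }

// The cast truncates the int result back to 16 bits.
ushort swapWithCast(ushort x) { return cast(ushort)(x >> 8 | x << 8); }

void main()
{
    ushort x = 0x1234;

    // Integer promotion: shifting a ushort yields an int.
    static assert(is(typeof(x << 8) == int));
    static assert(is(typeof(x >> 8 | x << 8) == int));

    // The left shift spills above bit 15 in the int result...
    assert(swapNoCast(x) == 0x12_3412);
    // ...and the cast discards those upper bits.
    assert(swapWithCast(x) == 0x3412);
}
```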

https://issues.dlang.org/show_bug.cgi?id=21041

--- Comment #4 from safety0ff.bugz <safety0ff.bugz@gmail.com> ---
(In reply to safety0ff.bugz from comment #3)
> (In reply to Bruce Carneal from comment #1)
> > Current dmd -O emits 7 instructions to accomplish the rolw in the code body.
>
> D converts many operations on narrow types to int, which DMD's backend then fails to optimize away when it is possible/advantageous.

Further investigation:
dmd/backend/cod2.d function cdshift also converts rotates of 8 in upper/lower 8 of word into XCHG's

--

https://issues.dlang.org/show_bug.cgi?id=21041

--- Comment #5 from Bruce Carneal <bcarneal11@gmail.com> ---
(In reply to safety0ff.bugz from comment #3)
> (In reply to Bruce Carneal from comment #1)
> > Current dmd -O emits 7 instructions to accomplish the rolw in the code body.
>
> D converts many operations on narrow types to int, which DMD's backend then fails to optimize away when it is possible/advantageous.

Yes.  DMD's back end is quick, but the code it generates is not state-of-the-art.

That said, optimizing the DMD code gen for core.bitop rotations seems more useful than a ushort byteswap improvement.  The latter could be implemented as an "inline" of the former.

Recognizing the rotation patterns generally, a la LLVM, would be even better but quite a bit of work, I'd imagine.  Probably not worth it given current resource constraints (Walter's time).  Lots of big front-end fish to fry.

--

https://issues.dlang.org/show_bug.cgi?id=21041

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |backend
                 CC|                            |ibuclaw@gdcproject.org

--
