July 12, 2020
https://issues.dlang.org/show_bug.cgi?id=21041

          Issue ID: 21041
           Summary: core.bitop.byteswap(ushort) should use ROL/ROR
                    instead of XCHG
           Product: D
           Version: D2
          Hardware: x86_64
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: dmd
          Assignee: nobody@puremagic.com
          Reporter: safety0ff.bugz@gmail.com

ROL/ROR should provide better performance and impose fewer constraints on register allocation.

The claim of better performance is based on:
 - https://www.agner.org/optimize/instruction_tables.pdf
 - Looking at gcc & llvm compiler output

The only disadvantage I see is that the instruction encoding is longer.
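
For illustration, here is a minimal C sketch of the rotate form of a 16-bit byte swap. It is not the druntime implementation, and the name bswap16_rot is hypothetical. gcc and clang commonly recognise this shift/or pattern and emit a single 16-bit rotate by 8 on x86_64, though the exact output depends on compiler version and flags.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value by rotating it by 8 bits.
   For a 16-bit operand, rotating left or right by 8 gives the same result. */
static inline uint16_t bswap16_rot(uint16_t x)
{
    /* x is promoted to int before shifting; the cast discards the
       bits shifted above bit 15. */
    return (uint16_t)((x << 8) | (x >> 8));
}
```

A rotate works on any 16-bit general-purpose register, whereas an XCHG between the low and high byte of a register is only encodable for AX, BX, CX and DX.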

--