[Issue 23049] [SIMD][CODEGEN] Wrong code for XMM.RCPSS after inlining
April 23, 2022
https://issues.dlang.org/show_bug.cgi?id=23049

ponce <aliloko@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |backend, SIMD, wrong-code

--
April 24, 2022
https://issues.dlang.org/show_bug.cgi?id=23049

Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla@digitalmars.com

--- Comment #1 from Walter Bright <bugzilla@digitalmars.com> ---
I finally figured out what was going on here. The code generated is:

    float4 A = [2.34f, -70000.0f, 0.00001f, 345.5f];
                movaps  XMM0,FLAT:.rodata[00h][RIP]
                movaps  -020h[RBP],XMM0

    float4 R = cast(float4) __simd(XMM.RCPSS, A);
                rcpss   XMM1,-020h[RBP]      (*)
                movaps  -010h[RBP],XMM1

    assert(R.array[1] == -70000.0f);
                movss   XMM2,-0Ch[RBP]
                ...

(*) rcpss writes only the lower 4 bytes of XMM1 and leaves the rest of XMM1 unchanged. The compiler, however, treats the assignment as having written all of XMM1, even though it didn't. Hence, the upper 12 bytes of XMM1 are garbage.
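To make the lane behaviour concrete, here is a minimal scalar model of the two-operand instruction form, written as a sketch rather than as what the backend does. The function name rcpssModel and the "stale" register contents are hypothetical, and an exact division stands in for the hardware's approximate reciprocal (the input uses 2.0f so the result is exact).

```d
// Scalar model of `rcpss dst, src`: only lane 0 of dst is written.
// Assumption: exact 1/x stands in for the hardware approximation.
float[4] rcpssModel(float[4] dst, float[4] src)
{
    dst[0] = 1.0f / src[0]; // low lane receives the reciprocal
    return dst;             // lanes 1..3 are whatever dst already held
}

void main()
{
    float[4] A = [2.0f, -70000.0f, 0.00001f, 345.5f];

    // Hypothetical leftover contents of the destination register.
    float[4] stale = [9.0f, 9.0f, 9.0f, 9.0f];

    auto bad = rcpssModel(stale, A);
    assert(bad[0] == 0.5f);
    assert(bad[1] == 9.0f);       // upper lanes are NOT taken from A

    // Seeding the destination with A first preserves the upper lanes.
    auto good = rcpssModel(A, A);
    assert(good[0] == 0.5f);
    assert(good[1] == -70000.0f);
}
```

The second call is the scalar analogue of passing the destination explicitly to __simd.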

You can make it work by explicitly passing the implicit argument:

    float4 R = A;
    R = cast(float4) __simd(XMM.RCPSS, R, A);
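As a complete program, the workaround might look like the sketch below. It assumes DMD on a target where version D_SIMD is defined. The helper name rcpLane0 is hypothetical, and the tolerance on lane 0 is an assumption based on rcpss returning an approximate reciprocal.

```d
import core.simd;

version (D_SIMD)
{
    // Hypothetical helper: reciprocal of lane 0, lanes 1..3 copied from `a`.
    float4 rcpLane0(float4 a)
    {
        float4 r = a;                             // seed the destination with a
        r = cast(float4) __simd(XMM.RCPSS, r, a); // rcpss overwrites lane 0 only
        return r;
    }
}

void main()
{
    version (D_SIMD)
    {
        float4 A = [2.34f, -70000.0f, 0.00001f, 345.5f];
        float4 R = rcpLane0(A);

        // Lane 0 is an approximation, so compare with a loose tolerance.
        float err = R.array[0] - 1.0f / 2.34f;
        assert(err > -1e-3f && err < 1e-3f);

        // The upper lanes come from the explicitly passed destination.
        assert(R.array[1] == -70000.0f);
        assert(R.array[2] == 0.00001f);
        assert(R.array[3] == 345.5f);
    }
}
```

On compilers that do not define D_SIMD the body is skipped, so the program still builds and does nothing.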

--
April 24, 2022
https://issues.dlang.org/show_bug.cgi?id=23049

Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--
April 24, 2022
https://issues.dlang.org/show_bug.cgi?id=23049

--- Comment #2 from Walter Bright <bugzilla@digitalmars.com> ---
https://github.com/dlang/druntime/pull/3808

--
April 24, 2022
https://issues.dlang.org/show_bug.cgi?id=23049

--- Comment #3 from ponce <aliloko@gmail.com> ---
Thanks for the __simd explanations!

--