Thread overview
__simd_sto confusion
Oct 03, 2015
Nachtraaf
Oct 03, 2015
Marco Leise
Oct 03, 2015
Nachtraaf
Oct 04, 2015
Marco Leise
Oct 04, 2015
Benjamin Thaut
Oct 04, 2015
Nachtraaf
October 03, 2015
I'm trying to create some linear algebra functions using SIMD intrinsics. I watched the DConf 2013 presentation by Manu Evans, but I'm still confused about some aspects, and the following piece of code doesn't work. I'm trying to copy the result of a dot product from the register to memory, but dmd fails with an overload resolution error, which I guess is due to some implicit conversion?

dmd error:

simd1.d(34): Error: core.simd.__simd_sto called with argument types (XMM, float, __vector(float[4])) matches both:
/usr/include/dlang/dmd/core/simd.d(434):     core.simd.__simd_sto(XMM opcode, double op1, __vector(void[16]) op2)
and:
/usr/include/dlang/dmd/core/simd.d(435):     core.simd.__simd_sto(XMM opcode, float op1, __vector(void[16]) op2)

from the following piece of code:

float dot_simd1(float4  a, float4 b)
{
    float4 result = __simd(XMM.DPPS, a, b, 0xFF);
    float value;
    __simd_sto(XMM.STOSS, value, result);
    return value;
}

What am I doing wrong here?
October 03, 2015
This is a bug in overload resolution when __vector(void[16])
is involved. You can work around it by changing float4 to void16,
only to run into an internal compiler error:
  backend/gother.c 988
So file a bug for both @ issues.dlang.org
Also it looks like DMD wants you to use the return value of
the intrinsic, is that expected?

-- 
Marco

October 03, 2015
On Saturday, 3 October 2015 at 15:39:33 UTC, Marco Leise wrote:
> This is a bug in overload resolution when __vector(void[16])
> is involved. You can go around it by changing float4 to void16,
> only to run into an internal compiler error:
>   backend/gother.c 988
> So file a bug for both @ issues.dlang.org
> Also it looks like DMD wants you to use the return value of
> the intrinsic, is that expected?

I guessed I wouldn't need the return value, as the Intel C intrinsic for this opcode has a void return type. I did try supplying a return type, but I couldn't circumvent the overload error, so I had no clue if it would make any difference.

I changed the type of result to void16 like this:

float dot_simd1(float4  a, float4 b)
{
    void16 result = __simd(XMM.DPPS, a, b, 0xFF);
    float value;
    __simd_sto(XMM.STOSS, value, result);
    return value;
}

and for me this code compiles and runs without any errors now.
I'm using DMD64 D Compiler v2.068 on Linux. If you got an internal compiler error, that means it's a compiler bug, though I have no clue which one. Did you try the same thing I did, or did you cast the variable?
I guess I should file a bug report for the overload resolution issue for now, if it's not a duplicate?
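
For comparison, a minimal sketch of a variant that sidesteps __simd_sto entirely: it keeps the same DPPS call but reads lane 0 through the vector's .array property. The function name dot_simd_lane is made up for illustration, and the sketch assumes DMD on x86_64 with a CPU that supports SSE4.1; it has not been checked against v2.068 specifically.

```d
import core.simd;

// Hypothetical variant of dot_simd1: no store intrinsic is involved,
// so the ambiguous __simd_sto overloads never come into play.
float dot_simd_lane(float4 a, float4 b)
{
    // 0xFF: multiply all four lanes and write the sum to every result lane.
    float4 result = cast(float4) __simd(XMM.DPPS, a, b, 0xFF);
    // .array exposes the vector as a float[4]; lane 0 holds the dot product.
    return result.array[0];
}
```

Whether the compiler turns the lane read into a single MOVSS is up to the code generator, so this trades control over the exact store instruction for code that avoids the overload problem.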
October 04, 2015
Am Sat, 03 Oct 2015 23:42:22 +0000
schrieb Nachtraaf <nachtraaf80@gmail.com>:

> I changed the type of result to void16 like this:
> 
> float dot_simd1(float4  a, float4 b)
> {
>      void16 result = __simd(XMM.DPPS, a, b, 0xFF);
>      float value;
>      __simd_sto(XMM.STOSS, value, result);
>      return value;
> }
> 
> and for me this code compiles and runs without any errors now.
> I'm using DMD64 D Compiler v2.068 on Linux. If you got an
> internal compiler error that means that it's a compiler bug
> though I have no clue what. Did you try the same thing I did or
> casting the variable?
> I guess I should file a bugreport for overload resolution if it's
> not a duplicate for now?

Yes. At some point the intrinsics will need a more thorough rework. Currently none of those that return void, int or set flags work as they should.

-- 
Marco

October 04, 2015
On Saturday, 3 October 2015 at 14:47:02 UTC, Nachtraaf wrote:
> I'm trying to create some linear algebra functions using simd intrinsics. I watched the dconf 2013 presentation by Manu Evans but i'm still confused about some aspects and the following piece of code doesn't work. I'm trying to copy the result of a dot product from the register to memory but dmd fails with an overload resolution error, which i guess is due some implicit conversion?
>
> dmd error:
>
> simd1.d(34): Error: core.simd.__simd_sto called with argument types (XMM, float, __vector(float[4])) matches both:
> /usr/include/dlang/dmd/core/simd.d(434):     core.simd.__simd_sto(XMM opcode, double op1, __vector(void[16]) op2)
> and:
> /usr/include/dlang/dmd/core/simd.d(435):     core.simd.__simd_sto(XMM opcode, float op1, __vector(void[16]) op2)
>
> from the following piece of code:
>
> float dot_simd1(float4  a, float4 b)
> {
>     float4 result = __simd(XMM.DPPS, a, b, 0xFF);
>     float value;
>     __simd_sto(XMM.STOSS, value, result);
>     return value;
> }
>
> What am I doing wrong here?

core.simd is horribly broken. I recommend that you avoid it for any serious work. If you want to do SIMD programming with D, get LDC or GDC and use their SIMD intrinsics instead of core.simd.
If you have to do SIMD with dmd, write inline assembly.
October 04, 2015
That's a shame. I've read that each compiler has its own quirks and doesn't support everything dmd supports. I do want to keep the code as portable as possible. Guess I'll try using inline assembly and runtime checks for the right CPU architecture.
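
A rough sketch of what that combination could look like, assuming DMD-style inline assembly on x86_64 and the sse41 flag from core.cpuid. The names dotScalar, dotSse41 and dot are hypothetical, and the operands are taken as float[4] by reference (rather than float4 by value) so the asm block only deals with plain pointers.

```d
import core.cpuid : sse41;

// Plain scalar fallback; compiles on any target.
float dotScalar(ref const float[4] a, ref const float[4] b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

version (D_InlineAsm_X86_64)
{
    // DPPS path in inline assembly; only compiled where the
    // x86_64 inline assembler is available.
    float dotSse41(ref const float[4] a, ref const float[4] b)
    {
        const(float)* pa = a.ptr;
        const(float)* pb = b.ptr;
        float value;
        asm
        {
            mov RAX, pa;
            mov RCX, pb;
            movups XMM0, [RAX];     // unaligned loads, no alignment assumed
            movups XMM1, [RCX];
            dpps XMM0, XMM1, 0xFF;  // all four products, sum written to every lane
            movss value, XMM0;      // store lane 0
        }
        return value;
    }
}

// Picks the DPPS path only when the CPU reports SSE4.1 at run time.
float dot(ref const float[4] a, ref const float[4] b)
{
    version (D_InlineAsm_X86_64)
    {
        if (sse41)
            return dotSse41(a, b);
    }
    return dotScalar(a, b);
}
```

Checking the flag on every call costs a branch; caching the choice once in a function pointer would be an alternative if the function ends up in a hot loop.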

Thanks for the help, people.