[Issue 20112] __vector casts don't do type conversions
August 09, 2019
https://issues.dlang.org/show_bug.cgi?id=20112

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibuclaw@gdcproject.org

--- Comment #1 from Iain Buclaw <ibuclaw@gdcproject.org> ---
That's because `__vector(int[4]) i = cast(__vector(int[4])) f;` is a
reinterpret cast.

Semantically, this can only be done by unrolling the assignment, but probably easier to do this in phobos std.conv instead.

private T to(T, S)(S value)
{
    alias E = typeof(T.init[0]);
    T res = void;
    static foreach (i; 0 .. S.length)
        res[i] = cast(E)value[i];
    return res;
}

void main() {
    import std.stdio;
    __vector(float[4]) f = [3, 2, 1, 0];
    __vector(int[4]) i = to!(__vector(int[4]))(f);
    writeln(i[0]);
}

--
August 09, 2019

--- Comment #2 from thomas.bockman@gmail.com ---
That is very surprising. There is already a way to express reinterpretation casts: `*cast(T*) &variable`. Why is it necessary to overload the conversion syntax in such a confusing fashion? Is this documented anywhere in the language standard?
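
To illustrate the distinction drawn above, here is a minimal sketch contrasting the pointer-based reinterpretation idiom with an ordinary value conversion, using scalars rather than vectors. The helper names `reinterpretBits` and `convertValue` are hypothetical and not part of any library; the expected bit pattern assumes IEEE 754 single-precision floats.

```d
// Hypothetical helpers, for illustration only.

/// Reinterprets the 32 bits of `f` as an int; no numeric conversion happens.
int reinterpretBits(float f) @trusted
{
    static assert(float.sizeof == int.sizeof);
    return *cast(int*) &f;
}

/// Converts the value of `f` to an int, truncating toward zero.
int convertValue(float f) @safe
{
    return cast(int) f;
}

unittest
{
    assert(convertValue(3.0f) == 3);
    // Assumes IEEE 754: 3.0f is stored as 0x40400000.
    assert(reinterpretBits(3.0f) == 0x4040_0000);
}
```

Compiling with `-unittest -main` should run the checks. For scalars, `cast(int)` performs the value conversion, which is why the same syntax reinterpreting bits for `__vector` types comes as a surprise.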

--
August 09, 2019

--- Comment #3 from thomas.bockman@gmail.com ---
> Semantically, this can only be done by unrolling the assignment

I've found that this is very unreliable. Sometimes the optimizer correctly replaces the individual component casts with the SIMD conversion instructions, and sometimes it doesn't. On LLVM, at least, inlining sometimes undoes the optimization.

I haven't been able to get this working reliably without resorting to inline assembly language.

--
December 11, 2020

ponce <aliloko@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aliloko@gmail.com

--- Comment #4 from ponce <aliloko@gmail.com> ---
For intel-intrinsics it is very handy that this cast is a reinterpret cast (like it is in C and C++...)

--
December 22, 2020

Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |SIMD
                 CC|                            |bugzilla@digitalmars.com

--
December 23, 2020

Walter Bright <bugzilla@digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from Walter Bright <bugzilla@digitalmars.com> ---
It is indeed a reinterpret cast, although https://issues.dlang.org/show_bug.cgi?id=21469 would cause that not to work sometimes.

It is this way because of consistency with how casting of static arrays works:

  import core.stdc.stdio;

  void main() {
    byte[16] b = 3;
    int[4] ia = cast(int[4]) b;
    foreach (i; ia)
        printf("%x\n", i);
  }

which prints:

  3030303
  3030303
  3030303
  3030303

It is working as designed. At this point, I don't think this can be changed even if we wanted to.
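
As a rough sketch of what this means for the float-to-int case in the original report, the following uses plain static arrays (which, per the above, cast the same way as vectors) to contrast the reinterpreting cast with an element-wise conversion. The helper names `reinterpretElements` and `convertElements` are hypothetical, and the expected bit patterns assume IEEE 754 single-precision floats.

```d
// Hypothetical helpers, for illustration only.

/// Reinterprets the 16 bytes of `src` as four ints (same bits, new type).
int[4] reinterpretElements(float[4] src)
{
    return cast(int[4]) src;
}

/// Value-converts each element instead of reinterpreting the bits.
int[N] convertElements(size_t N)(float[N] src) @safe
{
    int[N] dst;
    foreach (i, x; src)
        dst[i] = cast(int) x; // truncating float-to-int conversion
    return dst;
}

unittest
{
    float[4] f = [3, 2, 1, 0];
    // Element-wise conversion keeps the numeric values.
    assert(convertElements(f) == [3, 2, 1, 0]);
    // The static array cast keeps the bits (assumes IEEE 754 floats).
    assert(reinterpretElements(f) == [0x4040_0000, 0x4000_0000, 0x3F80_0000, 0]);
}
```

Compiling with `-unittest -main` should run the checks. Code that wants the numeric values has to convert element by element, as in the `to` helper from comment #1.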

--
December 23, 2020

--- Comment #6 from Walter Bright <bugzilla@digitalmars.com> ---
Added a couple of spec pull requests to clarify:

https://github.com/dlang/dlang.org/pull/2924
https://github.com/dlang/dlang.org/pull/2925

--
December 23, 2020

--- Comment #7 from Iain Buclaw <ibuclaw@gdcproject.org> ---
(In reply to thomas.bockman from comment #3)
> > Semantically, this can only be done by unrolling the assignment
> 
> I've found that this is very unreliable. Sometimes the optimizer correctly replaces the individual component casts with the SIMD conversion instructions, and sometimes it doesn't. On LLVM, at least, inlining sometimes undoes the optimization.
> 
> I haven't been able to get this working reliably without resorting to inline assembly language.

Just having a quick look, it requires -O3 in order to coerce out a 'cvttps2dq' instruction.  To make it consistent, you can set @optimize and @target attributes on the function (I think it works identically for both gdc and ldc).

--