On 6 January 2012 12:16, a <a@a.com> wrote:
Walter Bright Wrote:
> which provides two functions:
>
> __v128 simdop(operator, __v128 op1);
> __v128 simdop(operator, __v128 op1, __v128 op2);
You would also need functions that take an immediate, to support instructions such as shufps.
> One caveat is it is typeless; a __v128 could be used as 4 packed ints or 2
> packed doubles. One problem with making it typed is it'll add 10 more types to
> the base compiler, instead of one. Maybe we should just bite the bullet and do
> the types:
>
> __vdouble2
> __vfloat4
> __vlong2
> __vulong2
> __vint4
> __vuint4
> __vshort8
> __vushort8
> __vbyte16
> __vubyte16
I don't see it being typeless as a problem. The purpose of this is to expose hardware capabilities to D code, and the vector registers are typeless, so why shouldn't the vector type be "typeless" too? Types such as vfloat4 can be implemented in a library (which could also be made portable and have a nice API).
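To make the "typed wrappers in a library" idea concrete, here is a minimal sketch in plain D. Everything in it is an assumption for illustration: `V128` stands in for the proposed `__v128`, `Float4` is a hypothetical library type, and the arithmetic is a scalar loop where a real library would call the matching intrinsic.

```d
// Hypothetical stand-in for the proposed typeless __v128: 16 raw bytes
// that can be viewed as floats or ints without any conversion.
union V128
{
    ubyte[16] bytes;
    float[4]  f;
    int[4]    i;
}

// Library-level typed view. The compiler only needs to know V128;
// the element type lives entirely in this wrapper.
struct Float4
{
    V128 raw;

    this(float a, float b, float c, float d)
    {
        raw.f = [a, b, c, d];
    }

    // Element-wise +, -, *, / as a scalar loop; a real library would
    // forward to the corresponding vector instruction here.
    Float4 opBinary(string op)(Float4 rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        Float4 r;
        foreach (k; 0 .. 4)
            r.raw.f[k] = mixin("raw.f[k] " ~ op ~ " rhs.raw.f[k]");
        return r;
    }

    float opIndex(size_t k) const { return raw.f[k]; }
}

unittest
{
    auto s = Float4(1, 2, 3, 4) + Float4(10, 20, 30, 40);
    assert(s[0] == 11 && s[3] == 44);

    // The same 16 bytes read as ints: 1.0f is 0x3F800000 in IEEE 754.
    assert(Float4(1, 0, 0, 0).raw.i[0] == 0x3F800000);
}
```

The point of the sketch is only that the typed API can sit on top of one untyped storage type; it says nothing about how the compiler would actually represent `__v128`.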
Hooray! I think we're on exactly the same page. That's refreshing :)
I think this __simdop( op, v1, v2, etc ) API is a bit of a bad idea... there are too many permutations of arguments.
I know some PPC functions that receive FIVE arguments (2-3 regs, and 2-3 literals).
Why not just expose the opcodes as intrinsic functions directly, for instance (maybe in std.simd.sse)?
__v128 __sse_mul_ss( __v128 v1, __v128 v2 );
__v128 __sse_mul_ps( __v128 v1, __v128 v2 );
__v128 __sse_madd_epi16( __v128 v1, __v128 v2, __v128 v3 ); // <- some have more args
__v128 __sse_shuffle_ps( __v128 v1, __v128 v2, immutable int i ); // <- some need literal ints
etc...
I think this works best for other architectures too: they expose their own sets of intrinsics, and some have rather different parameter layouts.
VMX for instance (perhaps in std.simd.vmx?):
__v128 __vmx_vmsum4fp( __v128 v1, __v128 v2, __v128 v3 );
__v128 __vmx_vpermwi( __v128 v1, immutable int i ); // <-- needs a literal
__v128 __vmx_vrlimi( __v128 v1, __v128 v2, immutable int mask, immutable int rot ); // <-- you really don't want to add your enum style function for all these prototypes?
etc...
I have seen at least these argument lists:
( v1 )
( v1, v2 )
( v1, v2, v3 )
( v1, immutable int )
( v1, v2, immutable int )
( v1, v2, immutable int, immutable int )
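As an illustration of the "needs a literal" argument shapes, one way D can force an argument to be a compile-time constant is a template value parameter. The sketch below is an assumption, not the signature proposed in this thread: `V128` is a stand-in for `__v128`, `shufflePs` is a hypothetical name, and the body emulates the lane selection of shufps in scalar code rather than emitting the instruction.

```d
// Hypothetical stand-in for the proposed typeless __v128, viewed as 4 floats.
struct V128
{
    float[4] f;
}

// Emulates the lane selection of SSE shufps: the two low result lanes are
// picked from `a`, the two high lanes from `b`, two bits of `imm` per lane.
// `imm` is a template value parameter, so it must be known at compile time.
V128 shufflePs(int imm)(V128 a, V128 b)
    if (imm >= 0 && imm <= 0xFF)
{
    V128 r;
    r.f[0] = a.f[(imm >> 0) & 3];
    r.f[1] = a.f[(imm >> 2) & 3];
    r.f[2] = b.f[(imm >> 4) & 3];
    r.f[3] = b.f[(imm >> 6) & 3];
    return r;
}

unittest
{
    auto a = V128([0f, 1f, 2f, 3f]);
    auto b = V128([10f, 11f, 12f, 13f]);

    // 0x1B is 0b00_01_10_11: lanes a[3], a[2], b[1], b[0].
    auto r = shufflePs!0x1B(a, b);
    assert(r.f == [3f, 2f, 11f, 10f]);
}
```

With this shape, passing a runtime variable as the selector is a compile error rather than something the backend has to diagnose, and the two-immediate case would simply take two template parameters.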