On 6 January 2012 04:12, Walter Bright <newshound2@digitalmars.com> wrote:

If you could cut&paste them here, I would find it most helpful. I have some ideas on making that work, but I need to know everything wrong with it first.

On 5 January 2012 11:02, Manu <turkeyman@gmail.com> wrote:

On 5 January 2012 02:42, bearophile <bearophileHUGS@lycos.com> wrote:
Think about future CPU evolution with SIMD registers 128, then 256, then 512, then 1024 bits long. In theory a good compiler is able to use them with no changes in the D code that uses vector operations.

These are all fundamentally different types, like int and long.. float and double... and I certainly want a keyword to identify each of them. Even if the compiler is trying to make auto vector optimisations, you can't deny programmers explicit control to the hardware when they want/need it.

Look at x86 compilers, been TRYING to perform automatic SSE optimisations for 10 years, with basically no success... do you really think you can do better then all that work by microsoft and GCC?
In my experience, I've even run into a lot of VC's auto-SSE-ed code that is SLOWER than the original float code.

Let's not even mention architectures that receive much less love than x86, and are arguably more important (ARM; slower, simpler processors with more demand to perform well, and not waste power)

...

Vector ops and SIMD ops are different things. float[4] (or more realistically, float[3]) should NOT be a candidate for automatic SIMD implementation, likewise, simd_type should not have its components individually accessible. These are operations the hardware can not actually perform. So no syntax to worry about, just a type.

I think the good Hara will be able to implement those syntax fixes in a matter of just one day or very few days if a consensus is reached about what actually is to be fixed in D vector ops syntax.

Instead of discussing about *adding* something (register intrinsics) I suggest to discuss about what to fix about the *already present* vector op syntax. This is not a request to just you Manu, but to this whole newsgroup.

And I think this is exactly the wrong approach. A vector is NOT an array of 4 (actually, usually 3) floats. It should not appear as one. This is overly complicated and ultimately wrong way to engage this hardware.

Imagine the complexity in the compiler to try and force float[4] operations into vector arithmetic vs adding a 'v128' type which actually does what people want anyway... What about when when you actually WANT a float[4] array, and NOT a vector?

SIMD units are not float units, they should not appear like an aggregation of float units. They have:
* Different error semantics, exception handling rules, sometimes different precision...

* Special alignment rules.
* Special literal expression/assignment.
* You can NOT access individual components at will.
* May be reinterpreted at any time as float[1] float[4] double[2] short[8] char[16], etc... (up to the architecture intrinsics)

* Can not be involved in conventional comparison logic (array of floats would make you think they could)
*** Can NOT interact with the regular 'float' unit... Vectors as an array of floats certainly suggests that you can interact with scalar floats...

I will use architecture intrinsics to operate on these regs, and put that nice and neatly behind a hardware vector type with version()'s for each architecture, and an API with a whole lot of sugar to make them nice and friendly to use.

My argument is that even IF the compiler some day attempts to make vector optimisations to float[4] arrays, the raw hardware should be exposed first, and allow programmers to use it directly. This starts with a language defined (platform independant) v128 type.

...