January 15, 2012
On 15 January 2012 08:16, Sean Cavanaugh <WorksOnMyMachine@gmail.com> wrote:

> On 1/15/2012 12:09 AM, Walter Bright wrote:
>
>> On 1/14/2012 9:58 PM, Sean Cavanaugh wrote:
>>
>>> MS has three types, __m128, __m128i and __m128d (float, int, double)
>>>
>>> Six if you count AVX's 256 forms.
>>>
>>> On 1/7/2012 6:54 PM, Peter Alexander wrote:
>>>
>>>> On 7/01/12 9:28 PM, Andrei Alexandrescu wrote:
>>>> I agree with Manu that we should just have a single type like __m128 in
>>>> MSVC. The other types and their conversions should be solvable in a
>>>> library with something like strong typedefs.
>>>>
>>>>
>> The trouble with MS's scheme, is given the following:
>>
>> __m128i v;
>> v += 2;
>>
>> Can't tell what to do. With D,
>>
>> int4 v;
>> v += 2;
>>
>> it's clear (add 2 to each of the 4 ints).
>>
>
> Working with their intrinsics in their raw form for real code is pure insanity :)  You need to wrap it all with a good math library (even if 90% of the library is the intrinsics wrapped into __forceinlined functions), so you can start having sensible operator overloads, and so you can write code that is readable.
>
>
> if (any4(a > b))
> {
>  // do stuff
> }
>
>
> is way way way better than (pseudocode)
>
> if (_mm_movemask_ps(_mm_cmpgt_ps(a, b)) != 0)
> {
> }
>
>
>
> and (if the ternary operator was overrideable in C++)
>
> float4 foo = (a > b) ? c : d;
>
> would be better than
>
> float4 mask = _mm_cmpgt_ps(a, b);
> float4 foo = _mm_or_ps(_mm_and_ps(mask, c), _mm_andnot_ps(mask, d));
>

Yep, it's coming... baby steps :)

Walter: I told you games devs would be all over this! :P
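
For reference, here is a minimal C++ sketch of the kind of wrapper described in the quote above, assuming an x86 target with SSE. The names float4, any4, all4, select, make4 and lane are illustrative only and not taken from any particular library; the intrinsics are the standard SSE ones. Note that with _mm_movemask_ps a nonzero mask means "any lane", while 0x0F means "all four lanes".

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)
#include <cassert>

// Hypothetical thin wrapper type; the name float4 follows the thread.
struct float4 {
    __m128 v;
};

// Build a vector from four scalars, lane 0 first.
inline float4 make4(float x, float y, float z, float w) {
    return { _mm_setr_ps(x, y, z, w) };
}

// Lane-wise a > b: each lane becomes all-ones (true) or all-zeros (false).
inline float4 operator>(float4 a, float4 b) {
    return { _mm_cmpgt_ps(a.v, b.v) };
}

// True if at least one of the four lanes of the mask is set.
inline bool any4(float4 mask) {
    return _mm_movemask_ps(mask.v) != 0;
}

// True only if all four lanes of the mask are set.
inline bool all4(float4 mask) {
    return _mm_movemask_ps(mask.v) == 0x0F;
}

// Lane-wise select, standing in for the non-overloadable (mask ? c : d).
inline float4 select(float4 mask, float4 c, float4 d) {
    return { _mm_or_ps(_mm_and_ps(mask.v, c.v),
                       _mm_andnot_ps(mask.v, d.v)) };
}

// Read back a single lane (slow path, handy for debugging and tests).
inline float lane(float4 a, int i) {
    assert(i >= 0 && i < 4);
    float out[4];
    _mm_storeu_ps(out, a.v);
    return out[i];
}
```

With this in place, "if (any4(a > b))" and "float4 foo = select(a > b, c, d);" read much like the wished-for forms, and the compiler can inline everything down to the raw instructions.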


January 15, 2012
On 15 January 2012 09:20, Sean Cavanaugh <WorksOnMyMachine@gmail.com> wrote:

> On 1/13/2012 7:38 AM, Manu wrote:
>
>> On 13 January 2012 08:34, Norbert Nemec <Norbert@nemec-online.de> wrote:
>>
>>
>> This was already settled some days back: the language has a suite of types, just like GCC.
>>
>
> So I would definitely like to help out on the SIMD stuff in some way, as I
> have a lot of experience using SIMD math to speed up the games I work on.
> I've got a vectorized set of transcendental functions for float and double
> (currently in the form of MSVC++ intrinsics) that would be a good start
> if anyone is interested.  Beyond that I just want to help 'make it right'
> because it's a topic I care a lot about, and it is my personal biggest gripe
> with the language at the moment.
>
> I also have experience with VMX. As the two are not exactly the same, that would definitely help to avoid making the code too Intel-centric (though VMX is typically the more flexible design, as it can do dynamic shuffling based on the contents of the vector registers, etc.)
>

I too have a long history with VMX, CELL SPU, ARM's VFP/NEON, and others (PSP's VFPU, PS2's VU, SH4), and SSE of course, and with writing efficient libraries that take all of this hardware into consideration. We should compare notes; are you on IRC? :)


January 15, 2012
On 1/15/2012 3:02 AM, Manu wrote:
> On 15 January 2012 09:20, Sean Cavanaugh <WorksOnMyMachine@gmail.com> wrote:
>     I also have experience with VMX. As the two are not exactly the same, that
>     would definitely help to avoid making the code too Intel-centric (though
>     VMX is typically the more flexible design, as it can do dynamic shuffling
>     based on the contents of the vector registers, etc.)
>
>
> I too have a long history with VMX, CELL SPU, ARM's VFP/NEON, and others (PSP's
> VFPU, PS2's VU, SH4), and SSE of course, and with writing efficient libraries that
> take all of this hardware into consideration. We should compare notes; are you on IRC? :)

A nice vector math library for D that puts us competitive will be a nice addition to Phobos.
January 16, 2012
On 1/15/2012 1:42 PM, Walter Bright wrote:
>
> A nice vector math library for D that puts us competitive will be a nice
> addition to Phobos.

The gl3n library might be something good to build on:
https://bitbucket.org/dav1d/gl3n

It looks to be a continuation of the OMG library used by Deadlock, and is similar to the glm (http://glm.g-truc.net) c++ library which emulates glsl vector ops in software.

We'd need to ask if it can be re-licensed from MIT to Boost.
January 16, 2012
On 1/15/2012 6:54 PM, JoeCoder wrote:
> On 1/15/2012 1:42 PM, Walter Bright wrote:
>>
>> A nice vector math library for D that puts us competitive will be a nice
>> addition to Phobos.
>
> The gl3n library might be something good to build on:
> https://bitbucket.org/dav1d/gl3n
>
> It looks to be a continuation of the OMG library used by Deadlock, and is
> similar to the glm (http://glm.g-truc.net) c++ library which emulates glsl
> vector ops in software.
>
> We'd need to ask if it can be re-licensed from MIT to Boost.

I have never used libraries like that, and so it isn't obvious to me what a good one would look like.
January 16, 2012
On 15.01.2012 at 11:45, Manu <turkeyman@gmail.com> wrote:

> On 15 January 2012 08:16, Sean Cavanaugh <WorksOnMyMachine@gmail.com> wrote:
>
>> On 1/15/2012 12:09 AM, Walter Bright wrote:
>>
>>> On 1/14/2012 9:58 PM, Sean Cavanaugh wrote:
>>>
>>>> MS has three types, __m128, __m128i and __m128d (float, int, double)
>>>>
>>>> Six if you count AVX's 256 forms.
>>>>
>>>> On 1/7/2012 6:54 PM, Peter Alexander wrote:
>>>>
>>>>> On 7/01/12 9:28 PM, Andrei Alexandrescu wrote:
>>>>> I agree with Manu that we should just have a single type like __m128 in
>>>>> MSVC. The other types and their conversions should be solvable in a
>>>>> library with something like strong typedefs.
>>>>>
>>>>>
>>> The trouble with MS's scheme, is given the following:
>>>
>>> __m128i v;
>>> v += 2;
>>>
>>> Can't tell what to do. With D,
>>>
>>> int4 v;
>>> v += 2;
>>>
>>> it's clear (add 2 to each of the 4 ints).
>>>
>>
>> Working with their intrinsics in their raw form for real code is pure
>> insanity :)  You need to wrap it all with a good math library (even if 90%
>> of the library is the intrinsics wrapped into __forceinlined functions), so
>> you can start having sensible operator overloads, and so you can write code
>> that is readable.
>>
>>
>> if (any4(a > b))
>> {
>>  // do stuff
>> }
>>
>>
>> is way way way better than (pseudocode)
>>
>> if (_mm_movemask_ps(_mm_cmpgt_ps(a, b)) != 0)
>> {
>> }
>>
>>
>>
>> and (if the ternary operator was overrideable in C++)
>>
>> float4 foo = (a > b) ? c : d;
>>
>> would be better than
>>
>> float4 mask = _mm_cmpgt_ps(a, b);
>> float4 foo = _mm_or_ps(_mm_and_ps(mask, c), _mm_andnot_ps(mask, d));
>>
>
> Yep, it's coming... baby steps :)
>
> Walter: I told you games devs would be all over this! :P

And even compression algorithms. I found one written in C that uses external .asm files, which are compiled into object files with NASM and passed on the linker command line. They contain MMX/SSE code, depending on the processor you plan to use. The author claims that the MMX versions of the 'outsourced' routines run 8x faster. I didn't verify this, but the idea that these instructions become part of the language and easy to use for regular programmers like me (and not just console game developers) is exciting. I bet there are more programs that could benefit from SSE than is obvious, or code that could be rewritten in a way that lets multiple data sets be processed simultaneously.
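
To make that last point concrete, here is a minimal C++ sketch of processing several data elements per instruction, assuming an x86 target with SSE. The function name add_arrays is made up for illustration; it is not taken from the compression code mentioned above.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)
#include <cassert>
#include <cstddef>

// Hypothetical helper: out[i] = a[i] + b[i], four floats per instruction.
inline void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && out != nullptr));
    std::size_t i = 0;
    // Main loop: load, add and store four floats at a time.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   // unaligned loads, no alignment needed
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    // Tail loop: handle the remaining 0 to 3 elements one by one.
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

The scalar tail loop is the usual price for arrays whose length is not a multiple of the vector width.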
January 16, 2012
Here is mine:
http://suicide.zoadian.de/ext/math/geometry/vector.d
I haven't tested (or even compiled) it yet. It needs polishing, but I don't have much time to work on it at the moment. But you may use it as you wish ;)
Any suggestions or improvements are welcome.

Greetings,
Felix



On 16.01.2012 at 04:00, Walter Bright <newshound2@digitalmars.com> wrote:

> On 1/15/2012 6:54 PM, JoeCoder wrote:
>> On 1/15/2012 1:42 PM, Walter Bright wrote:
>>>
>>> A nice vector math library for D that puts us competitive will be a nice
>>> addition to Phobos.
>>
>> The gl3n library might be something good to build on:
>> https://bitbucket.org/dav1d/gl3n
>>
>> It looks to be a continuation of the OMG library used by Deadlock, and is
>> similar to the glm (http://glm.g-truc.net) c++ library which emulates glsl
>> vector ops in software.
>>
>> We'd need to ask if it can be re-licensed from MIT to Boost.
>
> I have never used libraries like that, and so it isn't obvious to me what a good one would look like.


January 16, 2012
On 1/16/2012 5:06 AM, Marco Leise wrote:
> I bet there are more programs that
> could benefit from SSE than is obvious, or code that could be rewritten in a way
> that lets multiple data sets be processed simultaneously.

I think there's quite a bit more; it's just that using SIMD instructions has historically been so clumsy that few take advantage.

For example, a memchr operation could be dramatically sped up with SIMD, which has implications for regex.
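
As a rough illustration of the memchr idea, here is a minimal C++ sketch assuming an x86 target with SSE2. The name simd_memchr is hypothetical, and this is not how any particular C library implements memchr; it only shows the compare-16-bytes-at-once technique.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64 only)
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical name: scans 16 bytes per iteration, then finishes with memchr.
inline const void* simd_memchr(const void* s, int c, std::size_t n) {
    assert(s != nullptr || n == 0);
    const unsigned char* p = static_cast<const unsigned char*>(s);
    // Broadcast the byte we are looking for into all 16 lanes.
    const __m128i needle =
        _mm_set1_epi8(static_cast<char>(static_cast<unsigned char>(c)));

    while (n >= 16) {
        // Unaligned load, so the caller's pointer needs no special alignment.
        __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
        // Compare all 16 bytes at once, then collapse the result to a 16-bit mask.
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        if (mask != 0) {
            std::size_t i = 0;
            while (((mask >> i) & 1) == 0)
                ++i;                      // index of the first matching byte
            return p + i;
        }
        p += 16;
        n -= 16;
    }
    // Fewer than 16 bytes left: fall back to the standard routine.
    return std::memchr(p, c, n);
}
```

A production version would typically align the pointer first and use a bit-scan instruction instead of the small loop, but the structure stays the same.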
January 16, 2012
On Monday, 16 January 2012 at 17:57:38 UTC, suicide wrote:
> Here is mine:
> http://suicide.zoadian.de/ext/math/geometry/vector.d
> I haven't tested (or even compiled) it yet. It needs polishing, but I don't have much time to work on it at the moment. But you may use it as you wish ;)
> Any suggestions or improvements are welcome.
>
> Greetings,
> Felix
>
>
>
> On 16.01.2012 at 04:00, Walter Bright <newshound2@digitalmars.com> wrote:
>
>> On 1/15/2012 6:54 PM, JoeCoder wrote:
>>> On 1/15/2012 1:42 PM, Walter Bright wrote:
>>>>
>>>> A nice vector math library for D that puts us competitive will be a nice
>>>> addition to Phobos.
>>>
>>> The gl3n library might be something good to build on:
>>> https://bitbucket.org/dav1d/gl3n
>>>
>>> It looks to be a continuation of the OMG library used by Deadlock, and is
>>> similar to the glm (http://glm.g-truc.net) c++ library which emulates glsl
>>> vector ops in software.
>>>
>>> We'd need to ask if it can be re-licensed from MIT to Boost.
>>
>> I have never used libraries like that, and so it isn't obvious to me what a good one would look like.

Nice start, though it has quite a few issues.

1. for (i; 0 .. D) needs to be: foreach (i; 0 .. D)
2. the assert(r != 0) checks should be done in a contract
3. 'Vector(D, T)' can be internally used as just 'Vector'
4. instead of making opAdd, opSub, opMul, etc.. use opBinary and mixins
5. don't pass vectors as 'ref' unless they are going to be modified
6. for performance, don't pass all values through 'real'

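   // Requires: import std.traits : isImplicitlyConvertible;
   // Scalar operand no wider than the element type T.
   auto opBinary(string op, U)(U r)
   if (U.sizeof <= T.sizeof && isImplicitlyConvertible!(T, U))
   in {
       assert(r != 0);
   }
   body {
       Vector nvec = this;   // work on a copy of this vector
       foreach (i; 0 .. D)
           mixin("nvec.vec[i]" ~ op ~ "= r;");
       return nvec;
   }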


   // Scalar operand wider than T: cast down explicitly.
   auto opBinary(string op, U)(U r)
   if (U.sizeof > T.sizeof && isImplicitlyConvertible!(U, T))
   in {
       assert(r != 0);
   }
   body {
       Vector nvec = this;   // work on a copy of this vector
       foreach (i; 0 .. D)
           mixin("nvec.vec[i]" ~ op ~ "= cast(T) r;");
       return nvec;
   }

   . . . . .

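   // Vector operand. V is the other vector's dimension; it has to be a value
   // parameter of the same type as D (size_t is an assumption here).
   auto opBinary(string op, size_t V, U)(Vector!(V, U) vec)
   if (U.sizeof <= T.sizeof && isImplicitlyConvertible!(U, T))
   in {
       foreach (i; 0 .. V)
           assert(vec.vec[i] != 0);
   }
   body {
       Vector nvec = this;   // work on a copy of this vector
       // Only the components present in both vectors are combined.
       static if (D <= V) {
           foreach (i; 0 .. D)
               mixin("nvec.vec[i]" ~ op ~ "= vec.vec[i];");
       }
       else {
           foreach (i; 0 .. V)
               mixin("nvec.vec[i]" ~ op ~ "= vec.vec[i];");
       }
       return nvec;
   }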


   // etc...

Something along those lines. Also, make sure you can't create a vector of length zero or one (struct Vector(D, T) if (D >= 2) { ... }). Plus, none of your Vector(D,T) instances will compile because you forgot the '!' mark: Vector!(D, T)
January 16, 2012
Whoops! opBinary should be opOpAssign in my examples.