February 06, 2012
Re: std.simd module
On 6 February 2012 10:49, a <a@a.com> wrote:

> On Saturday, 4 February 2012 at 23:15:17 UTC, Manu wrote:
>
>> First criticism I expect is for many to insist on a class-style vector
>> library, which I personally think has no place as a low level, portable
>> API.
>> Everyone has a different idea of what the perfect vector lib should look
>> like, and it tends to change significantly with respect to its
>> application.
>>
>> I feel this flat API is easier to implement, maintain, and understand, and
>> I expect the most common use of this lib will be in the back end of
>> people's own vector/matrix/linear algebra libs that suit their apps.
>>
>> My key concern is with my function names... should I be worried about name
>> collisions in such a low level lib? I already shadow a lot of standard
>> float functions...
>> I prefer them abbreviated in this (fairly standard) way, keeps lines of
>> code short and compact. It should be particularly familiar to anyone who
>> has written shaders and such.
>>
>
> I prefer the flat API and short names too.
>
>
>> Opinions? Shall I continue as planned?
>>
>
> Looks nice. Please do continue :)
>
> You have only run this on a 32 bit machine, right? Cause I tried to
> compile this simple example and got some errors about converting ulong to
> int:
>

True, I have only been working in x86 GDC so far, but I just wanted to get
feedback about my approach and API design at this point.
It seems there are no serious objections, I'll continue as is. I have an
ARM compiler too now, so I'll be implementing/testing against that as
reference also.


> auto testfun(float4 a, float4 b)
> {
>   return swizzle!("yxwz")(a);
> }
>
> It compiles if I make these changes:
>
> 566c566
> <               foreach(i; 0..N)
> ---
> >               foreach(int i; 0..N)
> 574c574
> <                               int i = countUntil(s, swizzleKey[0]);
> ---
> >                               int i = cast(int)countUntil(s, swizzleKey[0]);
> 591c591
> <                                       foreach(j, c; s) // find the offset of the
> ---
> >                                       foreach(int j, c; s) // find the offset of the
>
>
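The conversion errors reported above are what one would expect on a 64-bit target, where `size_t` and `ptrdiff_t` are 64 bits wide and no longer convert implicitly to `int`. The following is a minimal sketch of the two fixes from the patch, not the actual std.simd code; `laneIndex` and the `"xyzw"` key string are hypothetical names used only for illustration.

```d
import std.algorithm : countUntil;

// Hypothetical helper: index of a swizzle character within "xyzw",
// or -1 if the character is not present.
int laneIndex(char c)
{
    // countUntil returns ptrdiff_t, which is 64 bits wide on x86-64,
    // so narrowing the result to int needs an explicit cast there.
    return cast(int) countUntil("xyzw", c);
}

void main()
{
    assert(laneIndex('x') == 0);
    assert(laneIndex('w') == 3);
    assert(laneIndex('q') == -1);

    enum size_t N = 4;
    int[N] order;
    // Declaring the loop variable as int avoids size_t-to-int
    // conversions in the loop body on 64-bit targets.
    foreach (int i; 0 .. N)
        order[i] = i;
    assert(order == [0, 1, 2, 3]);
}
```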
February 06, 2012
Re: std.simd module
> True, I have only been working in x86 GDC so far, but I just 
> wanted to get
> feedback about my approach and API design at this point.
> It seems there are no serious objections, I'll continue as is.

I have one proposal about API design of matrix operations. Maybe 
there could be functions that would take row vectors as 
parameters in addition to those that take matrix structs. That 
way one could call matrix functions on data that isn't stored as 
matrix structures without copying. So for example for the 
transpose function there would also be a function that would be 
used like this (a* are inputs and r* are outputs):

transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);

Maybe those functions could be used to implement the functions 
that take and return structs.

I also think that interleave and deinterleave operations would be 
useful. For four element float vectors those can be implemented 
with only one instruction at least for SSE (using unpcklps, 
unpckhps and shufps) and  NEON (using vuzp and vzip).
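To make the proposed operations concrete, here is a minimal sketch of the lane orderings, assuming `float[4]` as a portable stand-in for a hardware vector (the alias `Vec4` is hypothetical). The function names follow the ones used later in this thread; their exact semantics in a finished std.simd are an assumption. On SSE the two interleaves correspond to unpcklps/unpckhps and the two deinterleaves to shufps.

```d
// Hypothetical stand-in for a 4-lane float vector so the sketch runs anywhere.
alias Vec4 = float[4];

// [a0, b0, a1, b1]: the lane order unpcklps produces.
Vec4 interleaveLow(Vec4 a, Vec4 b)
{
    Vec4 r = [a[0], b[0], a[1], b[1]];
    return r;
}

// [a2, b2, a3, b3]: the lane order unpckhps produces.
Vec4 interleaveHigh(Vec4 a, Vec4 b)
{
    Vec4 r = [a[2], b[2], a[3], b[3]];
    return r;
}

// Even-indexed lanes of the pair: [a0, a2, b0, b2].
Vec4 deinterleaveLow(Vec4 a, Vec4 b)
{
    Vec4 r = [a[0], a[2], b[0], b[2]];
    return r;
}

// Odd-indexed lanes of the pair: [a1, a3, b1, b3].
Vec4 deinterleaveHigh(Vec4 a, Vec4 b)
{
    Vec4 r = [a[1], a[3], b[1], b[3]];
    return r;
}

void main()
{
    Vec4 a = [0f, 1, 2, 3];
    Vec4 b = [10f, 11, 12, 13];

    Vec4 lo = interleaveLow(a, b);
    Vec4 hi = interleaveHigh(a, b);
    assert(lo == [0f, 10, 1, 11]);
    assert(hi == [2f, 12, 3, 13]);

    // Deinterleaving the two interleaved halves recovers the inputs.
    assert(deinterleaveLow(lo, hi) == a);
    assert(deinterleaveHigh(lo, hi) == b);
}
```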

> I have an
> ARM compiler too now, so I'll be implementing/testing against 
> that as
> reference also.

Could you please tell me how you got the ARM compiler to work?
February 06, 2012
Re: std.simd module
On 6 February 2012 15:13, a <a@a.com> wrote:
>> True, I have only been working in x86 GDC so far, but I just wanted to get
>> feedback about my approach and API design at this point.
>> It seems there are no serious objections, I'll continue as is.
>
>
> I have one proposal about API design of matrix operations. Maybe there could
> be functions that would take row vectors as parameters in addition to those
> that take matrix structs. That way one could call matrix functions on data
> that isn't stored as matrix structures without copying. So for example for
> the transpose function there would also be a function that would be used
> like this (a* are inputs and r* are outputs):
>
> transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);
>
> Maybe those functions could be used to implement the functions that take and
> return structs.
>
> I also think that interleave and deinterleave operations would be useful.
> For four element float vectors those can be implemented with only one
> instruction at least for SSE (using unpcklps, unpckhps and shufps) and  NEON
> (using vuzp and vzip).
>
>
>> I have an
>> ARM compiler too now, so I'll be implementing/testing against that as
>> reference also.
>
>
> Could you please tell me how you got the ARM compiler to work?

There's a thread in d.gnu with Linux and MinGW cross compiler binaries.


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
February 06, 2012
Re: std.simd module
> There's a thread in d.gnu with Linux and MinGW cross compiler 
> binaries.

I didn't know that, thanks.
February 06, 2012
Re: std.simd module
On 6 February 2012 17:13, a <a@a.com> wrote:

>> True, I have only been working in x86 GDC so far, but I just wanted to get
>> feedback about my approach and API design at this point.
>> It seems there are no serious objections, I'll continue as is.
>>
>
> I have one proposal about API design of matrix operations. Maybe there
> could be functions that would take row vectors as parameters in addition to
> those that take matrix structs. That way one could call matrix functions on
> data that isn't stored as matrix structures without copying. So for example
> for the transpose function there would also be a function that would be
> used like this (a* are inputs and r* are outputs):
>
> transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);
>

... the problem is, without multiple return values (come on, D should have
multiple return values!), how do you return the result? :)


> Maybe those functions could be used to implement the functions that take
> and return structs.
>

Yes... I've been pondering how to do this properly for ages actually.
That's the main reason I haven't fleshed out any matrix functions yet; I'm
still not at all sold on how to represent the matrices.
Ideally, there should not be any memory access. But even if they pass by
ref/pointer, as soon as the function is inlined, the memory access will
disappear, and it'll effectively generate the same code...

So the problem is not so much with respect to THIS API, but with respect to
the matrix calling convention in general...

> I also think that interleave and deinterleave operations would be useful.
> For four element float vectors those can be implemented with only one
> instruction at least for SSE (using unpcklps, unpckhps and shufps) and
>  NEON (using vuzp and vzip).


Sure. I wasn't sure how useful they were in practice... I didn't want to
load it with countless silly permutation routines so I figured I'll add
them by request, or as they are proven useful in real world apps.
What would you typically do with the interleave functions at a high level?
Sure you don't just use it as a component behind a few actually useful
functions which should be exposed instead?

>> I have an
>> ARM compiler too now, so I'll be implementing/testing against that as
>> reference also.
>>
>
> Could you please tell me how you got the ARM compiler to work?
>

I did not.. It was the work of another fine chap in the gdc newsgroup ;)
February 06, 2012
Re: std.simd module
>> used like this (a* are inputs and r* are outputs):
>>
>> transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);
>>
>
> ... the problem is, without multiple return values (come on, D 
> should have
> multiple return values!), how do you return the result? :)
>
>
>> Maybe those functions could be used to implement the functions 
>> that take
>> and return structs.
>>
>
> Yes... I've been pondering how to do this properly for ages 
> actually.
> That's the main reason I haven't fleshed out any matrix 
> functions yet; I'm
> still not at all sold on how to represent the matrices.
> Ideally, there should not be any memory access. But even if 
> they pass by
> ref/pointer, as soon as the function is inlined, the memory 
> access will
> disappear, and it'll effectively generate the same code...

I meant having functions that would return through reference 
parameters. The transpose function above would have signature 
transpose(float4, float4, float4, float4, ref float4, ref float4, 
ref float4, ref float4).
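A minimal sketch of that signature, assuming `float[4]` as a portable stand-in for `float4` (the alias `Vec4` and the struct `Matrix4` are hypothetical, not part of any existing module). It also shows how a struct-based overload could be layered on top of the row-based one. A real implementation would use shuffles rather than per-lane copies.

```d
// Hypothetical stand-in for float4 so the sketch runs on any target.
alias Vec4 = float[4];

// Row-based transpose: inputs by value, results through ref parameters.
// Because the inputs are copied at the call, outputs may alias inputs.
void transpose(Vec4 aX, Vec4 aY, Vec4 aZ, Vec4 aW,
               ref Vec4 rX, ref Vec4 rY, ref Vec4 rZ, ref Vec4 rW)
{
    rX = [aX[0], aY[0], aZ[0], aW[0]];
    rY = [aX[1], aY[1], aZ[1], aW[1]];
    rZ = [aX[2], aY[2], aZ[2], aW[2]];
    rW = [aX[3], aY[3], aZ[3], aW[3]];
}

// Hypothetical matrix struct; the overload below simply forwards
// to the row-based function.
struct Matrix4
{
    Vec4[4] rows;
}

Matrix4 transpose(Matrix4 m)
{
    Matrix4 r;
    transpose(m.rows[0], m.rows[1], m.rows[2], m.rows[3],
              r.rows[0], r.rows[1], r.rows[2], r.rows[3]);
    return r;
}

void main()
{
    Vec4 x = [1f, 2, 3, 4];
    Vec4 y = [5f, 6, 7, 8];
    Vec4 z = [9f, 10, 11, 12];
    Vec4 w = [13f, 14, 15, 16];

    // In-place use: the same variables serve as inputs and outputs.
    transpose(x, y, z, w, x, y, z, w);
    assert(x == [1f, 5, 9, 13]);
    assert(w == [4f, 8, 12, 16]);

    // Transposing twice through the struct overload restores the rows.
    Matrix4 m;
    m.rows[0] = x; m.rows[1] = y; m.rows[2] = z; m.rows[3] = w;
    Matrix4 back = transpose(transpose(m));
    assert(back.rows[1] == y);
}
```

Taking the inputs by value is what makes the in-place call safe; with `ref` inputs the first assignment would overwrite data that later rows still need.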

> Sure. I wasn't sure how useful they were in practice... I 
> didn't want to
> load it with countless silly permutation routines so I figured 
> I'll add
> them by request, or as they are proven useful in real world 
> apps.
> What would you typically do with the interleave functions at a 
> high level?
> Sure you don't just use it as a component behind a few actually 
> useful
> functions which should be exposed instead?

I think they would be useful when you work with arrays of structs 
with two elements, such as complex numbers. For example, to 
square every element of a complex array you could do:

for(size_t i = 0; i < a.length; i += 2)
{
   float4 first = a[i];
   float4 second = a[i + 1];
   float4 re = deinterleaveLow(first, second);   // real parts
   float4 im = deinterleaveHigh(first, second);  // imaginary parts
   float4 re2 = re * re - im * im;               // Re(z^2) = re^2 - im^2
   float4 im2 = re * im;
   im2 += im2;                                   // Im(z^2) = 2 * re * im
   a[i] = interleaveLow(re2, im2);
   a[i + 1] = interleaveHigh(re2, im2);
}
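For anyone who wants to check the arithmetic, here is a runnable scalar sketch of the same loop. It assumes `float[4]` as a stand-in for `float4` (the alias `Vec4` and the function `squareComplex` are hypothetical), and it assumes the interleave/deinterleave lane orderings given in the comments, which are one plausible definition rather than a settled API.

```d
// Hypothetical stand-in for float4; each Vec4 holds two complex
// numbers stored as interleaved (re, im) pairs.
alias Vec4 = float[4];

Vec4 interleaveLow(Vec4 a, Vec4 b)    { Vec4 r = [a[0], b[0], a[1], b[1]]; return r; }
Vec4 interleaveHigh(Vec4 a, Vec4 b)   { Vec4 r = [a[2], b[2], a[3], b[3]]; return r; }
Vec4 deinterleaveLow(Vec4 a, Vec4 b)  { Vec4 r = [a[0], a[2], b[0], b[2]]; return r; }
Vec4 deinterleaveHigh(Vec4 a, Vec4 b) { Vec4 r = [a[1], a[3], b[1], b[3]]; return r; }

// Squares every complex number in place, four at a time.
void squareComplex(Vec4[] a)
{
    assert(a.length % 2 == 0, "needs an even number of vectors");
    for (size_t i = 0; i < a.length; i += 2)
    {
        Vec4 re = deinterleaveLow(a[i], a[i + 1]);   // four real parts
        Vec4 im = deinterleaveHigh(a[i], a[i + 1]);  // four imaginary parts
        Vec4 re2, im2;
        foreach (k; 0 .. 4)
        {
            re2[k] = re[k] * re[k] - im[k] * im[k];  // Re(z^2)
            im2[k] = 2 * re[k] * im[k];              // Im(z^2)
        }
        a[i] = interleaveLow(re2, im2);
        a[i + 1] = interleaveHigh(re2, im2);
    }
}

void main()
{
    // 1+2i, 3+4i, 0+1i, 2+0i
    Vec4[2] data;
    data[0] = [1f, 2, 3, 4];
    data[1] = [0f, 1, 2, 0];

    squareComplex(data[]);

    // (1+2i)^2 = -3+4i, (3+4i)^2 = -7+24i, i^2 = -1, 2^2 = 4
    assert(data[0] == [-3f, 4, -7, 24]);
    assert(data[1] == [-1f, 0, 4, 0]);
}
```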

Interleave and deinterleave can also be useful when you want to 
shuffle data in some custom way. You can't cover all possible 
permutations of elements over multiple vectors in a library 
(unless you do something like A* search at compile time and 
generate code based on that, but that would probably be way too 
slow), but you can expose at least the capabilities that are 
common to most platforms, such as interleave and deinterleave.