SIMD benchmark (page 7) - D Programming Language Discussion Forum

January 17, 2012

Re: SIMD benchmark

Posted by Walter Bright
in reply to Peter Alexander

On 1/17/2012 1:47 PM, Peter Alexander wrote:
> On 17/01/12 9:24 PM, Walter Bright wrote:
>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>> As Manu said, you need something like __restrict (or a linear type
>>> system) to
>>> solve this problem.
>>
>> No, you don't. It can be done with a runtime check, like array bounds
>> checking is done.
>
> So you'd change it to this, even in release builds?

No. Like array bounds, if they overlap, an exception is thrown.

Remember, the D spec says that overlapping arrays are illegal.
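
As an illustration (not part of the original post), the runtime check described here could be sketched as below. The helper names `overlaps` and `addChecked` are hypothetical, and using `assert` is an assumption standing in for whatever check the compiler or runtime would actually insert. Like a bounds check, the assert is compiled out by `-release`.

```d
/// Hypothetical helper: true if the two slices share at least one element.
bool overlaps(const(int)[] x, const(int)[] y)
{
    // An empty slice cannot overlap anything.
    if (x.length == 0 || y.length == 0)
        return false;
    // Two half-open address ranges intersect when each starts before the other ends.
    return x.ptr < y.ptr + y.length && y.ptr < x.ptr + x.length;
}

/// Hypothetical wrapper: checks the no-overlap rule, then uses the vector op.
void addChecked(int[] a, const(int)[] b, const(int)[] c)
{
    assert(a.length == b.length && a.length == c.length, "length mismatch");
    assert(!overlaps(a, b) && !overlaps(a, c), "overlapping slices");
    a[] = b[] + c[];
}

unittest
{
    int[8] buf = [1, 2, 3, 4, 5, 6, 7, 8];
    int[4] dst;
    addChecked(dst[], buf[0 .. 4], buf[4 .. 8]);
    assert(dst == [6, 8, 10, 12]);
    assert(overlaps(buf[0 .. 5], buf[4 .. 8]));   // they share buf[4]
    assert(!overlaps(buf[0 .. 4], buf[4 .. 8]));  // adjacent, not overlapping
}
```

The snippet has no `main`; one way to run it is `dmd -unittest -main`.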

January 17, 2012

Re: SIMD benchmark

Posted by Peter Alexander
in reply to Walter Bright

On 17/01/12 10:55 PM, Walter Bright wrote:
> On 1/17/2012 1:47 PM, Peter Alexander wrote:
>> On 17/01/12 9:24 PM, Walter Bright wrote:
>>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>>> As Manu said, you need something like __restrict (or a linear type
>>>> system) to
>>>> solve this problem.
>>>
>>> No, you don't. It can be done with a runtime check, like array bounds
>>> checking is done.
>>
>> So you'd change it to this, even in release builds?
>
> No. Like array bounds, if they overlap, an exception is thrown.
>
> Remember, the D spec says that overlapping arrays are illegal.

The D spec says that overlapping arrays are illegal for vector ops. The foo(int[], int[], int[]) function does not use vector ops.

Or am I missing something really major?

For example, is this legal code?

int[100] a;
int[] b = a[0..100];
int[] c = a[10..90]; // Illegal? b and c overlap...

foreach (i; 0..80)
    c[i] = b[i]; // Illegal?

I know that b[] = c[] would be illegal, but that has nothing to do with the prior discussion.

January 17, 2012

Re: SIMD benchmark

Posted by Walter Bright
in reply to Peter Alexander

On 1/17/2012 3:23 PM, Peter Alexander wrote:
> On 17/01/12 10:55 PM, Walter Bright wrote:
>> On 1/17/2012 1:47 PM, Peter Alexander wrote:
>>> On 17/01/12 9:24 PM, Walter Bright wrote:
>>>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>>>> As Manu said, you need something like __restrict (or a linear type
>>>>> system) to
>>>>> solve this problem.
>>>>
>>>> No, you don't. It can be done with a runtime check, like array bounds
>>>> checking is done.
>>>
>>> So you'd change it to this, even in release builds?
>>
>> No. Like array bounds, if they overlap, an exception is thrown.
>>
>> Remember, the D spec says that overlapping arrays are illegal.
>
> The D spec says that overlapping arrays are illegal for vector ops. The
> foo(int[], int[], int[]) function does not use vector ops.
>
> Or am I missing something really major?
>
> For example, is this legal code?
>
> int[100] a;
> int[] b = a[0..100];
> int[] c = a[10..90]; // Illegal? b and c overlap...

No, not illegal.

>
> foreach (i; 0..80)
> c[i] = b[i]; // Illegal?

No, not illegal.

> I know that b[] = c[] would be illegal, but that has nothing to do with the
> prior discussion.

Yes, b[]=c[] is illegal.

January 18, 2012

Re: SIMD benchmark

Posted by Peter Alexander
in reply to Walter Bright

On 17/01/12 11:34 PM, Walter Bright wrote:
> On 1/17/2012 3:23 PM, Peter Alexander wrote:
>> On 17/01/12 10:55 PM, Walter Bright wrote:
>>> On 1/17/2012 1:47 PM, Peter Alexander wrote:
>>>> On 17/01/12 9:24 PM, Walter Bright wrote:
>>>>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>>>>> As Manu said, you need something like __restrict (or a linear type
>>>>>> system) to
>>>>>> solve this problem.
>>>>>
>>>>> No, you don't. It can be done with a runtime check, like array bounds
>>>>> checking is done.
>>>>
>>>> So you'd change it to this, even in release builds?
>>>
>>> No. Like array bounds, if they overlap, an exception is thrown.
>>>
>>> Remember, the D spec says that overlapping arrays are illegal.
>>
>> The D spec says that overlapping arrays are illegal for vector ops. The
>> foo(int[], int[], int[]) function does not use vector ops.
>>
>> Or am I missing something really major?
>>
>> For example, is this legal code?
>>
>> int[100] a;
>> int[] b = a[0..100];
>> int[] c = a[10..90]; // Illegal? b and c overlap...
>
> No, not illegal.
>
>>
>> foreach (i; 0..80)
>> c[i] = b[i]; // Illegal?
>
> No, not illegal.
>
>> I know that b[] = c[] would be illegal, but that has nothing to do
>> with the
>> prior discussion.
>
> Yes, b[]=c[] is illegal.

So my original point still stands. You can't vectorise this function:

void foo(int[] a, int[] b, int[] c)
{
  foreach (i; 0..256)
    a[i] = b[i] + c[i];
}

Those slices are allowed to overlap, so this cannot be automatically vectorised (without inlining to get better context about those arrays).

Without inlining, you need something along the lines of __restrict or uniqueness typing.
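
As an illustration (not part of the original post), here is a small sketch of why the overlap matters. It assumes a variant of `foo` that loops over `a.length` instead of the fixed 256, so that a five-element buffer is enough. When `a` is the same memory as `b` shifted by one element, each store feeds the next load, and the element-by-element loop gives a different answer from a "load four, add, store four" sequence, which is roughly what a vectorised version would do.

```d
void foo(int[] a, int[] b, int[] c)
{
    foreach (i; 0 .. a.length)
        a[i] = b[i] + c[i];
}

unittest
{
    int[4] zeros;   // int.init is 0

    // Element-by-element loop: a[i] is the same memory as b[i + 1].
    int[5] buf = [1, 0, 0, 0, 0];
    foo(buf[1 .. 5], buf[0 .. 4], zeros[]);
    assert(buf == [1, 1, 1, 1, 1]);   // the 1 propagates through every element

    // Simulated 4-wide vector version: load all of b first, then store.
    int[5] buf2 = [1, 0, 0, 0, 0];
    int[4] loaded = buf2[0 .. 4];     // copy, like one SIMD load
    buf2[1 .. 5] = loaded[] + zeros[];
    assert(buf2 == [1, 1, 0, 0, 0]);  // differs from the loop result
}
```

Because the two results differ, a compiler that cannot prove the slices are disjoint has to keep the sequential behaviour.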

January 18, 2012

Re: SIMD benchmark

Posted by Walter Bright
in reply to Peter Alexander

On 1/17/2012 4:19 PM, Peter Alexander wrote:
> So, my original point still stands, you can't vectorise this function:
>
> void foo(int[] a, int[] b, int[] c)
> {
> foreach (i; 0..256)
> a[i] = b[i] + c[i];
> }
>
> Those slices are allowed to overlap, so this cannot be automatically vectorised
> (without inlining to get better context about those arrays).
>
> Without inlining, you need something along the lines of __restrict or uniqueness
> typing.

No, you can rewrite it as:

   a[] = b[] + c[];

and you don't need __restrict or uniqueness. That's what the vector operations are for.
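
As an illustration (not part of the original post), the rewritten function might look like the sketch below. The `const` qualifiers on `b` and `c` are an addition for the example. As discussed earlier in the thread, the operands of a vector operation must not overlap, so the caller is responsible for passing disjoint slices of equal length.

```d
void foo(int[] a, const(int)[] b, const(int)[] c)
{
    // Element-wise add over the whole slices; all three lengths must match.
    a[] = b[] + c[];
}

unittest
{
    int[] b = [1, 2, 3, 4];
    int[] c = [10, 20, 30, 40];
    auto a = new int[](4);
    foo(a, b, c);
    assert(a == [11, 22, 33, 44]);
}
```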

January 18, 2012

Re: SIMD benchmark

Posted by Timon Gehr
in reply to Walter Bright

On 01/18/2012 02:04 AM, Walter Bright wrote:
> On 1/17/2012 4:19 PM, Peter Alexander wrote:
>> So, my original point still stands, you can't vectorise this function:
>>
>> void foo(int[] a, int[] b, int[] c)
>> {
>> foreach (i; 0..256)
>> a[i] = b[i] + c[i];
>> }
>>
>> Those slices are allowed to overlap, so this cannot be automatically
>> vectorised
>> (without inlining to get better context about those arrays).
>>
>> Without inlining, you need something along the lines of __restrict or
>> uniqueness
>> typing.
>
> No, you can rewrite it as:
>
> a[] = b[] + c[];
>
> and you don't need __restrict or uniqueness. That's what the vector
> operations are for.

Are they really a general solution? How do you use vector ops to implement an efficient matrix multiply, for instance?

January 18, 2012

Re: SIMD benchmark

Posted by F i L
in reply to Timon Gehr

Timon Gehr wrote:
> Are they really a general solution? How do you use vector ops to implement an efficient matrix multiply, for instance?

struct Matrix4
{
   float4 x, y, z, w;

   auto transform(Matrix4 mat)
   {
       Matrix4 rmat;

       float4 cx = {mat.x.x, mat.y.x, mat.z.x, mat.w.x};
       float4 cy = {mat.x.y, mat.y.y, mat.z.y, mat.w.y};
       float4 cz = {mat.x.z, mat.y.z, mat.z.z, mat.w.z};
       float4 cw = {mat.x.w, mat.y.w, mat.z.w, mat.w.w};

       float4 rx = {mat.x.x, mat.x.y, mat.x.z, mat.x.w};
       float4 ry = {mat.y.x, mat.y.y, mat.y.z, mat.y.w};
       float4 rz = {mat.z.x, mat.z.y, mat.z.z, mat.z.w};
       float4 rw = {mat.w.x, mat.w.y, mat.w.z, mat.w.w};

       rmat.x = cx * rx; // simd
       rmat.y = cy * ry; // simd
       rmat.z = cz * rz; // simd
       rmat.w = cw * rw; // simd

       return rmat;
   }
}

January 18, 2012

Re: SIMD benchmark

Posted by Timon Gehr
in reply to F i L

On 01/18/2012 02:32 AM, F i L wrote:
> Timon Gehr wrote:
>> Are they really a general solution? How do you use vector ops to
>> implement an efficient matrix multiply, for instance?
>
> struct Matrix4
> {
> float4 x, y, z, w;
>
> auto transform(Matrix4 mat)
> {
> Matrix4 rmat;
>
> float4 cx = {mat.x.x, mat.y.x, mat.z.x, mat.w.x};
> float4 cy = {mat.x.y, mat.y.y, mat.z.y, mat.w.y};
> float4 cz = {mat.x.z, mat.y.z, mat.z.z, mat.w.z};
> float4 cw = {mat.x.w, mat.y.w, mat.z.w, mat.w.w};
>
> float4 rx = {mat.x.x, mat.x.y, mat.x.z, mat.x.w};
> float4 ry = {mat.y.x, mat.y.y, mat.y.z, mat.y.w};
> float4 rz = {mat.z.x, mat.z.y, mat.z.z, mat.z.w};
> float4 rw = {mat.w.x, mat.w.y, mat.w.z, mat.w.w};
>
> rmat.x = cx * rx; // simd
> rmat.y = cy * ry; // simd
> rmat.z = cz * rz; // simd
> rmat.w = cw * rw; // simd
>
> return rmat;
> }
> }

The parameter is just squared and returned?

Anyway, I was after a general matrix*matrix multiplication, where the operands can get arbitrarily large and where any potential use of __restrict is rendered unnecessary by array vector ops.

January 18, 2012

Re: SIMD benchmark

Posted by a
in reply to Timon Gehr

On Wednesday, 18 January 2012 at 01:50:00 UTC, Timon Gehr wrote:

> Anyway, I was after a general matrix*matrix multiplication, where the operands can get arbitrarily large and where any potential use of __restrict is rendered unnecessary by array vector ops.

Here you go. But I agree there are use cases for restrict where vector operations don't help.

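// Row-major layout: a is n x m, b is m x l, c is n x l (the result).
void matmul(A,B,C)(A a, B b, C c, size_t n, size_t m, size_t l)
{
   for(size_t i = 0; i < n; i++)
   {
       // Clear row i of c, then accumulate a[i][j] * (row j of b) into it.
       c[i*l..i*l + l] = 0;
       for(size_t j = 0; j < m; j++)
           c[i*l..i*l + l] += a[i*m + j] * b[j*l..j*l + l];
   }
}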
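
As an illustration (not part of the original post), the function above can be checked against a textbook triple loop. The sketch repeats `matmul` so that it compiles on its own. `matmulNaive` is a hypothetical reference implementation, and the flat row-major `double[]` arguments are an assumption about how the template is meant to be called.

```d
void matmul(A, B, C)(A a, B b, C c, size_t n, size_t m, size_t l)
{
    for (size_t i = 0; i < n; i++)
    {
        c[i*l .. i*l + l] = 0;
        for (size_t j = 0; j < m; j++)
            c[i*l .. i*l + l] += a[i*m + j] * b[j*l .. j*l + l];
    }
}

/// Hypothetical reference: one scalar dot product per output element.
void matmulNaive(const(double)[] a, const(double)[] b, double[] c,
                 size_t n, size_t m, size_t l)
{
    foreach (i; 0 .. n)
        foreach (k; 0 .. l)
        {
            double sum = 0;
            foreach (j; 0 .. m)
                sum += a[i*m + j] * b[j*l + k];
            c[i*l + k] = sum;
        }
}

unittest
{
    double[] a = [1, 2, 3, 4, 5, 6];        // 2 x 3
    double[] b = [7, 8, 9, 10, 11, 12];     // 3 x 2
    auto c1 = new double[](4);              // 2 x 2
    auto c2 = new double[](4);
    matmul(a, b, c1, 2, 3, 2);
    matmulNaive(a, b, c2, 2, 3, 2);
    assert(c1 == [58.0, 64.0, 139.0, 154.0]);
    assert(c1 == c2);
}
```

The vector-op version works a row of `c` at a time, so its inner statement is a scalar times a contiguous row of `b`, which is the form the array operations accept.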

January 18, 2012

Re: SIMD benchmark

Posted by F i L
in reply to Timon Gehr

Timon Gehr wrote:
> The parameter is just squared and returned?

No, sorry, that code is all screwed up and is missing a step.
My matrix multiply code looks like this:

auto transform(U)(Matrix4!U m) if (isImplicitlyConvertible!(U, T))
{
   return Matrix4 (
       Vector4 (
           (m.x.x*x.x) + (m.x.y*y.x) + (m.x.z*z.x) + (m.x.w*w.x),
           (m.x.x*x.y) + (m.x.y*y.y) + (m.x.z*z.y) + (m.x.w*w.y),
           (m.x.x*x.z) + (m.x.y*y.z) + (m.x.z*z.z) + (m.x.w*w.z),
           (m.x.x*x.w) + (m.x.y*y.w) + (m.x.z*z.w) + (m.x.w*w.w)
       ),
       Vector4 (
           (m.y.x*x.x) + (m.y.y*y.x) + (m.y.z*z.x) + (m.y.w*w.x),
           (m.y.x*x.y) + (m.y.y*y.y) + (m.y.z*z.y) + (m.y.w*w.y),
           (m.y.x*x.z) + (m.y.y*y.z) + (m.y.z*z.z) + (m.y.w*w.z),
           (m.y.x*x.w) + (m.y.y*y.w) + (m.y.z*z.w) + (m.y.w*w.w)
       ),
       Vector4 (
           (m.z.x*x.x) + (m.z.y*y.x) + (m.z.z*z.x) + (m.z.w*w.x),
           (m.z.x*x.y) + (m.z.y*y.y) + (m.z.z*z.y) + (m.z.w*w.y),
           (m.z.x*x.z) + (m.z.y*y.z) + (m.z.z*z.z) + (m.z.w*w.z),
           (m.z.x*x.w) + (m.z.y*y.w) + (m.z.z*z.w) + (m.z.w*w.w)
       ),
       Vector4 (
           (m.w.x*x.x) + (m.w.y*y.x) + (m.w.z*z.x) + (m.w.w*w.x),
           (m.w.x*x.y) + (m.w.y*y.y) + (m.w.z*z.y) + (m.w.w*w.y),
           (m.w.x*x.z) + (m.w.y*y.z) + (m.w.z*z.z) + (m.w.w*w.z),
           (m.w.x*x.w) + (m.w.y*y.w) + (m.w.z*z.w) + (m.w.w*w.w)
       )
   );
}

Though when I tested this before with Mono.Simd, using identical C# code, it had to be converted to something more like my previous example in order for SIMD to kick in. IDK if D's compiler is good enough to optimize the above code into SIMD ops, but I doubt it.
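
As an illustration (not part of the original post), the same 4x4 product can be written one row at a time with array operations instead of sixteen scalar expressions. The `Mat4` alias and the free function `mul` are hypothetical names, and whether a given compiler turns the row expression into SIMD instructions is not something this sketch shows.

```d
alias Mat4 = float[4][4];   // row-major: m[row][column]

/// Returns a * b. Row i of the result is a linear combination of the rows of b.
Mat4 mul(const Mat4 a, const Mat4 b)
{
    Mat4 r;
    foreach (i; 0 .. 4)
        r[i][] = a[i][0] * b[0][] + a[i][1] * b[1][]
               + a[i][2] * b[2][] + a[i][3] * b[3][];
    return r;
}

unittest
{
    Mat4 id = [[1f, 0, 0, 0], [0f, 1, 0, 0], [0f, 0, 1, 0], [0f, 0, 0, 1]];
    Mat4 m  = [[1f, 2, 3, 4], [5f, 6, 7, 8],
               [9f, 10, 11, 12], [13f, 14, 15, 16]];
    // Multiplying by the identity on either side leaves m unchanged.
    assert(mul(id, m) == m);
    assert(mul(m, id) == m);
}
```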


> Anyway, I was after a general matrix*matrix multiplication, where the operands can get arbitrarily large and where any potential use of __restrict is rendered unnecessary by array vector ops.

I don't know enough about simd to confidently discuss this, but I'd imagine there'd have to be quite a lot of compiler magic happening before arbitrarily sized matrix constructs could make use of simd.

Copyright © 1999-2021 by the D Language Foundation