January 17, 2012
On 1/17/2012 1:47 PM, Peter Alexander wrote:
> On 17/01/12 9:24 PM, Walter Bright wrote:
>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>> As Manu said, you need something like __restrict (or a linear type
>>> system) to
>>> solve this problem.
>>
>> No, you don't. It can be done with a runtime check, like array bounds
>> checking is done.
>
> So you'd change it to this, even in release builds?

No. Like array bounds, if they overlap, an exception is thrown.

Remember, the D spec says that overlapping arrays are illegal.
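
A minimal sketch of what such a runtime overlap check could look like, for illustration only. The helper name `overlaps` is hypothetical and is not a compiler or druntime API; it only shows that the test is two pointer comparisons, comparable in cost to a bounds check.

```d
// Hypothetical helper (an assumption for illustration, not a druntime function).
// Returns true when the two slices share at least one element of memory.
bool overlaps(T)(const(T)[] x, const(T)[] y)
{
    // Empty slices never overlap anything.
    if (x.length == 0 || y.length == 0)
        return false;

    // Two non-empty ranges overlap when each starts before the other ends.
    return x.ptr < y.ptr + y.length
        && y.ptr < x.ptr + x.length;
}
```

A checked vector op would call something like this on its operands and throw when it returns true, in the same way an out-of-range index throws.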
January 17, 2012
On 17/01/12 10:55 PM, Walter Bright wrote:
> On 1/17/2012 1:47 PM, Peter Alexander wrote:
>> On 17/01/12 9:24 PM, Walter Bright wrote:
>>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>>> As Manu said, you need something like __restrict (or a linear type
>>>> system) to
>>>> solve this problem.
>>>
>>> No, you don't. It can be done with a runtime check, like array bounds
>>> checking is done.
>>
>> So you'd change it to this, even in release builds?
>
> No. Like array bounds, if they overlap, an exception is thrown.
>
> Remember, the D spec says that overlapping arrays are illegal.

The D spec says that overlapping arrays are illegal for vector ops. The foo(int[], int[], int[]) function does not use vector ops.

Or am I missing something really major?

For example, is this legal code?

int[100] a;
int[] b = a[0..100];
int[] c = a[10..90]; // Illegal? b and c overlap...

foreach (i; 0..80)
    c[i] = b[i]; // Illegal?

I know that b[] = c[] would be illegal, but that has nothing to do with the prior discussion.
January 17, 2012
On 1/17/2012 3:23 PM, Peter Alexander wrote:
> On 17/01/12 10:55 PM, Walter Bright wrote:
>> On 1/17/2012 1:47 PM, Peter Alexander wrote:
>>> On 17/01/12 9:24 PM, Walter Bright wrote:
>>>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>>>> As Manu said, you need something like __restrict (or a linear type
>>>>> system) to
>>>>> solve this problem.
>>>>
>>>> No, you don't. It can be done with a runtime check, like array bounds
>>>> checking is done.
>>>
>>> So you'd change it to this, even in release builds?
>>
>> No. Like array bounds, if they overlap, an exception is thrown.
>>
>> Remember, the D spec says that overlapping arrays are illegal.
>
> The D spec says that overlapping arrays are illegal for vector ops. The
> foo(int[], int[], int[]) function does not use vector ops.
>
> Or am I missing something really major?
>
> For example, is this legal code?
>
> int[100] a;
> int[] b = a[0..100];
> int[] c = a[10..90]; // Illegal? b and c overlap...

No, not illegal.

>
> foreach (i; 0..80)
> c[i] = b[i]; // Illegal?

No, not illegal.

> I know that b[] = c[] would be illegal, but that has nothing to do with the
> prior discussion.

Yes, b[]=c[] is illegal.
January 18, 2012
On 17/01/12 11:34 PM, Walter Bright wrote:
> On 1/17/2012 3:23 PM, Peter Alexander wrote:
>> On 17/01/12 10:55 PM, Walter Bright wrote:
>>> On 1/17/2012 1:47 PM, Peter Alexander wrote:
>>>> On 17/01/12 9:24 PM, Walter Bright wrote:
>>>>> On 1/17/2012 1:20 PM, Peter Alexander wrote:
>>>>>> As Manu said, you need something like __restrict (or a linear type
>>>>>> system) to
>>>>>> solve this problem.
>>>>>
>>>>> No, you don't. It can be done with a runtime check, like array bounds
>>>>> checking is done.
>>>>
>>>> So you'd change it to this, even in release builds?
>>>
>>> No. Like array bounds, if they overlap, an exception is thrown.
>>>
>>> Remember, the D spec says that overlapping arrays are illegal.
>>
>> The D spec says that overlapping arrays are illegal for vector ops. The
>> foo(int[], int[], int[]) function does not use vector ops.
>>
>> Or am I missing something really major?
>>
>> For example, is this legal code?
>>
>> int[100] a;
>> int[] b = a[0..100];
>> int[] c = a[10..90]; // Illegal? b and c overlap...
>
> No, not illegal.
>
>>
>> foreach (i; 0..80)
>> c[i] = b[i]; // Illegal?
>
> No, not illegal.
>
>> I know that b[] = c[] would be illegal, but that has nothing to do
>> with the
>> prior discussion.
>
> Yes, b[]=c[] is illegal.

So, my original point still stands, you can't vectorise this function:

void foo(int[] a, int[] b, int[] c)
{
  foreach (i; 0..256)
    a[i] = b[i] + c[i];
}

Those slices are allowed to overlap, so this cannot be automatically vectorised (without inlining to get better context about those arrays).

Without inlining, you need something along the lines of __restrict or uniqueness typing.
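
To make the aliasing problem concrete, here is a small sketch (the name `fooScalar` and the test data are made up for illustration). When the destination overlaps the sources, each iteration reads a value written by the previous one, so the scalar loop and a load-everything-then-store vector version give different answers.

```d
// Same shape as foo above, but sized by the slice instead of a fixed 256.
void fooScalar(int[] a, int[] b, int[] c)
{
    foreach (i; 0 .. a.length)
        a[i] = b[i] + c[i];
}

// Hypothetical driver: a is shifted one element to the right of b and c.
int[] overlapDemo()
{
    int[] buf = [1, 0, 0, 0, 0];
    fooScalar(buf[1 .. 5], buf[0 .. 4], buf[0 .. 4]);
    // Scalar order: buf[1] = 1+1, buf[2] = 2+2, buf[3] = 4+4, buf[4] = 8+8.
    // A 4-wide version that loaded b and c before storing would instead
    // leave buf as [1, 2, 0, 0, 0].
    return buf;
}
```

The compiler sees only three `int[]` parameters inside the function, so it must assume this case can happen unless it is told otherwise.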
January 18, 2012
On 1/17/2012 4:19 PM, Peter Alexander wrote:
> So, my original point still stands, you can't vectorise this function:
>
> void foo(int[] a, int[] b, int[] c)
> {
> foreach (i; 0..256)
> a[i] = b[i] + c[i];
> }
>
> Those slices are allowed to overlap, so this cannot be automatically vectorised
> (without inlining to get better context about those arrays).
>
> Without inlining, you need something along the lines of __restrict or uniqueness
> typing.

No, you can rewrite it as:

   a[] = b[] + c[];

and you don't need __restrict or uniqueness. That's what the vector operations are for.
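
As a sketch of that rewrite (the function name `fooVec` is hypothetical), the whole loop becomes one array operation. This assumes all three slices have the same length and do not overlap, which is what the spec requires for vector ops.

```d
// Element-wise add expressed as a D array operation.
// Assumption: a, b and c have equal length and do not overlap.
void fooVec(int[] a, const int[] b, const int[] c)
{
    a[] = b[] + c[];
}

// Hypothetical usage with three separate arrays.
int[] fooVecDemo()
{
    auto a = new int[4];
    int[] b = [1, 2, 3, 4];
    int[] c = [10, 20, 30, 40];
    fooVec(a, b, c);
    return a;
}
```

Because the no-overlap requirement is part of the operation's contract, the implementation is free to process several elements at a time.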
January 18, 2012
On 01/18/2012 02:04 AM, Walter Bright wrote:
> On 1/17/2012 4:19 PM, Peter Alexander wrote:
>> So, my original point still stands, you can't vectorise this function:
>>
>> void foo(int[] a, int[] b, int[] c)
>> {
>> foreach (i; 0..256)
>> a[i] = b[i] + c[i];
>> }
>>
>> Those slices are allowed to overlap, so this cannot be automatically
>> vectorised
>> (without inlining to get better context about those arrays).
>>
>> Without inlining, you need something along the lines of __restrict or
>> uniqueness
>> typing.
>
> No, you can rewrite it as:
>
> a[] = b[] + c[];
>
> and you don't need __restrict or uniqueness. That's what the vector
> operations are for.

Are they really a general solution? How do you use vector ops to implement an efficient matrix multiply, for instance?
January 18, 2012
Timon Gehr wrote:
> Are they really a general solution? How do you use vector ops to implement an efficient matrix multiply, for instance?

struct Matrix4
{
   float4 x, y, z, w;

   auto transform(Matrix4 mat)
   {
       Matrix4 rmat;

       float4 cx = {mat.x.x, mat.y.x, mat.z.x, mat.w.x};
       float4 cy = {mat.x.y, mat.y.y, mat.z.y, mat.w.y};
       float4 cz = {mat.x.z, mat.y.z, mat.z.z, mat.w.z};
       float4 cw = {mat.x.w, mat.y.w, mat.z.w, mat.w.w};

       float4 rx = {mat.x.x, mat.x.y, mat.x.z, mat.x.w};
       float4 ry = {mat.y.x, mat.y.y, mat.y.z, mat.y.w};
       float4 rz = {mat.z.x, mat.z.y, mat.z.z, mat.z.w};
       float4 rw = {mat.w.x, mat.w.y, mat.w.z, mat.w.w};

       rmat.x = cx * rx; // simd
       rmat.y = cy * ry; // simd
       rmat.z = cz * rz; // simd
       rmat.w = cw * rw; // simd

       return rmat;
   }
}
January 18, 2012
On 01/18/2012 02:32 AM, F i L wrote:
> Timon Gehr wrote:
>> Are they really a general solution? How do you use vector ops to
>> implement an efficient matrix multiply, for instance?
>
> struct Matrix4
> {
> float4 x, y, z, w;
>
> auto transform(Matrix4 mat)
> {
> Matrix4 rmat;
>
> float4 cx = {mat.x.x, mat.y.x, mat.z.x, mat.w.x};
> float4 cy = {mat.x.y, mat.y.y, mat.z.y, mat.w.y};
> float4 cz = {mat.x.z, mat.y.z, mat.z.z, mat.w.z};
> float4 cw = {mat.x.w, mat.y.w, mat.z.w, mat.w.w};
>
> float4 rx = {mat.x.x, mat.x.y, mat.x.z, mat.x.w};
> float4 ry = {mat.y.x, mat.y.y, mat.y.z, mat.y.w};
> float4 rz = {mat.z.x, mat.z.y, mat.z.z, mat.z.w};
> float4 rw = {mat.w.x, mat.w.y, mat.w.z, mat.w.w};
>
> rmat.x = cx * rx; // simd
> rmat.y = cy * ry; // simd
> rmat.z = cz * rz; // simd
> rmat.w = cw * rw; // simd
>
> return rmat;
> }
> }

The parameter is just squared and returned?

Anyway, I was after a general matrix*matrix multiplication, where the operands can get arbitrarily large and where any potential use of __restrict is rendered unnecessary by array vector ops.
January 18, 2012
On Wednesday, 18 January 2012 at 01:50:00 UTC, Timon Gehr wrote:

> Anyway, I was after a general matrix*matrix multiplication, where the operands can get arbitrarily large and where any potential use of __restrict is rendered unnecessary by array vector ops.

Here you go. But I agree there are use cases for restrict where vector operations don't help.

void matmul(A,B,C)(A a, B b, C c, size_t n, size_t m, size_t l)
{
   for(size_t i = 0; i < n; i++)
   {
       c[i*l..i*l + l] = 0;
       for(size_t j = 0; j < m; j++)
           c[i*l..i*l + l] += a[i*m + j] * b[j*l..j*l + l];
   }
}
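
A small usage sketch of the same row-by-row scheme, assuming flat row-major storage (a is n×m, b is m×l, c is n×l). The name `matmulRows`, the concrete `double[]` types and the 2×2 data are assumptions for illustration; the function above is generic over its array types.

```d
// Row-major matrix multiply built from slice-wide array operations.
// Assumption: c does not overlap a or b.
void matmulRows(double[] a, double[] b, double[] c,
                size_t n, size_t m, size_t l)
{
    foreach (i; 0 .. n)
    {
        auto row = c[i*l .. i*l + l];   // row i of the result
        row[] = 0;                      // clear it (new double[] starts as NaN)
        foreach (j; 0 .. m)
            row[] += a[i*m + j] * b[j*l .. j*l + l]; // scalar times row j of b
    }
}

// Hypothetical 2x2 example: [1 2; 3 4] * [5 6; 7 8].
double[] matmulDemo()
{
    double[] a = [1, 2, 3, 4];
    double[] b = [5, 6, 7, 8];
    auto c = new double[4];
    matmulRows(a, b, c, 2, 2, 2);
    return c;
}
```

Only the inner row update is an array operation here, so the outer loops stay scalar while each row is free to be processed several elements at a time.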


January 18, 2012
Timon Gehr wrote:
> The parameter is just squared and returned?

No, sorry, that code is all screwed up and missing a step.
My matrix multiply code looks like this:

auto transform(U)(Matrix4!U m) if (isImplicitlyConvertible!(U, T))
{
   return Matrix4 (
       Vector4 (
           (m.x.x*x.x) + (m.x.y*y.x) + (m.x.z*z.x) + (m.x.w*w.x),
           (m.x.x*x.y) + (m.x.y*y.y) + (m.x.z*z.y) + (m.x.w*w.y),
           (m.x.x*x.z) + (m.x.y*y.z) + (m.x.z*z.z) + (m.x.w*w.z),
           (m.x.x*x.w) + (m.x.y*y.w) + (m.x.z*z.w) + (m.x.w*w.w)
       ),
       Vector4 (
           (m.y.x*x.x) + (m.y.y*y.x) + (m.y.z*z.x) + (m.y.w*w.x),
           (m.y.x*x.y) + (m.y.y*y.y) + (m.y.z*z.y) + (m.y.w*w.y),
           (m.y.x*x.z) + (m.y.y*y.z) + (m.y.z*z.z) + (m.y.w*w.z),
           (m.y.x*x.w) + (m.y.y*y.w) + (m.y.z*z.w) + (m.y.w*w.w)
       ),
       Vector4 (
           (m.z.x*x.x) + (m.z.y*y.x) + (m.z.z*z.x) + (m.z.w*w.x),
           (m.z.x*x.y) + (m.z.y*y.y) + (m.z.z*z.y) + (m.z.w*w.y),
           (m.z.x*x.z) + (m.z.y*y.z) + (m.z.z*z.z) + (m.z.w*w.z),
           (m.z.x*x.w) + (m.z.y*y.w) + (m.z.z*z.w) + (m.z.w*w.w)
       ),
       Vector4 (
           (m.w.x*x.x) + (m.w.y*y.x) + (m.w.z*z.x) + (m.w.w*w.x),
           (m.w.x*x.y) + (m.w.y*y.y) + (m.w.z*z.y) + (m.w.w*w.y),
           (m.w.x*x.z) + (m.w.y*y.z) + (m.w.z*z.z) + (m.w.w*w.z),
           (m.w.x*x.w) + (m.w.y*y.w) + (m.w.z*z.w) + (m.w.w*w.w)
       )
   );
}

Though in my earlier test with Mono.Simd, the identical C# code had to be converted to something more like my previous example before SIMD would kick in. I don't know if D's compiler is good enough to optimize the above code into SIMD ops, but I doubt it.


> Anyway, I was after a general matrix*matrix multiplication, where the operands can get arbitrarily large and where any potential use of __restrict is rendered unnecessary by array vector ops.

I don't know enough about simd to confidently discuss this, but I'd imagine there'd have to be quite a lot of compiler magic happening before arbitrarily sized matrix constructs could make use of simd.