Array operations -- not actually so difficult - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Array operations -- not actually so difficult

Thread overview

Array operations -- not actually so difficult
Dec 15, 2006 Don Clugston
Dec 15, 2006 Nox / Lux
Dec 15, 2006 Charlie
Dec 15, 2006 Don Clugston
Dec 15, 2006 Bill Baxter
Dec 15, 2006 Sean Kelly
Dec 15, 2006 Bill Baxter
Dec 15, 2006 Kirk McDonald
Dec 15, 2006 Bill Baxter
Dec 16, 2006 Don Clugston
Dec 15, 2006 Knud Sørensen
Dec 16, 2006 Don Clugston
Dec 16, 2006 Stewart Gordon

December 15, 2006

Array operations -- not actually so difficult

Posted by Don Clugston

Don Clugston

Array operations are one of the very few 2.0 features that haven't made it into 1.0. My experiments into expression templates suggest that it might not be very complicated after all.

Consider something like a dot product,

real a[], b[], c[];
real d = dot(a+b, c);

For performance, it would be better if the compiler actually *didn't* evaluate a+b. Instead, that should compile to:

real sum = 0;
for (int i=0; i<c.length; ++i) {
     sum+=(a[i]+b[i])*c[i];
}
d=sum;

In fact, we could create a fantastic library implementation of array operations with a tiny bit of syntactic sugar:
instead of generating an error message for array operations, look for an opXXX function, and treat it the same way the implicit array properties work.
----------
void opMulAssign(real [] a, real w)
{
    for(int i=0; i<a.length; ++i) { a[i]*=w; }
}

void main()
{
    real b[] = [1.0L, 2, 3, 4];
    b.opMulAssign(3.5); // OK.
    b *= 3.5; // Fails with "Error: 'b' is not a scalar, it is a real[]"
}
----------
So the message to Walter is, support for array operations will not require any change to the back end. They are just a bit of syntactic sugar at the front end.

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Nox / Lux
in reply to Don Clugston

Nox / Lux

Posted in reply to Don Clugston

Don Clugston wrote:
>Consider something like a dot product,
>
>real a[], b[], c[];
>real d = dot(a+b, c);

Hmm, speaking of array operations and dot product ... Seeing as this (dot product) is one of the added instructions of the upcoming SSE4, would it be possible to automatically make use of this instruction behind the scenes, compiling a normal version only if SSE4 is not supported?

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Charlie
in reply to Nox / Lux

Charlie

Posted in reply to Nox / Lux

Nox / Lux wrote:
> Don Clugston wrote:
>> Consider something like a dot product,
>>
>> real a[], b[], c[];
>> real d = dot(a+b, c);
> 
> Hmm, speaking of array operations and dot product ... Seeing as this (dot product)
> is one of the added instructions of the upcoming SSE4, would it be possible to
> automatically make use of this instruction behind the scenes, compiling a normal
> version only if SSE4 is not supported?

I think this is the main reason why it hasn't been implemented yet ? The backend can pull some tricks to optimize certain operations.

I would certainly like to see array ops implemented though, maybe this can be a 1.0 solution, and re-written for 2.0 ?

Charlie

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Don Clugston
in reply to Nox / Lux

Don Clugston

Posted in reply to Nox / Lux

Nox / Lux wrote:
> Don Clugston wrote:
>> Consider something like a dot product,
>>
>> real a[], b[], c[];
>> real d = dot(a+b, c);
> 
> Hmm, speaking of array operations and dot product ... Seeing as this (dot product)
> is one of the added instructions of the upcoming SSE4, would it be possible to
> automatically make use of this instruction behind the scenes, compiling a normal
> version only if SSE4 is not supported?

Since DMD has not support for 64 bit yet, and AFAIK none is planned yet, I think it will be a very long time before SSE4 is supported. But, it should be possible to link to BLAS routines that make use of the new instructions.

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Bill Baxter
in reply to Don Clugston

Bill Baxter

Posted in reply to Don Clugston

Don Clugston wrote:
> Array operations are one of the very few 2.0 features that haven't made it into 1.0. My experiments into expression templates suggest that it might not be very complicated after all.
> 
> Consider something like a dot product,
> 
> real a[], b[], c[];
> real d = dot(a+b, c);
> 
> For performance, it would be better if the compiler actually *didn't* evaluate a+b. Instead, that should compile to:
> 
> real sum = 0;
> for (int i=0; i<c.length; ++i) {
>      sum+=(a[i]+b[i])*c[i];
> }
> d=sum;
> 
> In fact, we could create a fantastic library implementation of array operations with a tiny bit of syntactic sugar:
> instead of generating an error message for array operations, look for an opXXX function, and treat it the same way the implicit array properties work.

How does that result in the above code for dot product?
Looks to me like if all you have is calling opXXX functions the above dot would still become two loops with a tmp for the a+b.

--bb

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Sean Kelly
in reply to Bill Baxter

Sean Kelly

Posted in reply to Bill Baxter

Bill Baxter wrote:
> Don Clugston wrote:
>> Array operations are one of the very few 2.0 features that haven't made it into 1.0. My experiments into expression templates suggest that it might not be very complicated after all.
>>
>> Consider something like a dot product,
>>
>> real a[], b[], c[];
>> real d = dot(a+b, c);
>>
>> For performance, it would be better if the compiler actually *didn't* evaluate a+b. Instead, that should compile to:
>>
>> real sum = 0;
>> for (int i=0; i<c.length; ++i) {
>>      sum+=(a[i]+b[i])*c[i];
>> }
>> d=sum;
>>
>> In fact, we could create a fantastic library implementation of array operations with a tiny bit of syntactic sugar:
>> instead of generating an error message for array operations, look for an opXXX function, and treat it the same way the implicit array properties work.
> 
> How does that result in the above code for dot product?
> Looks to me like if all you have is calling opXXX functions the above dot would still become two loops with a tmp for the a+b.

Not necessarily.  Consider:

    struct AddExp(T, U)
    {
        T lhs;
        U rhs;

        static AddExp!(T,U) opCall( T lhs, U rhs )
        in
        {
            assert( lhs.length == rhs.length );
            assert( is( typeof(T[0]) == typeof(U[0]) ) );
        }
        body
        {
            AddExp!(T,U) exp;
            exp.lhs = lhs;
            exp.rhs = rhs;
            return exp;
        }

        typeof(T[0]) opIndex( size_t i )
        {
            return lhs[i] + rhs[i];
        }

        typeof(T[0])[] opSlice( size_t b, size_t e )
        {
            typeof(T[0])[] sum = new typeof(T[0])[lhs.length];

            for( size_t i = 0; i < lhs.length; ++i )
            {
                sum[i] = lhs[i] + rhs[i];
            }
            return sum;
        }

        size_t length()
        {
            return lhs.length;
        }
    }


    AddExp!(T,U) opAdd(T, U)( T lhs, U rhs )
    {
        return AddExp!(T,U)( lhs, rhs );
    }


    T[] opAssign(T, U)( inout T[] lhs, U rhs )
    in
    {
        assert( is( typeof(T[0]) == typeof(U[0]) ) );
    }
    body
    {
       lhs = rhs[0 .. $];
       return lhs;
    }


    real dotp(T,U)( T lhs, U rhs )
    {
        real sum = 0.0;
        assert( lhs.length == rhs.length );
        for( size_t i = 0; i < lhs.length; ++i )
        {
            sum += lhs[i] * rhs[i];
        }
        return sum;
    }


    void main()
    {
        real[] a, b, c;
        a ~= 1.0; a ~= 2.0;
        b ~= 3.0; b ~= 4.0;
        c ~= 5.0; c ~= 6.0;

        // test opSlice
        auto   s = opAdd( a, b );
        real[] r = s[0 .. s.length];

        for( size_t i = 0; i < r.length; ++i )
        {
            printf( "%Lf = %Lf\n", r[i], s[i] );
        }

        // compound expression
        real res = dotp( opAdd( a, b ), c );

        printf( "\n%Lf\n", res );
    }

prints:

    4.000000 = 4.000000
    6.000000 = 6.000000

    56.000000

Thanks to the wonder of templates, a properly designed expression struct could be evaluated as-is rather than needing to be converted to an array before being passed to dotp().  If we were able to define free-floating operators as per Don's suggestion, the above would be darn slick :-)

By the way, I originally tried slicing using the standard syntax "s[0 .. $]" and got this error:

    test.d(84): Error: undefined identifier __dollar

I imagine this is a known issue?


Sean

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Bill Baxter
in reply to Sean Kelly

Bill Baxter

Posted in reply to Sean Kelly

Sean Kelly wrote:
> Bill Baxter wrote:
> 
>> Don Clugston wrote:
>>
>>> Array operations are one of the very few 2.0 features that haven't made it into 1.0. My experiments into expression templates suggest that it might not be very complicated after all.
>>>
>>> Consider something like a dot product,
>>>
>>> real a[], b[], c[];
>>> real d = dot(a+b, c);
>>>
>>> For performance, it would be better if the compiler actually *didn't* evaluate a+b. Instead, that should compile to:
>>>
>>> real sum = 0;
>>> for (int i=0; i<c.length; ++i) {
>>>      sum+=(a[i]+b[i])*c[i];
>>> }
>>> d=sum;
>>>
>>> In fact, we could create a fantastic library implementation of array operations with a tiny bit of syntactic sugar:
>>> instead of generating an error message for array operations, look for an opXXX function, and treat it the same way the implicit array properties work.
>>
>>
>> How does that result in the above code for dot product?
>> Looks to me like if all you have is calling opXXX functions the above dot would still become two loops with a tmp for the a+b.
> 
> 
> Not necessarily.  Consider:
> 
>     struct AddExp(T, U)
>     {
>         T lhs;
>         U rhs;
> 
>         static AddExp!(T,U) opCall( T lhs, U rhs )
>         in
>         {
>             assert( lhs.length == rhs.length );
>             assert( is( typeof(T[0]) == typeof(U[0]) ) );
>         }
>         body
>         {
>             AddExp!(T,U) exp;
>             exp.lhs = lhs;
>             exp.rhs = rhs;
>             return exp;
>         }
> 
>         typeof(T[0]) opIndex( size_t i )
>         {
>             return lhs[i] + rhs[i];
>         }
> 
>         typeof(T[0])[] opSlice( size_t b, size_t e )
>         {
>             typeof(T[0])[] sum = new typeof(T[0])[lhs.length];
> 
>             for( size_t i = 0; i < lhs.length; ++i )
>             {
>                 sum[i] = lhs[i] + rhs[i];
>             }
>             return sum;
>         }
> 
>         size_t length()
>         {
>             return lhs.length;
>         }
>     }
> 
> 
>     AddExp!(T,U) opAdd(T, U)( T lhs, U rhs )
>     {
>         return AddExp!(T,U)( lhs, rhs );
>     }
> 
> 
>     T[] opAssign(T, U)( inout T[] lhs, U rhs )
>     in
>     {
>         assert( is( typeof(T[0]) == typeof(U[0]) ) );
>     }
>     body
>     {
>        lhs = rhs[0 .. $];
>        return lhs;
>     }
> 
> 
>     real dotp(T,U)( T lhs, U rhs )
>     {
>         real sum = 0.0;
>         assert( lhs.length == rhs.length );
>         for( size_t i = 0; i < lhs.length; ++i )
>         {
>             sum += lhs[i] * rhs[i];
>         }
>         return sum;
>     }
> 
> 
>     void main()
>     {
>         real[] a, b, c;
>         a ~= 1.0; a ~= 2.0;
>         b ~= 3.0; b ~= 4.0;
>         c ~= 5.0; c ~= 6.0;
> 
>         // test opSlice
>         auto   s = opAdd( a, b );
>         real[] r = s[0 .. s.length];
> 
>         for( size_t i = 0; i < r.length; ++i )
>         {
>             printf( "%Lf = %Lf\n", r[i], s[i] );
>         }
> 
>         // compound expression
>         real res = dotp( opAdd( a, b ), c );
> 
>         printf( "\n%Lf\n", res );
>     }
> 
> prints:
> 
>     4.000000 = 4.000000
>     6.000000 = 6.000000
> 
>     56.000000
> 
> Thanks to the wonder of templates, a properly designed expression struct could be evaluated as-is rather than needing to be converted to an array before being passed to dotp().  If we were able to define free-floating operators as per Don's suggestion, the above would be darn slick :-)
> 
> By the way, I originally tried slicing using the standard syntax "s[0 .. $]" and got this error:
> 
>     test.d(84): Error: undefined identifier __dollar
> 
> I imagine this is a known issue?
> 
> 
> Sean

I put it on my D Rants page[1], but I don't think I ever filed it as a bug.

[1] http://www.prowiki.org/wiki4d/wiki.cgi?BillBaxter

--bb

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Kirk McDonald
in reply to Bill Baxter

Kirk McDonald

Posted in reply to Bill Baxter

Bill Baxter wrote:
> I put it on my D Rants page[1], but I don't think I ever filed it as a bug.
> 
> [1] http://www.prowiki.org/wiki4d/wiki.cgi?BillBaxter
> 
> --bb

On that page:

-----
It should be possible to set the init property for a typedef'ed or other user defined type. E.g.

   typedef int myint;
   myint.init = -1;
-----

You can do this:

import std.stdio;
typedef int INT = 20;
void main() {
    INT i;
    writefln(i); // prints 20
}

-- 
Kirk McDonald
Pyd: Wrapping Python with D
http://pyd.dsource.org

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Knud Sørensen
in reply to Don Clugston

Knud Sørensen

Posted in reply to Don Clugston

I expect for the array operation in 2.0 is something like vectorization. see http://www.all-technology.com/eigenpolls/dwishlist/index.php?it=10

Which allows you (in one line) to make multidimensional calculations on
an arbitrary number of multidimensional arrays,
but it still allows you to read exactly which indexed is used
and the exact calculation performed.

The syntax don't generate temporary arrays and allow the compiler to freely reorder nested loops.

Also the compiler will be free to support for SSE?
and GPUs in the future without your having to rewriting the
vectorization code.

I don't expect to see this in 1.0 I am just pointing out
that the array operation you suggest here seems fare from
my expectations for 2.0.

Knud

On Fri, 15 Dec 2006 10:40:16 +0100, Don Clugston wrote:

> Array operations are one of the very few 2.0 features that haven't made it into 1.0. My experiments into expression templates suggest that it might not be very complicated after all.
> 
> Consider something like a dot product,
> 
> real a[], b[], c[];
> real d = dot(a+b, c);
> 
> For performance, it would be better if the compiler actually *didn't* evaluate a+b. Instead, that should compile to:
> 
> real sum = 0;
> for (int i=0; i<c.length; ++i) {
>       sum+=(a[i]+b[i])*c[i];
> }
> d=sum;
> 
> In fact, we could create a fantastic library implementation of array
> operations with a tiny bit of syntactic sugar:
> instead of generating an error message for array operations, look for an
> opXXX function, and treat it the same way the implicit array properties
> work.
> ----------
> void opMulAssign(real [] a, real w)
> {
>      for(int i=0; i<a.length; ++i) { a[i]*=w; }
> }
> 
> void main()
> {
>      real b[] = [1.0L, 2, 3, 4];
>      b.opMulAssign(3.5); // OK.
>      b *= 3.5; // Fails with "Error: 'b' is not a scalar, it is a real[]"
> }
> ----------
> So the message to Walter is, support for array operations will not require any change to the back end. They are just a bit of syntactic sugar at the front end.

December 15, 2006

Re: Array operations -- not actually so difficult

Posted by Bill Baxter
in reply to Kirk McDonald

Bill Baxter

Posted in reply to Kirk McDonald

Kirk McDonald wrote:
> Bill Baxter wrote:
> 
>> I put it on my D Rants page[1], but I don't think I ever filed it as a bug.
>>
>> [1] http://www.prowiki.org/wiki4d/wiki.cgi?BillBaxter
>>
>> --bb
> 
> 
> On that page:
> 
> -----
> It should be possible to set the init property for a typedef'ed or other user defined type. E.g.
> 
>    typedef int myint;
>    myint.init = -1;
> -----
> 
> You can do this:
> 
> import std.stdio;
> typedef int INT = 20;
> void main() {
>     INT i;
>     writefln(i); // prints 20
> }

That's interesting.  Didn't know that.  Seems not as intuitive as setting .init directly, but now that I think about it, I guess there are issues with allowing the init value to be set anywhere other that at the point of type definition.

--bb

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation