April 08, 2012
On Sun, Apr 08, 2012 at 12:59:08PM -0500, Caligo wrote:
> On Tue, Apr 3, 2012 at 6:20 AM, Cristi Cobzarenco <cristi.cobzarenco@gmail.com> wrote:
[...]
> > Also I'm not sure how a case like this will be compiled, it may or may not allocate a temporary:
> >
> > a[] = b[] * c[] + d[] * 2.0;
> >
> > The expression templates in SciD mean there will be no temporary allocation in this call.
> >
> 
> Why are expression templates used?  Are you pretty much rewriting Eigen in D?  I don't see why expression templates would be needed with D.

Expression templates are necessary to eliminate temporaries and optimize loops. The reason libraries like BLAS are so efficient is that they directly implement compound operations like A = k*B + C. Without expression templates, a D implementation would create a temporary for k*B, another for k*B + C, and then assign the result to A, whereas expression templates let you compute the elements of A directly in a single loop over the matrix elements.


T

-- 
Change is inevitable, except from a vending machine.
April 08, 2012
On 8 April 2012 18:59, Caligo <iteronvexor@gmail.com> wrote:

> On Tue, Apr 3, 2012 at 6:20 AM, Cristi Cobzarenco <cristi.cobzarenco@gmail.com> wrote:
> >
> > The point of these is to have light-weight element-wise operation
> > support. It's true that in theory the built-in arrays do this. However,
> > this library is built on top of BLAS/LAPACK, which means operations on
> > large matrices will be faster than on D arrays.
> >
>
> I can't agree with building it on top of LAPACK or any other BLAS implementation, but perhaps I shouldn't complain because I'm not the one who's going to be implementing it.  I like the approach Eigen has taken where it offers its own BLAS implementation and, iirc, other BLAS libraries can be used as optional back-ends.
>

Yes, I agree with you. I already built naive implementations of the BLAS & LAPACK functions as part of last year's project, so using external libraries is optional (building with version=nodeps ensures absolutely no dependencies are needed). The argument still stands, though: if you _do_ use external BLAS libraries, the library will _still_ be faster than plain D value arrays.


>
> > Also, as far as I know, D doesn't support allocating dynamic 2-D arrays
> > (as in not arrays of arrays), not to mention 2-D slicing (keeping track
> > of the leading dimension).
> >
>
> I fail to see why there is any need for 2D arrays.  We need to make sure multidimensional arrays (matrices) have data in very good arrangements.  This is called tiling, and it requires 1D arrays: 2D arrays are stored as 1D arrays together with an indexing mechanism.
>

That is precisely what I meant. We need wrappers around D arrays because, by themselves, they do not support 2-D indexing. By providing a wrapper we allow the nice syntax matrix[ a, b ]. This kind of wrapping is already done in SciD by CowMatrix. I meant we shouldn't use D built-in arrays directly, not that we shouldn't use them at all. Also, as mentioned before, we can't use new for allocation, because we want the library to be GC-independent.
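For illustration, a wrapper of this kind boils down to a pointer plus dimensions plus a leading dimension. The following is a minimal C++ sketch (hypothetical names, not SciD's actual CowMatrix layout), using column-major storage as BLAS/LAPACK expect:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D view over a flat column-major buffer. Tracking the
// leading dimension (ld) lets sub-matrix slices share the same storage.
struct MatrixView {
    double* data;
    std::size_t rows, cols, ld; // ld >= rows for column-major layout

    double& operator()(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[j * ld + i]; // element (i, j): column j, row i
    }

    // A slice is just another view into the same buffer, with the
    // original leading dimension preserved.
    MatrixView slice(std::size_t r0, std::size_t c0,
                     std::size_t nr, std::size_t nc) {
        assert(r0 + nr <= rows && c0 + nc <= cols);
        return MatrixView{data + c0 * ld + r0, nr, nc, ld};
    }
};
```

Slicing allocates nothing: the sub-matrix view aliases the parent's memory, which is exactly the "keeping track of leading dimension" bookkeeping mentioned above.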

>
> > Also I'm not sure how a case like this will be compiled, it may or may
> > not allocate a temporary:
> >
> > a[] = b[] * c[] + d[] * 2.0;
> >
> > The expression templates in SciD mean there will be no temporary
> > allocation in this call.
> >
>
> Why are expression templates used?
>

As H. S. Teoh rightfully pointed out, it is important not to allocate temporaries in matrix operations. You want to evaluate A = B * 3.0 + D * 2.0 (element-wise throughout) as (in BLAS terms):
  copy( B, A );
  scal( 3.0, A );
  axpy( D, 2.0, A );

Or with naive expression template evaluation:
  for( i = 0 ; i < n ; ++ i ) {
     A[ i ] = B[ i ] * 3.0 + D[ i ] * 2.0;
  }

The D compiler would instead evaluate this as:
  // allocate temporaries
  allocate( tmp1, A.length );
  allocate( tmp2, A.length );
  allocate( tmp3, A.length );

  // compute tmp1 = B * 3.0
  copy( B, tmp1 );
  scal( 3.0, tmp1 );

  // compute tmp2 = D * 2.0;
  copy( D, tmp2 );
  scal( 2.0, tmp2 );

  // compute tmp3 = tmp1 + tmp2;
  copy( tmp1, tmp3 );
  axpy( tmp2, 1.0, tmp3 );

  // copy tmp3 into A
  copy( tmp3, A );

Plenty of useless computation and allocation. Note this is not a fault of the compiler: it has no way of knowing which temporaries can be optimised away for user-defined types (i.e. our matrices).
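The single-pass, no-temporaries evaluation described above can be sketched with a toy expression template. This is C++ for brevity and is illustrative only, not SciD's actual machinery: operators build lightweight expression nodes, and only assignment runs a loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base marks a type as an expression, so the operators below only
// apply to our types and never hijack built-in arithmetic.
template <class D> struct Expr {
    const D& self() const { return static_cast<const D&>(*this); }
};

template <class E>
struct Scale : Expr<Scale<E>> {           // node for  e * k
    const E& e; double k;
    Scale(const E& e, double k) : e(e), k(k) {}
    double operator[](std::size_t i) const { return e[i] * k; }
};

template <class L, class R>
struct Add : Expr<Add<L, R>> {            // node for  l + r
    const L& l; const R& r;
    Add(const L& l, const R& r) : l(l), r(r) {}
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

struct Vec : Expr<Vec> {
    std::vector<double> v;
    explicit Vec(std::size_t n) : v(n) {}
    double operator[](std::size_t i) const { return v[i]; }
    double& operator[](std::size_t i) { return v[i]; }

    // Assigning from any expression evaluates element-wise in ONE pass;
    // no intermediate vectors are ever allocated.
    template <class E>
    Vec& operator=(const Expr<E>& ex) {
        for (std::size_t i = 0; i < v.size(); ++i) v[i] = ex.self()[i];
        return *this;
    }
};

template <class E>
Scale<E> operator*(const Expr<E>& e, double k) { return {e.self(), k}; }

template <class L, class R>
Add<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return {l.self(), r.self()};
}
```

With this, a = b * 3.0 + d * 2.0 builds an Add<Scale<Vec>, Scale<Vec>> node on the stack and the assignment collapses it into the single loop shown above.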

> Are you pretty much rewriting Eigen in D?

No. It is just the interface - and even that only to a certain extent - that will mimic Eigen. The core of the library would be very similar to what I implemented for SciD last year, which, you will find, is very D-like and not at all like Eigen.


April 10, 2012
Hi, Cristi,
From the change log of D 2.059 I saw that uniform function call syntax (UFCS) was implemented. I hope you can leverage this feature to make non-member function calls look nicer. Another suggestion: please use shorter function names, for example M.t() instead of M.transpose(), so that long expressions are easy to read.

Best,
Mo

April 10, 2012
Thanks for the suggestions!

I don't think UFCS would help us. Our problem is that we can't do this:
triangular.d:
  struct TriangularMatrix {

  }

  void sum( T )( T x ) if( is( T : TriangularMatrix ) ) {

  }

diagonal.d:
  struct DiagonalMatrix {

  }

  void sum( T )( T x ) if( is( T : DiagonalMatrix ) ) {

  }

main.d:
import diagonal;
import triangular;

void bar() {
   TriangularMatrix a;
   Diagonal b;
   sum( a );  // this does not compile because sum() is ambiguous
   sum( b );  // nor does this
}

This, AFAIK, is deliberate, to avoid name hijacking - ADL in C++ had its share of criticism. I doubt we will ever get this behaviour in D, and that is perhaps a good thing. I may have misunderstood UFCS, though - or what you meant by making non-member function calls look nicer - please correct me if that's the case.

Don't worry about long names: t() is already the way transposition is defined in SciD. Moreover, it's a property, so you can actually do "a.t * a" - convenience galore. I'm also considering creating a submodule like std.linalg.short which defines aliases with short names for types and free functions - this would allow particularly numerics-heavy functions to be written more compactly. I'm not entirely sure it would be a good idea, though, as it may sacrifice readability where it's most needed.

---
Cristi Cobzarenco
BSc in Artificial Intelligence and Computer Science
University of Edinburgh
Profile: http://www.google.com/profiles/cristi.cobzarenco





April 10, 2012
On 04/10/2012 03:24 AM, Cristi Cobzarenco wrote:
> Thanks for the suggestions!
>
> I don't think UFCS would help us. Our problem is that we can't do this:
> triangular.d:
>    struct TriangularMatrix {
>
>    }
>
>    void sum( T )( T x ) if( is( T : TriangularMatrix ) ) {
>
>    }
>
> diagonal.d:
>    struct DiagonalMatrix {
>
>    }
>
>    void sum( T )( T x ) if( is( T : DiagonalMatrix ) ) {
>    }
>
> main.d:
> import diagonal;
> import triangular;
>
> void bar() {
>     TriangularMatrix a;
>     Diagonal b;
>     sum( a );  // this does not compile because sum() is ambiguous
>     sum( b );  // nor does this
> }

There are no ambiguities in that example, and if ambiguities occur, they can always be fixed manually.

>
> This, AFAIK, is deliberate to avoid name hijacking - ADL in C++ had its
> share of criticism. I doubt we will ever get this behaviour in D and
> that is perhaps a good thing. I may have misunderstood UFCS though - or
> what you meant by making non-member function calls look nicer - please
> correct me if that's the case.
>
> Don't worry about long names, t() is already the way transposition is
> defined SciD. Moreover, it's a property so you can actually do "a.t * a"
> - convenience galore. I'm also considering of creating a submodule like
> std.linalg.short which defines aliases with short names for types and
> free functions - this will allow particularly numerics-heavy functions
> to be written more compactly. I'm not entirely sure it would be a good
> idea though as it may sacrifice readability where it's most needed.
>
> ---
> Cristi Cobzarenco
> BSc in Artificial Intelligence and Computer Science
> University of Edinburgh
> Profile: http://www.google.com/profiles/cristi.cobzarenco
>

If you change 'Diagonal' to 'DiagonalMatrix', this compiles fine.

April 10, 2012
Timon is, of course, right. I got a bit confused when trying to simplify in a hurry. What I meant was actually something like this:

ops.d:
  import std.stdio;

  int sum( T )( T mat ) {
     writeln( "ops.sum" );
     // return reduce!"a+b"( 0, mat );
     return 1;
  }

  int numelems( T )( T mat ) {
     // return mat.rows * mat.columns;
     return 1;
  }

  int mean( T )( T mat ) {
     return sum( mat ) / numelems( mat );
  }

diagonal.d:
  import ops;
  import std.stdio;

  struct DiagonalMatrix {
  }

  int sum( T : DiagonalMatrix )( T mat ) {
     writeln( "diagonal.sum" );
     // return reduce!"a+b"( 0, mat.diagonal() );
     return 1;
  }

main.d:
  import ops;
  import diagonal;

  void main() {
     DiagonalMatrix mat;
     sum( mat );  // this will not compile: sum() is ambiguous
     mean( mat ); // even with the line above removed, this uses
                  // ops.sum, not diagonal.sum
  }

The first problem could, in theory, be fixed using a member flag, something like enum specializedSum = true;, with ops.sum redefined as int sum( T )( T mat ) if( !T.specializedSum ) {}. But in that case mean() would still try to use ops.sum, which would fail. All of this can be easily fixed by providing member functions and having the free functions call the member ones when they exist.

---
Cristi Cobzarenco
BSc in Artificial Intelligence and Computer Science
University of Edinburgh
Profile: http://www.google.com/profiles/cristi.cobzarenco




April 23, 2012
Unfortunately, my proposal was not picked up this year. I might try to work on these ideas this summer anyway, so I would still be very much interested in ideas and feedback, but I will probably have much less time if I'm working somewhere else.

---
Cristi Cobzarenco
BSc in Artificial Intelligence and Computer Science
University of Edinburgh
Profile: http://www.google.com/profiles/cristi.cobzarenco




