May 24, 2004
High performance linear algebra?
Hi,

I'm wondering about a few things related to this:

1. Does the operator overload mechanism in D disallow writing high 
performance linear algebra classes (Point, Vector, Matrix)?

2. How inlinable (i.e. with DMD) would such classes be?

3. What about the fact that operators always return copies?

4. What about temporaries?  Are they magically optimized away?  :)

5. The current trend in C++ is to use template expressions and template 
metaprogramming to make something like this:

	Vector a, b, c;
	c = a + b * 0.5f;

compile into something like this:

	c[0] = a[0] + b[0] * 0.5f
	c[1] = a[1] + b[1] * 0.5f
	c[2] = a[2] + b[2] * 0.5f

Note the lack of temporaries here.  Is this a possibility in D?  How far 
does DMD go in this direction?

6. The current trend in C++ is also Boost, which is fugly.  Seems like 
lots of people want this included in C++ (a mistake IMO).  Any plans on 
including similar math functionality into D from the beginning - rather 
than as an ugly afterthought?  This could make D a real option in the 
scientific (and game!) programming domains.

--Steve
May 24, 2004
Re: High performance linear algebra?
This is one of my major concerns as well. It seems there is quite some way
to go in that respect, and I'm still trying to understand what would have
to be done to get what is needed. As I see it, there are several issues:

1) Native multi-dimensional arrays. While not strictly necessary for
implementing a linear algebra library, they would certainly help and are
necessary for numerics in general.

2) Array expressions. They are in the specification but not yet implemented.
I have the feeling that the current specification is nowhere near flexible
enough. There are elementwise operations on arrays, but I see no way to
extend these in a library. Rather than pursuing this limited solution, I
would try to find a truly extensible concept for vector operations. Basic
vector operations should be handled by the compiler, but the user should be
able to extend them. (A rough sketch of the spec's syntax follows below.)

3) Expression templates: These would be a way to do all the high-performance
vector operations in the library (avoiding temporary copies of arrays), as
is done in Blitz++, Pooma, Boost and other high-performance C++ libraries.
This solution is clearly inferior to true vectorizing capability in the
language, since the compiler cannot optimize it for a given machine. Even
so, it should still be possible in D if the language wants to stand up
against C++. Currently, expression templates do not seem to be possible, as
two important features are missing:

* implicit instantiation of function templates
* overloadable assignment expression ("OpAssign")

The first one has been discussed over and over again. Walter wants to avoid
the complexity of it. The second one has not come up on this list lately
but has probably been discussed as well. In both cases, we'll probably have
to wait until someone either manages to convince Walter or finds an
alternative solution.
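
Coming back to point (2): here is roughly what the elementwise array
expressions from the current spec would look like. This is just a sketch
with made-up variable names - none of it is implemented in DMD yet, and
there is no hook to extend it with user-defined operations:

----------------
// Sketch only: the elementwise syntax described in the array-operations
// part of the spec (not implemented in DMD at the time of writing).
float[] a = new float[3];
float[] b = new float[3];
float[] c = new float[3];
a[] = 1.0f;                 // fill every element
b[] = 2.0f;
c[] = a[] + b[] * 0.5f;     // elementwise, no temporaries - but not extensible
----------------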

Answering to your specific questions:

> 1. Does the operator overload mechanism in D disallow writing high
> performance linear algebra classes (Point, Vector, Matrix)?

No problem there. The language spec contains a minor flaw concerning
non-commuting products, but it should be easy to repair.

> 2. How inlinable (i.e. with DMD) would such classes be?

Again, no problem. Inlining works just like in C++.

> 3. What about the fact that operators always return copies?

You would return a struct which is handled as a value. If you want the
actual data handled by reference (say, if you have a large array), that
struct can simply encapsulate a pointer.
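
A minimal sketch of what I mean (hypothetical names, just to illustrate the
idea of a value struct wrapping a reference to its data):

----------------
// The struct itself is copied by value, but its data lives behind a
// dynamic-array slice, so copying the struct does not copy the elements.
struct Vec
{
    float[] data;              // slice = pointer + length

    Vec opAdd(Vec rhs)
    {
        Vec r;
        r.data = new float[data.length];
        for (size_t i = 0; i < data.length; i++)
            r.data[i] = data[i] + rhs.data[i];
        return r;              // returned by value; only the slice is copied
    }
}
----------------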

> 4. What about temporaries?  Are they magically optimized away?  :)
> 
> 5. The current trend in C++ is to use template expressions and template
> metaprogramming to make something like this:
> 
> Vector a, b, c;
> c = a + b * 0.5f;
> 
> compile into something like this:
> 
> c[0] = a[0] + b[0] * 0.5f
> c[1] = a[1] + b[1] * 0.5f
> c[2] = a[2] + b[2] * 0.5f
> 
> Note the lack of temporaries here.  Is this a possibility in D?  How far
> does DMD go in this direction?

These two questions go together and point to the core problem: expression
templates in C++ offer a tricky way to avoid these costly temporary copies.
In D, expression templates are not possible yet, and array expressions are
not extensible (i.e. they cannot be passed to or returned from functions
without creating temporaries).

> 6. The current trend in C++ is also Boost, which is fugly.  Seems like
> lots of people want this included in C++ (a mistake IMO).  Any plans on
> including similar math functionality into D from the beginning - rather
> than as an ugly afterthought?  This could make D a real option in the
> scientific (and game!) programming domains.

Boost is a huge collection of very different modules for very different
purposes. What they have in common is that they use template
meta-programming, which is a rather new programming technique in C++.
Compilers are only starting to support this technique, and Boost is still
taking them to the limit. Using it in C++ is rather awkward, since many of
the language features were not designed with template meta-programming in
mind.

In D, templates were completely redesigned, drawing on much of the
experience gathered with C++. Meta-programming should be possible in a much
cleaner way here, and work will probably continue in that direction.
Mixins, for example - one of the newest language enhancements - seem to be
far more powerful than expected.

Taking all this together: whatever is happening in Boost for C++ will
probably happen directly in the standard library for D. It will take some
time until the language is understood well enough to see how the standard
library should be designed, but it will certainly exploit the possibilities
of D better than the current C++ standard library does for C++.
May 25, 2004
Re: High performance linear algebra?
In article <c8ti50$anc$1@digitaldaemon.com>, Stephen Waits says...

>1. Does the operator overload mechanism in D disallow writing high 
>performance linear algebra classes (Point, Vector, Matrix)?

I'd say so, yes. I'm doing something similar with multiprecision arithmetic.
Basically you have two choices - use a struct or use a class. I chose to use a
class. Pros and cons are:

struct - PRO: copied by value, so a = b; followed by ++b; will not increment a.
         CON: copied by value, therefore slow.
         CON: cannot have constructors or destructors.

class  - PRO: copied by reference, therefore fast.
         CON: copied by reference, so a = b; followed by ++b; will increment a.

My solution was to use a class, but to disallow the ASSIGNING operator
overloads. So, I allow b = b + 1, but not ++b. This uses more temporaries,
but I don't believe that necessarily makes things slower - this is D, after
all. It has fast allocators and a garbage collector.

An interesting alternative option I had considered was to use a struct
CONTAINING a reference. That way I could implement copy-by-value semantics
with copy-by-reference efficiency. (This takes some thought, but it's
doable.) But I wanted constructors and destructors, so I didn't.



>2. How inlinable (i.e. with DMD) would such classes be?

D doesn't have the inline keyword, so just assume the compiler will make the
most efficient choice.


>3. What about the fact that operators always return copies?

They don't. "return this;" is a perfectly good statement. In my own library, if
you try to do a = a + 0; or b = b * 1; you'll get back the same reference you
fed in.
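
Pulling those two points together, a stripped-down sketch (a hypothetical
Int class, not my actual library code):

----------------
// A class with only non-assigning operator overloads; it hands back the
// receiver itself when the operation cannot change the value.
class Int
{
    int value;                       // stand-in for the real payload
    this(int v) { value = v; }

    Int opAdd(int rhs)
    {
        if (rhs == 0)
            return this;             // a = a + 0 returns the same reference
        return new Int(value + rhs); // b = b + 1 allocates a fresh object
    }
    // deliberately no opAddAssign (or opPostInc), so ++b will not compile
}
----------------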



>4. What about temporaries?  Are they magically optimized away?  :)

I think they are magically left lying around on the heap until the garbage
collector comes along.


>5. The current trend in C++ is to use template expressions and template 
>metaprogramming to make something like this:
>
>	Vector a, b, c;
>	c = a + b * 0.5f;
>
>compile into something like this:
>
>	c[0] = a[0] + b[0] * 0.5f
>	c[1] = a[1] + b[1] * 0.5f
>	c[2] = a[2] + b[2] * 0.5f
>
>Note the lack of temporaries here.  Is this a possibility in D?  How far 
>does DMD go in this direction?

As noted above, D will magically inline stuff if it's sufficiently
efficient to do so, and not otherwise. So you MIGHT get that, or you might
get a function call. But if the overhead of a function call is
insignificant compared with the overhead of +, then who cares?


Arcane Jill
May 25, 2004
Re: High performance linear algebra?
In article <c8trcb$q3q$1@digitaldaemon.com>, Norbert Nemec says...

>* implicit instantiation of function templates
>* overloadable assignment expression ("OpAssign")
>
>The first one has been discussed over and over again. Walter wants to avoid
>the complexity of it. The second one has not come up on this list lately

opAssign?

Well, I recently suggested the := operator (colon equals), intended to implement
copy-by-value, and defined as:

a := b;
If a is a class, equivalent to a = new A(b), where A is the type of a.
If a is a struct or primitive type, equivalent to a = b;

In the case of a struct, though, it would actually make sense for opAssign
to overload either = or := (not sure which one). Kinda pointless with
classes, though, since that's what constructors are for. All we need is a
better syntax for calling them.

Come to think of it - it would make MORE sense to allow structs to have
constructors (but not destructors). That way, my proposed := could have the same
meaning for structs and classes, and opAssign would be completely unnecessary.

Arcane Jill
May 25, 2004
Re: High performance linear algebra?
Arcane Jill wrote:

> In article <c8trcb$q3q$1@digitaldaemon.com>, Norbert Nemec says...
> 
>>* implicit instantiation of function templates
>>* overloadable assignment expression ("OpAssign")
>>
>>The first one has been discussed over and over again. Walter wants to
>>avoid the complexity of it. The second one has not come up on this list
>>lately
> 
> opAssign?

Doesn't exist yet. My idea would be to have it only for structs.
High-performance numerics would probably not make use of classes at all.

My concept of this is not finished yet, so I won't go into details at this
point.

> Well, I recently suggested the := operator (colon equals), intended to
> implement copy-by-value, and defined as:
> 
> a := b;
> If a is a class, equivalent to a = new A(b), where A is the type of a.
> If a is a struct or primitive type, equivalent to a = b;

I'm not sure := would be a good idea. As you say, it would mostly be
syntactic sugar for "= new ...()", which is not too hard to type considering
how often it is needed. I think the danger of confusing "=" and ":=" would
outweigh the gain of saving a little typing.

> In the case of a struct, though, it would actually make sense for opAssign
> to overload either = or := (not sure which one). Kinda pointless with
> classes though, since that's what constructors are for. All we need is a
> better syntax for calling them.

Exactly my idea: = on a struct would default to elementwise copying unless
there is an opAssign defined in the struct.

> Come to think of it - it would make MORE sense to allow structs to have
> constructors (but not destructors).

True, that might even be the better alternative.

But still, I'm not convinced := would be a good idea...

Ciao,
Nobbi
May 25, 2004
Re: High performance linear algebra?
Arcane Jill wrote:

> In article <c8ti50$anc$1@digitaldaemon.com>, Stephen Waits says...
> 
>>1. Does the operator overload mechanism in D disallow writing high
>>performance linear algebra classes (Point, Vector, Matrix)?
> 
> I'd say so, yes. I'm doing something similar with multiprecision
> arithmetic. Basically you have two choices - use a struct or use a class.
> I chose to use a class. Pros and cons are:
> 
> struct - PRO: copied by value, so a = b; followed by ++b; will not
> increment a. CON: copied by value, therefore slow. CON: cannot have
> constructors or destructors
> 
> class - PRO: copied by reference, therefore fast. CON: copied by
> reference, so a = b; followed by ++b will increment a.

The speed really depends on the size of the data and on how it is used.
When inlining functions, much of the copying of data can be optimized away,
while the dereferencing will stay. Therefore structs will often be faster
than classes, even if the data is a little larger than just one pointer.
(Not too large, of course.)
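
For a small fixed-size vector, for instance, the struct can simply hold its
data directly (a rough sketch with made-up names):

----------------
// A small value type: with inlining, the copies can largely be optimized
// away, and nothing ever touches the heap.
struct Vec3
{
    float x, y, z;

    Vec3 opAdd(Vec3 rhs)
    {
        Vec3 r;
        r.x = x + rhs.x;
        r.y = y + rhs.y;
        r.z = z + rhs.z;
        return r;
    }
}
----------------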

>>4. What about temporaries?  Are they magically optimized away?  :)
> 
> I think they are magically left lying around on the heap until the garbage
> collector comes along.
> 
> 
>>5. The current trend in C++ is to use template expressions and template
>>metaprogramming to make something like this:
>>
>>Vector a, b, c;
>>c = a + b * 0.5f;
>>
>>compile into something like this:
>>
>>c[0] = a[0] + b[0] * 0.5f
>>c[1] = a[1] + b[1] * 0.5f
>>c[2] = a[2] + b[2] * 0.5f
>>
>>Note the lack of temporaries here.  Is this a possibility in D?  How far
>>does DMD go in this direction?
> 
> As noted above, D will magically inline stuff if it's sufficiently
> efficient to do so, not otherwise. So you MIGHT get that, or you might get
> a function call. But if the overhead of a function call is insignificant
> compared with the overhead of + then who cares?

Judging by these comments, I am not sure whether you, AJ, are aware of the
issues of "high performance linear algebra". Creating objects on the heap
is fairly efficient, but for hard-core number-crunching numerics it would
be a killer. Temporaries like those mentioned by Stephen cannot simply be
optimized away by the compiler; they are a fundamental problem unless you
play some really serious tricks. You may want to have a look at
oonumerics.org; a good starting point would be the documentation of
Blitz++:
       http://www.oonumerics.org/blitz/
There are a few papers there that really describe the issues of combining
the expressiveness of object-oriented programming with the need for utmost
performance.
May 25, 2004
Re: High performance linear algebra?
Norbert Nemec wrote:
> 
> be a killer. Temporaries like those mentioned by Stephen cannot simply be
> optimized away by a compiler, but are a fundamental problem unless you play
> some really serious tricks. Maybe you want to have a read at

This is very true.

Thanks guys for the responses.  A few more comments.

Ideally, D would allow us to easily take advantage of vectorized 
instructions.  SIMD is your friend here, whether it be AltiVec, SSE, 
MMX, 3DNow, or whatever.

This has been mostly achieved in C++ through the various template 
molestations.  But it's ugly.  REALLY UGLY.

I plan on doing some experiments in D in the near future to determine 
the feasibility of this.  At first glance, I suspect the performance 
won't be there.  In this day and age, nobody is willing to give up their 
pretty algebraic interfaces to linear algebra libraries: VA = VB + VC 
instead of VecAdd(VA, VB, VC).  We want the nice algebraic interface AND 
the performance.  The cherry on top would be if we could get some 
elegance to go along with it!

D, being somewhat beautifully designed, and still in its infancy, has a 
real opportunity here.  It could be THE language which displaces FORTRAN 
and C++ in the scientific and math programming communities - if this 
type of thing is considered and BUILT INTO the language from day one. 
The problem with LA libs in C++ is that they're a hack, and an 
afterthought, and it's a crap shoot as to how any given compiler will 
behave.  Ick...

As for my own personal interest?  I program games for a living, and 
would very seriously consider D for at least some modules on a future 
title - if the performance were there.

Thanks,
Steve

P.S. As an interesting side note, Codeplay has a C++ compiler for 
several platforms, designed with vectorization in mind 
(http://www.codeplay.com/).  They also have PS2 (both EE & VU) versions, 
but it never seemed to catch on.  Instead most people are apparently 
using "nasty C++ tricks" [my term].
May 25, 2004
Re: High performance linear algebra?
Hi

I was working on some array suggestions here:
http://dlanguage.netunify.com/56

It is not nearly finished, but since we are
talking about the subject,
I'd better publish it.

It would be nice if you could help
out by adding ideas, links and options.

That way, we have all suggestions for high performance linear algebra in D
in one place.

Knud
June 07, 2004
Re: High performance linear algebra?
Just thought I'd see if anyone else has any more thoughts on this?  It's 
something that's really bugging me about D, and I see no solutions 
popping up.

Thanks,
Steve
June 07, 2004
Re: High performance linear algebra?
Stephen Waits wrote:

> Just thought I'd see if anyone else has any more thoughts on this?  It's
> something that's really bugging me about D, and I see no solutions
> popping up.

I've been giving it quite some thought, but there are a number of
individual steps that have to be taken one by one:

1) Multidimensional arrays (a proposal is online, but it will probably take
a while to get it to some usable state)

2) Array expressions - somehow I have the feeling that the current specs
(which are not yet implemented at all) are nowhere near powerful enough to
offer serious vector programming. In particular, vectorization has to be
possible across functions:

----------------
real[] add(real[] a,real[] b) {
       return a + b;
}

real[] a = ...;
real[] b = ...;
real[] c = a + b;
----------------

I have no idea whether such a thing is possible at all, or whether it can
be inlined without a temporary array. (The workaround we are stuck with
today is sketched at the end of this post.)

3) Alternatively: expression templates, which would make it possible to do
everything in the library. For these, a number of features would be needed -
most importantly, implicit instantiation of function templates.

4) Once we are there, a library will have to be written to use these
vectorization techniques for linear algebra.

Of course, such a library could be written even now, but the lack of (1)
would make it rather clumsy to use, and the lack of either (2) or (3) will
not allow vectorization of complex expressions (i.e. data always has to be
stored in temporary arrays to return it from functions).
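
Until either (2) or (3) is available, the only way I see to avoid those
temporaries is to give up the algebraic interface and pass the destination
in explicitly - roughly like this (a sketch with made-up names):

----------------
// Workaround available today: the caller supplies the destination array,
// so no temporary is created and nothing needs to be returned.
void add(real[] dst, real[] a, real[] b)
{
    for (size_t i = 0; i < dst.length; i++)
        dst[i] = a[i] + b[i];
}
----------------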