January 30, 2005
Hi there,

triggered by a message from Walter about implicit function template instantiation, I just sat down and wrote some example code for expression templates in D. The result is attached.

Short summary: expression template programming is not yet possible in D. Many details show that the template mechanism in general is very promising compared to the C++, but implicit function template instantiation is such a crucial feature for ET that the whole attempt is doomed.

======================================================================

For those not acquainted with the expression template technique:

The idea is to handle expressions at compile time in such a way that, p.e. array operations can be done in a very efficient way without need for temporary copies.

Traditionally, when implementing an array class, you would do

-----------------
struct array {
 float[] data;

 [...]

 array opAdd(array other) {
  array res;
  res.data = new float[SIZE];
  for(int i=0;i<SIZE;i++)
   res.data[i] = (*this).data[i] + other.data[i];
  return res;
 }

 [...]
}
-----------------

This is as simple as inefficient: Doing
 a = b+c+d+e
with all quantities being arrays means that you create four temporary arrays
on the heap. It is much more efficient to write an explicit loop by hand.

The expression template technique in C++ allows to write a library that transforms the above expression into such an efficient loop at compile time.

=================================================================

The attached code is a first attempt to do the same thing in D. Unless somebody can point me to completely new ideas solving my problems, this first crude attempt shows two distinct points:

-----------
* without implicit function template instantiation, D is no powerful enough, yet to allow writing such a library in any sensible manner.

-----------
* structs in D show several deficiencies:

- structs should be able to have real constructors. The 'static opCall' trick is a nice hack but not a solution. In this case, the struct carries a reference to a dynamic array which need to be initialized. This is not possible by either 'static opCall' or static initializers.

- opAssign is badly needed for structs. Lacking any mechanism for overloading implicit casting, at least assignment should be overloadable. Maybe assignment from an identical type should still be restricted to the default plain-copying. But for assignment between different types, opAssign should be possible

- opCast is a toy but not a solution. For one, it should apply to implicit casting as well, for the other, it needs to be selective of the destination type.

Maybe a powerful implicit casting mechanism would relieve the need of
opAssign. Anyway: opCast can always only describe casting *to* a type,
never *from* it, so extensibility of existing code is restriced if only the
former can be overloaded.
===============================
For the moment, I'll leave it at this. There certainly is enough to swallow
and discuss so far...

Ciao,
Norbert


January 31, 2005
Norbert Nemec wrote:
<snip>
> Traditionally, when implementing an array class, you would do
<snip code snippet>
> This is as simple as inefficient: Doing 
>  a = b+c+d+e
> with all quantities being arrays means that you create four temporary arrays on the heap.

Actually, there are only three temporary arrays in this case - the fourth becomes the new referent of a.

> It is much more efficient to write an explicit loop by hand.

When we finally get array operations, presumably the compiler will optimise the temporaries away to this level.

Of course, if function calls are involved then this might be harder....

> The expression template technique in C++ allows to write a library that transforms the above expression into such an efficient loop at compile time.

A nice idea.  Maybe the underlying implementation of array operations could make use of this.

Of course, don't forget multiplication, negation and others, not to mention operations of an array with a scalar.  I *guess* the extension to these is fairly straightforward....

<snip>
> - opCast is a toy but not a solution. For one, it should apply to implicit casting as well, for the other, it needs to be selective of the destination type.
<snip>

I imagine that it was a conscious decision not to have programmer-defined implicit casting, in order to avoid the complication of both implementing it and understanding some more complex expressions. 
 But your second point here remains perfectly valid.

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
January 31, 2005
Stewart Gordon wrote:

> Norbert Nemec wrote:
> <snip>
>> Traditionally, when implementing an array class, you would do
> <snip code snippet>
>> This is as simple as inefficient: Doing
>>  a = b+c+d+e
>> with all quantities being arrays means that you create four temporary
>> arrays on the heap.
> 
> Actually, there are only three temporary arrays in this case - the fourth becomes the new referent of a.

Correct: three temporaries, since opAdd is called three times.

>> It is much more efficient to write an explicit loop by hand.
> 
> When we finally get array operations, presumably the compiler will optimise the temporaries away to this level.

This is yet another subject. True - once we have array operations as part of the language, the need for an array class is lifted. Anyhow: arrays are just one of many uses of expression templates. Take a look at the boost++ library and you will find many more. I believe both are needed in D: array/vector-operations as well as the possibility for expression template programming.

B.t.w: I still believe that array operations as they are currently described in the specs are ill-defined and not implementable. We had the topic a while ago and I'll probably pick it up again sometimes in the future. For now, I think array operations should be purged from the specs and picked up again after 1.0 is out of the door.

>> The expression template technique in C++ allows to write a library that transforms the above expression into such an efficient loop at compile time.
> 
> A nice idea.  Maybe the underlying implementation of array operations could make use of this.

Array operations as part of the language are a completely different subject. Havint the compiler to computations at compile time is nothing exciting. The key of expression templates is, that these compile-time computations are specified in the library.

> I imagine that it was a conscious decision not to have
> programmer-defined implicit casting, in order to avoid the complication
> of both implementing it and understanding some more complex expressions.
>   But your second point here remains perfectly valid.

Same thing as for implicit function template instantiations. Of course, the language becomes simpler to define and implement if the programmer has to do things explicitely. It is just that it prohibits certain programming styles.

The key point of expression template programming is, that types are recursive structures of templates. The compiler can easily handle such a tree. Having to write any of these types explicitely in the program code is extremely nasty.

I'm pretty sure that Walter did this kind of decisions intentionally. I just don't think he considered expression template programming when he did so.

January 31, 2005
Norbert Nemec wrote:
> Stewart Gordon wrote:
> 
>> Norbert Nemec wrote: <snip>
>> 
>>> Traditionally, when implementing an array class, you would do
>> 
>> <snip code snippet>
>> 
>>> This is as simple as inefficient: Doing
>>> a = b+c+d+e with all quantities being arrays means that you create four temporary arrays on the heap.
>> 
>> Actually, there are only three temporary arrays in this case - the fourth becomes the new referent of a.
> 
> Correct: three temporaries, since opAdd is called three times.

Oops, miscounted again.

There are only two temporaries - b+c and b+c+d (assuming LTR evaluation, which isn't guaranteed).  OTOH, b+c+d+e is a permanent since it is assigned to a by reference.

>>> It is much more efficient to write an explicit loop by hand.
>> 
>> When we finally get array operations, presumably the compiler will optimise the temporaries away to this level.
> 
> This is yet another subject. True - once we have array operations as part of the language, the need for an array class is lifted.

We already do.  We just don't have it as part of the compiler.

> Anyhow: arrays are just one of many uses of expression templates. Take a look at the boost++ library and you will find many more. I believe both are needed in D: array/vector-operations as well as the possibility for expression template programming.
> 
> B.t.w: I still believe that array operations as they are currently described in the specs are ill-defined and not implementable. We had the topic a while ago and I'll probably pick it up again sometimes in the future. For now, I think array operations should be purged from the specs and picked up again after 1.0 is out of the door.
<snip>

What would be the point of that?  Moreover, if this were to happen, should "Numerical programmers" be removed from the "Who D is for" list?

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
January 31, 2005
"Norbert Nemec" <Norbert@Nemec-online.de> wrote in message news:ctj46q$f52$1@digitaldaemon.com...
> Hi there,
>
> triggered by a message from Walter about implicit function template instantiation, I just sat down and wrote some example code for expression templates in D. The result is attached.
>
> Short summary: expression template programming is not yet possible in D.
> Many details show that the template mechanism in general is very promising
> compared to the C++, but implicit function template instantiation is such
> a
> crucial feature for ET that the whole attempt is doomed.

I haven't looked at the details of your code but it reminds me of a similar task I faced when writing the D wrappers around the GMP library (GMP is a C library for performing arithmentic on arbitrary precision floating point and arbitrary size ints). It was similar because I wanted to support opAdd and opMul and friends and I also had data on the heap with these potentially large ints and floats. So what I ended up doing was make a decision that the interface use a free list to recycle temporary values. The impact of this is that the interface is single threaded (by default - users can maintain their own object pools if they want) and that the user has to do a little work to make sure they play nicely with the interface. The benefit is that the code (the library code and the user code) is very simple and any error message you get are easy to debug. There is very little magic. Plus when I ran the D version vs the C++ expression template version for some random computation like

 for( int k=0; k< 10000; k++)
  assign(x, x*x - 2*(x+1) );

the D version was *faster* than the C++ version by a large amount (3.2 sec vs 2.7 sec). That could be specific for the GMP C++ expression templates - I don't know if my experience generalizes to other expression template libraries.

For more details about the interface the link to the GMP D wrapper is http://home.comcast.net/~benhinkle/gmp-d/

-Ben


January 31, 2005
Stewart Gordon wrote:
>>>> It is much more efficient to write an explicit loop by hand.
>>> 
>>> When we finally get array operations, presumably the compiler will optimise the temporaries away to this level.
>> 
>> This is yet another subject. True - once we have array operations as part of the language, the need for an array class is lifted.
> 
> We already do.  We just don't have it as part of the compiler.

See the point below. I don't consider the current specification of array operations well-defined. Proving this point is impossible since the wording is too unclear to really understand what is meant.

Unless somebody proves me wrong by producing a working implementation (thereby demonstrating what the words in the specs mean exactly) I consider that section of the specs as non-existant.

>> B.t.w: I still believe that array operations as they are currently described in the specs are ill-defined and not implementable. We had the topic a while ago and I'll probably pick it up again sometimes in the future. For now, I think array operations should be purged from the specs and picked up again after 1.0 is out of the door.
> <snip>
> 
> What would be the point of that?  Moreover, if this were to happen, should "Numerical programmers" be removed from the "Who D is for" list?

I am a numerical programmer as well - otherwise I would not rant so much about numerical issues.

I think that D has great potential as a numeric language in the future, but currently, numerical programmers are still better of using other languages. (Of course - depending on the kind of work you do: if you need neither top performance nor special libraries, D has everything you need already and many benefits in areas outside of numerics as well.)

I personally cannot use D yet for my work, but I have the hope that it will one day have everything needed to reach surpass Fortran and C++.

I particular, I believe that powerful multidimensional arrays as part of the language are essential for numerical work. Walter has clearly stated that these will have to wait for 2.0 so at this point, detailed discussion about them is moot.

The array operations as they are specified are not very powerful, cannot be extended and do not fit well with multidimensional arrays. I have some idea about how it could be solved differently, but over the past months, I never found the time to really work it out and write it down completely.

Currently, the issue does not seem of general interest, but if people start discussing these issues again, I will certainly join in and maybe we'll arrive at some results that can then be implemented whenever Walter decides to do so.

Ciao,
Norbert
January 31, 2005
Ben Hinkle wrote:

> the D version was *faster* than the C++ version by a large amount (3.2 sec vs 2.7 sec). That could be specific for the GMP C++ expression templates - I don't know if my experience generalizes to other expression template libraries.

Expression templates certainly are not a technique to solve all problems. Before resorting to expression templates, one should make sure to precisely understand the performance bottleneck, otherwise the chance to pull down performance is just as good as to get a gain.

Specifically, the need for avoiding temporaries is not only the problem of memory allocations but also that of exploiting the cache in the right way. If the array that you handle become bigger then the cache (several MB, depending on the architecture) you have to be very careful to work locally as much as possible.

For modern architectures, cache optimization may easily give you a 10-fold performance gain for common data-intensive numerical computations.

In D, much of that optimization might be possible within the compiler *if we have multidimensional arrays and vectorized expressions* - otherwise the compiler just does not have enough possibilities to change the order of expressions.

Alternatively, these optimizations could be done in the library as it is done with boost++ in C++. For that, expression templates are needed.

I think that D should go both ways as complements. It will probably always depend on the details of the problem, which of the options might be the better solution.

January 31, 2005
In article <ctlugk$1cqa$1@digitaldaemon.com>, Norbert Nemec says...
>
>Currently, the issue does not seem of general interest, but if people start discussing these issues again, I will certainly join in and maybe we'll arrive at some results that can then be implemented whenever Walter decides to do so.

I fully acknowledge that I am a complete idiot when it comes to "numerics programming".  From posts that you, and others, have written on the topic, I find myself on top of a conceptual iceberg.  I'm curious about the topic, so I'm doing my best to educate myself enough to understand where D comes up short. So far, I haven't a clue what's beneath the water.

So far, all I gather is that D lacks a certain kind of expressiveness, and that mere compiler optimizations, GC and array slicing aren't enough to get the critical performance needed for certain tasks.  Is this correct?  Are there any resources you reccmomend that I could read so I can be more informed?

I don't speak for everyone here, but perhaps the lack of interest in the topic really comes from a general lack of understanding of the problem at hand (and hence its importance for folks like yourself).  At least, that's still my situation.  I'm not at all disinterested, I just don't know enough to contribute helpfully.

- EricAnderton at yahoo
January 31, 2005
"pragma" <pragma_member@pathlink.com> wrote in message news:ctm2se$1ie6$1@digitaldaemon.com...
> In article <ctlugk$1cqa$1@digitaldaemon.com>, Norbert Nemec says...
> >
> >Currently, the issue does not seem of general interest, but if people
start
> >discussing these issues again, I will certainly join in and maybe we'll arrive at some results that can then be implemented whenever Walter
decides
> >to do so.
>
> I fully acknowledge that I am a complete idiot when it comes to "numerics programming".  From posts that you, and others, have written on the topic,
I
> find myself on top of a conceptual iceberg.  I'm curious about the topic,
so I'm
> doing my best to educate myself enough to understand where D comes up
short. So
> far, I haven't a clue what's beneath the water.
>
> So far, all I gather is that D lacks a certain kind of expressiveness, and
that
> mere compiler optimizations, GC and array slicing aren't enough to get the critical performance needed for certain tasks.  Is this correct?  Are
there any
> resources you reccmomend that I could read so I can be more informed?
>
> I don't speak for everyone here, but perhaps the lack of interest in the
topic
> really comes from a general lack of understanding of the problem at hand
(and
> hence its importance for folks like yourself).  At least, that's still my situation.  I'm not at all disinterested, I just don't know enough to
contribute
> helpfully.
>

It is the same with me. Don't know much about the topic, but realize that it
is
important and am interested in discussions about it, but will spend more
time
reading the discussions than writing.

> - EricAnderton at yahoo


February 01, 2005
pragma wrote:
> I fully acknowledge that I am a complete idiot when it comes to "numerics
> programming".  From posts that you, and others, have written on the topic, I
> find myself on top of a conceptual iceberg.  I'm curious about the topic, so I'm
> doing my best to educate myself enough to understand where D comes up short. So
> far, I haven't a clue what's beneath the water.
> 
> So far, all I gather is that D lacks a certain kind of expressiveness, and that
> mere compiler optimizations, GC and array slicing aren't enough to get the
> critical performance needed for certain tasks.  Is this correct?  Are there any
> resources you reccmomend that I could read so I can be more informed?

I'm kinda in the same page, but I wouldn't say I can see the whole top of the iceberg :D.

> 
> I don't speak for everyone here, but perhaps the lack of interest in the topic
> really comes from a general lack of understanding of the problem at hand (and
> hence its importance for folks like yourself).  At least, that's still my
> situation.  I'm not at all disinterested, I just don't know enough to contribute
> helpfully.

Maybe D hasn't drawn that much attention from the numerical community.

>
> - EricAnderton at yahoo

_______________________
Carlos Santander Bernal
« First   ‹ Prev
1 2 3
Top | Discussion index | About this forum | D home