April 22, 2004
Re: D in scientific computing
Touche! <G>

All I can say is DTL 0.1 will be out as soon as Walter can find the bandwidth to
give me the lang/comp changes I need. It will contain sequence containers. I'm
hoping that this can be within a week, but it's hard to say at this point.

DTL 0.2 will be out once I get feedback on things from people, and will probably
contain some tree and/or associative containers.

Once the basic lang/comp support is there for what I'm trying to do, I see no
reason why the library cannot evolve quickly, and with the input of other
contributors.


"Stephan Wienczny" <wienczny@web.de> wrote in message
news:c66vig$2n7s$1@digitaldaemon.com...
> Matthew wrote:
>
> > I agree with that.
> >
> > Want to write them?
> >
>
> >
> I actually started to months ago, but then wanted to wait until DTL
> finishes ;-)
April 22, 2004
Re: noalias, restrict, and Fortran optimizations
Walter wrote:
> "Norbert Nemec" <Norbert.Nemec@gmx.de> wrote in message
>> Object references on the other hand are far easier to check: objects
>> cannot overlap, so two references are either equal or noalias.
> 
> This is, unfortunately, not true when you get into interfaces. It's also
> possible for pointers into class objects, as well as arrays referencing
> into class objects.

OK, at that point it really gets messy. Anyway: if an interface reference
points into an object, it should certainly be possible to recover a pointer
to the object itself? This, of course, adds a little overhead to the
checking algorithm, but in debugging mode that should still be acceptable.

>> If you want to
>> allow the compiler to optimize a routine for nonaliased arguments, just
>> put in a precondition prohibiting references to identical objects.
> 
> Historically, adding in special keywords for such optimizations has not
> worked out well. That's why I was thinking of making it implicit for D
> function parameters.

True, the language would be simpler that way. Anyway: 

* You will not only have to think about function arguments but also about
references that are stored in objects. Every time the source code handles
two references, it should be possible to tell the compiler that they are
not aliased. For this, I would suggest a builtin function "bool
nonaliased(x,y)" that takes two references and checks whether they refer to
disjoint portions of memory. Then you just put an
"assert(nonaliased(x,y))" before critical portions of the code and the
compiler can happily optimize.

* even for function arguments: there certainly are plenty of cases, where it
makes perfect sense to pass two references to the same object to some
function. I wonder whether it is worth giving up all of these to be able to
optimize in certain cases?

* if you take my proposal and assume that references may be aliased in
general, but provide a powerful means (like the above-mentioned
"nonaliased" builtin) to specify exactly where the compiler is allowed to
optimize, then you don't restrict anyone unnecessarily. And still, authors
of time-critical code can examine and specify exactly what they mean by
"nonaliased".
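
To make the first point concrete, here is a minimal sketch of what such a
check could look like for array slices, written as an ordinary library
function rather than a builtin. The name nonaliased comes from the proposal
above; the axpy routine is purely hypothetical, and whether a compiler would
actually use the assertion to optimize is left open.

```d
/// Hypothetical helper: true when the two slices share no memory.
bool nonaliased(T)(const(T)[] x, const(T)[] y)
{
    // An empty slice occupies no memory, so it cannot overlap anything.
    if (x.length == 0 || y.length == 0)
        return true;
    // Two ranges are disjoint when one ends before the other begins.
    return x.ptr + x.length <= y.ptr || y.ptr + y.length <= x.ptr;
}

/// Hypothetical numeric kernel: dst[i] += k * src[i].
void axpy(double[] dst, const(double)[] src, double k)
{
    assert(dst.length == src.length);
    assert(nonaliased(dst, src));   // the check proposed above
    foreach (i; 0 .. dst.length)
        dst[i] += k * src[i];
}

void main()
{
    auto buf = new double[8];
    buf[] = 1.0;

    // Two halves of one buffer do not overlap.
    assert(nonaliased(buf[0 .. 4], buf[4 .. 8]));
    // These two slices share element 4.
    assert(!nonaliased(buf[0 .. 5], buf[4 .. 8]));

    axpy(buf[0 .. 4], buf[4 .. 8], 2.0);
    assert(buf[0 .. 4] == [3.0, 3.0, 3.0, 3.0]);
}
```

Raw pointers get no such helper here, in line with the point that they may
alias anything.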

>> * accept that pointers may always be aliased to anything and don't try to
>> optimize too much there.
> 
> This is a good idea, but I'm concerned it may not be sufficient.

Guess then, that is a misunderstanding: by "pointers" I mean raw, C-like
pointers. You already agreed that these should be allowed to alias
anything. Whoever uses pointers just has to accept that they don't get
utmost optimization.

>> * prohibit aliased object references by explicit preconditions/assertions
> 
> Having the compiler insert runtime checks for debug builds is a good idea.
> Unfortunately, as you pointed out, adding runtime checks for aliased
> pointers is impossible.

Again: forget about pointers. They may alias anything and cannot efficiently
be checked. If anyone wants full optimization, they should use arrays,
slices and object references that have enough semantics for the compiler to
check for aliasing.
April 22, 2004
Re: D in scientific computing
lacs wrote:

> Add primitive-units-checking to D and a lot of people who work with 
> numbers will fall in love with D.
<snip>

If only we had a rational number type built in, implementing this with 
templates would be straightforward.  Just assign each primitive unit to 
a different prime number.
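
A sketch of the prime-number idea, assuming a present-day D compiler with
integer template parameters and compile-time function evaluation: the unit
is carried in the type as a reduced fraction num/den of primes. The names
Quantity, Meter and Second and the assignment of 2 and 3 are made up for
illustration.

```d
// Greatest common divisor; simple enough to run at compile time.
ulong gcd(ulong a, ulong b)
{
    while (b != 0)
    {
        immutable t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// A value whose unit is the reduced fraction num/den of primes.
struct Quantity(ulong num, ulong den)
{
    double value;

    // Adding or subtracting requires exactly the same unit.
    Quantity opBinary(string op)(Quantity rhs) const
        if (op == "+" || op == "-")
    {
        return Quantity(mixin("value " ~ op ~ " rhs.value"));
    }

    // Multiplying or dividing combines the two unit fractions.
    auto opBinary(string op, ulong n2, ulong d2)(Quantity!(n2, d2) rhs) const
        if (op == "*" || op == "/")
    {
        static if (op == "*") { enum n = num * n2; enum d = den * d2; }
        else                  { enum n = num * d2; enum d = den * n2; }
        enum g = gcd(n, d);
        return Quantity!(n / g, d / g)(mixin("value " ~ op ~ " rhs.value"));
    }
}

alias Meter  = Quantity!(2, 1);   // hypothetical: meter  <-> prime 2
alias Second = Quantity!(3, 1);   // hypothetical: second <-> prime 3

void main()
{
    auto speed = Meter(10.0) / Second(4.0);
    static assert(is(typeof(speed) == Quantity!(2, 3)));   // m/s
    assert(speed.value == 2.5);

    auto dist = speed * Second(2.0);                       // seconds cancel
    static assert(is(typeof(dist) == Meter));
    assert(dist.value == 5.0);

    // Mixing units in a sum is rejected at compile time.
    static assert(!__traits(compiles, Meter(1.0) + Second(1.0)));
}
```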

Otherwise, we could define types to represent units (possibly based on a 
rational number implementation) and values.  Of course, this would move 
unit checking to the runtime and so wouldn't be good for 
performance-critical apps.

The other approach is to define a struct for each unit (primitive and in 
combination) that the program is going to use.  Operations would be 
defined to take and return the right types.  This would be compile-time 
checking, but would require quite a lot of repetitive code to be written.
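
The struct-per-unit approach might look like the following sketch. The types
Meter, Second and MeterPerSecond are hypothetical, and the operator syntax
is that of present-day D. Every new combination of units needs another
struct and another overload, which is the repetitive part.

```d
struct Second         { double value; }
struct MeterPerSecond { double value; }

struct Meter
{
    double value;

    // Meter + Meter and Meter - Meter stay a Meter.
    Meter opBinary(string op)(Meter rhs) const
        if (op == "+" || op == "-")
    {
        return Meter(mixin("value " ~ op ~ " rhs.value"));
    }

    // Meter / Second yields the combined unit.
    MeterPerSecond opBinary(string op)(Second rhs) const
        if (op == "/")
    {
        return MeterPerSecond(value / rhs.value);
    }
}

void main()
{
    auto d = Meter(100.0) + Meter(20.0);
    auto v = d / Second(60.0);
    static assert(is(typeof(v) == MeterPerSecond));
    assert(v.value == 2.0);

    // No overload takes Meter + Second, so this does not compile.
    static assert(!__traits(compiles, d + Second(1.0)));
}
```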

> As you might have guessed, the notation float<meter> has nothing to 
> do with C++ templates and wouldn't interfere with them since it can only 
> be used with primitives.
<snip>

Not sure about that.  Unless we're going to restrict units to a list of 
ad-hoc keywords, it'll break CFG just as well as the C++ template syntax 
does by itself.

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the
unfortunate victim of intensive mail-bombing at the moment.  Please keep
replies on the 'group where everyone may benefit.
April 22, 2004
Re: noalias, restrict, and Fortran optimizations
"Norbert Nemec" <Norbert.Nemec@gmx.de> wrote in message
news:c67t93$170u$1@digitaldaemon.com...
> * You will not only have to think about function arguments but also about
> references that are stored in objects. Every time the source code handles
> two references, it should be possible to tell the compiler that they are
> not aliased. For this, I would suggest a builtin function "bool
> nonaliased(x,y)" that takes two references and checks whether they refer
> to disjoint portions of memory. Then you just put an
> "assert(nonaliased(x,y))" before critical portions of the code and the
> compiler can happily optimize.

If I'm understanding this correctly, it has the same problem that the
"restrict" and "noalias" keywords in C have: it's too confusing for users to
use correctly, as well as being aesthetically not so pleasing. I think it
would be better to have the compiler assume they are not aliased (since that
is by far the usual case) and have to say when they are aliased. Also, a
runtime check that they really are not aliased might be appropriate in debug
mode.

Now, since aliasing is sadly allowed in C functions, I was thinking:

   extern (C) int func(int a[], int b[])     // a and b may be aliased
   extern (D) int func(int a[], int b[])     // a and b must be disjoint
April 22, 2004
Re: noalias, restrict, and Fortran optimizations
Walter wrote:
> "Norbert Nemec" <Norbert.Nemec@gmx.de> wrote in message
> news:c67t93$170u$1@digitaldaemon.com...
>> * You will not only have to think about function arguments but also about
>> references that are stored in objects. Every time the source code handles
>> two references, it should be possible to tell the compiler that they are
>> not aliased. For this, I would suggest a builtin function "bool
>> nonaliased(x,y)" that takes two references and checks whether they refer
>> to disjoint portions of memory. Then you just put an
>> "assert(nonaliased(x,y))" before critical portions of the code and the
>> compiler can happily optimize.
> 
> If I'm understanding this correctly, it has the same problem that the
> "restrict" and "noalias" keywords in C have: it's too confusing for users
> to use correctly, as well as being aesthetically not so pleasing. I think
> it would be better to have the compiler assume they are not aliased (since
> that is by far the usual case) and have to say when they are aliased.
> Also, a runtime check that they really are not aliased might be
> appropriate in debug mode.
> 
> Now, since aliasing is sadly allowed in C functions, I was thinking:
> 
>     extern (C) int func(int a[], int b[])     // a and b may be aliased
>     extern (D) int func(int a[], int b[])     // a and b must be disjoint

Yes, it may be confusing to users, but then nobody has to use the feature,
only those people trying to squeeze out performance. People writing numeric
libraries etc. will gladly accept the fine-tuning capabilities.

As I said: a simple solution as in Fortran will not buy you much. References
can not only come through function arguments but also from object members.
Saying that function arguments may not be aliased only covers part of the
problem.
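
A small sketch of why arguments are only part of the story. The class
Smoother is hypothetical; its run method takes no arguments at all, yet its
result depends on whether the two member slices share memory.

```d
class Smoother
{
    double[] input;
    double[] output;

    this(double[] input, double[] output)
    {
        this.input = input;
        this.output = output;
    }

    // Intended meaning: output[i] = input[i] + input[i - 1] for i >= 1.
    void run()
    {
        foreach (i; 1 .. input.length)
            output[i] = input[i] + input[i - 1];
    }
}

void main()
{
    // Disjoint members: every sum sees the original input values.
    double[] a = [1.0, 1.0, 1.0, 1.0];
    double[] b = [0.0, 0.0, 0.0, 0.0];
    auto s1 = new Smoother(a, b);
    s1.run();
    assert(b == [0.0, 2.0, 2.0, 2.0]);

    // Aliased members: each write feeds into the next read.
    double[] c = [1.0, 1.0, 1.0, 1.0];
    auto s2 = new Smoother(c, c);
    s2.run();
    assert(c == [1.0, 2.0, 3.0, 4.0]);
}
```

A rule that only covers function parameters says nothing about input and
output here, so the loop could not be reordered safely on that basis alone.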

What makes Fortran 77 so highly optimizable without much language
overhead is that it doesn't have pointers or references at all. Aliasing
could *only* come through function arguments. So once this is prohibited,
the Fortran compiler can simply assume that *nothing whatsoever* is
aliased.

I don't know how Fortran 95 handles this issue, but I guess that, as soon as
you use pointers, performance goes down.

In D, since we have references everywhere, any real solution to the aliasing
problem will get a bit more complex.
April 28, 2004
Re: noalias, restrict, and Fortran optimizations
In article <c66irg$216j$1@digitaldaemon.com>, Walter says...
>
>
>"Norbert Nemec" <Norbert.Nemec@gmx.de> wrote in message
>news:c65fmi$2js$1@digitaldaemon.com...
>> From what I have read so far, D really has the potential to close the gap
>> between C++ and Fortran and in this way gain a huge share in the
>> scientific/high-performance area of computing. Anyway, to have any chance
>> to go there, much care has to be taken now.
>>
>> People unfamiliar with numeric programming often wonder why Fortran still
>> has such a huge share among scientists. Many scientists still use
>> Fortran77, and even those who have moved to Fortran95 only use it for its
>> modern syntax, never touching its advanced concepts. And this is not only
>> because they don't know better, but also because it is extremely hard to
>> match the performance of Fortran77! (OK, 99% of the reason might actually
>> be the laziness to learn a different language and the existing code-base,
>> but still people usually argue based on the superior performance of the
>> language)
>
>What Fortran has over C is the 'noalias' on function parameters which allows
>for aggressive optimization. What I'm thinking of is writing the spec for D
>functions so that parameters are always 'noalias' (for extern (C) functions
>this would not apply).
>
>What do you think?
>
>For reference: http://www.lysator.liu.se/c/restrict.html

I am a programmer, working in a scientific area (bioinformatics).  I think it
would be bad to imply no-aliasing, because it trades safety for performance.

A lot of the code here is written by biologists and/or statisticians (some of
whom are quite brilliant, but only a few are trained as programmers).  They are
going to go nuts trying to find bugs like this.

If you work on the "heavy lifting" code that really needs performance, you
generally understand about the cache line size etc; you can be trusted to know
to use the "restrict" keyword.  If you are a cytologist, writing new statistics
functions, you will not know enough to use the "may-alias" keyword.  The burden
of knowing the tradeoffs has to be on the performance guru, not on the
other guy.

Kevin
April 29, 2004
Re: D in scientific computing
On 2004-04-21 13:33:46 +0200, Norbert Nemec <Norbert.Nemec@gmx.de> said:

> Stewart Gordon wrote:
>> The features I like in F90 (and which are useful for SP) are built-in
>> vector arithmetic and aggregate functions.  The former is in the D
>> language, it just needs to be finally put into the compiler.  I've
>> briefly suggested the latter....
>> http://www.digitalmars.com/drn-bin/wwwnews?D/21671
> 
> I don't really know how much of that has to be in the language. It is nice
> to have arrays as part of the language, but vectors and matrices should
> rather be defined in the library. There are plenty of efficient ways to
> deal with arrays, which can be implemented in a very efficient way by
> optimizing (perhaps also parallelizing) compilers.
> 
> Vector arithmetic actually gives arrays special matrix semantics which is
> not what you would want in general. By far not every array is a matrix so
> why should it behave like one?
> 
> vector arithmetic and aggregate functions may just as well be defined in the
> library. Just encapsulate arrays in your own class and give it all the
> semantics you need, without forcing everyone else to get that semantics
> when his arrays are something completely different.

I agree that matrices, which are basically mathematical tools from 
linear algebra, do not belong in the core language. But powerful 
multidimensional arrays are important, if you are to avoid the mess 
that C++ is when it comes to high-performance programming.

To be effective, you must be able to create multidimensional static 
arrays at run time on the heap. As far as I can tell, you can only set 
the size of dynamic arrays at run time at the moment, but the data in 
these arrays are not contiguous in memory, and thus not good for most 
high-performance computing.

We really need to be able to do this, to get a contiguous 2d array:

int n = 10, m = 20;
double[][] a = new double[n][m];
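
Until something like that exists in the language, a contiguous 2d array
sized at run time can be approximated with a thin wrapper over one flat
block. This is only a sketch in present-day D; the name Grid and its
members are made up for illustration.

```d
struct Grid
{
    size_t rows, cols;
    double[] data;   // one contiguous heap block, stored row by row

    this(size_t rows, size_t cols)
    {
        this.rows = rows;
        this.cols = cols;
        data = new double[rows * cols];
        data[] = 0.0;   // doubles default to NaN, so clear explicitly
    }

    // a[i, j] maps onto the flat block in row-major order.
    ref double opIndex(size_t i, size_t j)
    {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }

    // A row is a plain slice of the same block; nothing is copied.
    double[] row(size_t i)
    {
        return data[i * cols .. (i + 1) * cols];
    }
}

void main()
{
    int n = 10, m = 20;
    auto a = Grid(n, m);

    a[2, 3] = 7.5;
    assert(a[2, 3] == 7.5);
    assert(a.data[2 * m + 3] == 7.5);

    // Consecutive rows sit exactly m elements apart.
    assert(a.row(1).ptr == a.data.ptr + m);
}
```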

Drew McCormack
April 29, 2004
Re: D in scientific computing
On 2004-04-21 20:53:10 +0200, "Walter" <walter@digitalmars.com> said:

> 
> "Norbert Nemec" <Norbert.Nemec@gmx.de> wrote in message
> news:c65fmi$2js$1@digitaldaemon.com...
>> From what I have read so far, D really has the potential to close the gap
>> between C++ and Fortran and in this way gain a huge share in the
>> scientific/high-performance area of computing. Anyway, to have any chance
>> to go there, much care has to be taken now.
> 
> I agree totally with you, which is why D has several features with a
> numerics focus. If there are any I missed, I want to know about it.

As you would have gathered, had you read my previous posts, I am 
strongly in favor of more powerful multidimensional arrays. This should 
not be left to library writers like in C++, in my view. A library 
writer can, in theory, write a powerful and fast class, but it holds 
back adoption in practice because you get problems compiling the 
library on different platforms, and performance may sometimes be subpar 
for a given platform. It also often leads to ugly syntax. Basically, it 
leaves too much to chance to be left to a library.

Here are some more for the wish list:

It is important to be able to create a contiguous multidimensional 
array on the heap at run time. There are very few occasions where I 
know how big an array is at compile time.

Elementwise operations are important too, but I think these are already 
on the list of things to do. Note that they should work with arrays of 
any shape, eg,

double[10][5] a, b;
double[][] c;
// Initialize a and b here
..
c = a + b;
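
For one-dimensional arrays, present-day D spells the elementwise form with
explicit slice brackets on both sides; the sketch below assumes such a
compiler. It does not cover the multidimensional case asked for here.

```d
void main()
{
    double[4] a = [1, 2, 3, 4];
    double[4] b = [10, 20, 30, 40];
    double[] c = new double[4];   // destination must already have the right length

    c[] = a[] + b[];              // elementwise sum, no explicit loop
    assert(c == [11.0, 22.0, 33.0, 44.0]);

    c[] = 2.0 * a[] - b[];        // scalars mix with array operands
    assert(c == [-8.0, -16.0, -24.0, -32.0]);
}
```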

Something which I missed when I was working yesterday with D was array 
literals. These don't work:

double[] a = { 1, 2, 3, 4 };
double[4] a = { 1, 2, 3, 4 };

It would be good if they did, and even better if array literals were 
allowed in other contexts:

funcToDoSomething( someArgument, {1, 2, 3, 4} );

It would be nice if slicing worked for multidimensional arrays:

double[][] c = a[4..5][16..19];

Of course, there are any number of elementwise functions you can think 
of (eg max, min, sin etc), but these would belong in the library. I am 
more concerned to get powerful multidimensional arrays. It is possible 
to build the other stuff yourself, but if you have to write a 
multidimensional array class that is high performance, you end up back 
in the C++ expression template morass.

Drew McCormack
Free University, Amsterdam
April 29, 2004
Re: D in scientific computing
Drew McCormack wrote:
> double[] a = { 1, 2, 3, 4 };
> double[4] a = { 1, 2, 3, 4 };
> 
> It would be good if they did, and even that array literals were allowed 
> in other contexts:
> 
> funcToDoSomething( someArgument, {1, 2, 3, 4} );
> 
> Drew McCormack
> Free University, Amsterdam
> 

http://digitalmars.com/d/arrays.html#bounds

Scroll down to "Array Initialization"... it seems you use brackets. 
(meaning the kind in my name - [ and ].)
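
A short sketch of the bracket form, assuming a present-day D compiler, where
bracketed literals are also accepted as function arguments. The function sum
is hypothetical.

```d
double sum(const double[] xs)
{
    double total = 0.0;   // doubles default to NaN in D, so initialise explicitly
    foreach (x; xs)
        total += x;
    return total;
}

void main()
{
    double[] a = [1, 2, 3, 4];      // dynamic array from a literal
    double[4] b = [1, 2, 3, 4];     // static array from a literal
    assert(a == b[]);

    // A literal passed directly as an argument.
    assert(sum([1.0, 2.0, 3.0, 4.0]) == 10.0);
}
```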

It doesn't work, of course, for associative arrays though.. sadly... I'm 
not sure if that's meant to change or not.

-[Unknown]
April 29, 2004
Re: D in scientific computing
"Drew McCormack" <drewmccormack@mac.com> wrote in message
news:c6q5h7$2u5k$1@digitaldaemon.com...
> On 2004-04-21 13:33:46 +0200, Norbert Nemec <Norbert.Nemec@gmx.de> said:
>
> > Stewart Gordon wrote:
> >> The features I like in F90 (and which are useful for SP) are built-in
> >> vector arithmetic and aggregate functions.  The former is in the D
> >> language, it just needs to be finally put into the compiler.  I've
> >> briefly suggested the latter....
> >> http://www.digitalmars.com/drn-bin/wwwnews?D/21671
> >
> > I don't really know how much of that has to be in the language. It is nice
> > to have arrays as part of the language, but vectors and matrices should
> > rather be defined in the library. There are plenty of efficient ways to
> > deal with arrays, which can be implemented in a very efficient way by
> > optimizing (perhaps also parallelizing) compilers.
> >
> > Vector arithmetic actually gives arrays special matrix semantics which is
> > not what you would want in general. By far not every array is a matrix so
> > why should it behave like one?
> >
> > vector arithmetic and aggregate functions may just as well be defined in the
> > library. Just encapsulate arrays in your own class and give it all the
> > semantics you need, without forcing everyone else to get that semantics
> > when his arrays are something completely different.
> I agree that matrices, which are basically mathematical tools from
> linear algebra, do not belong in the core language. But powerful
> multidimensional arrays are important, if you are to avoid the mess
> that C++ is when it comes to high-performance programming.

Have you looked at the very efficient (though poorly documented) multidimensional
arrays in STLSoft (http://stlsoft.org/)? There's fixed_array for
fixed-dimensional arrays (up to four) with dimension extents variable at runtime,
and frame_array (soon to be renamed to something more meaningful) which has a
fixed number of dimensions and fixed extents (a thin veneer for STL compatibility
over built-in arrays).

I reckon they're about as close to the bone as C++ will let you get. I'd be
interested in hearing your opinions of the implementations.

Their storage is contiguous