Dynamic rectangular arrays?

Apr 24, 2004

Norbert Nemec

Apr 24, 2004

Walter

Apr 24, 2004

Apr 24, 2004

Apr 24, 2004

Apr 28, 2004

Apr 28, 2004

Apr 27, 2004

Apr 27, 2004

Hi there, As far as I can see, there is no way to have rectangular arrays if you don't know their size at compile time? This definitely is a limitation! Of course, there is some overhead, if you have to store the shape of an array and calculate column offsets at run-time, but it should still be much more efficient than to use arrays of pointers like in C. My proposal, how this could be achived would be an extension of array pointers to something like: struct array3d(T) { T *ptr; size_t range0; size_t stride0; size_t range1; size_t stride1; size_t range2; size_t stride2; }; Here you would have as many rang/stride pair as the array has dimensions. If the compiler knows stride to be 1, it can optimize it away, reducing everything to the current 1d arrays. The range arguments are only needed for range checking. Element access only needs access to the ptr and the stride arguments. Additional benefits: * slicing possible not only to blocks but also to arbitrary sub-grids. * simple implementation of fortran-arrays and even reverse memory layouts Of course, all of this could be done in the library, but then - since we already have arrays as language elements, it would make sense to make truely multidimensional arrays part of the language as well. As for the syntax: I would propose to use A[a,b] as way to express rectangular multidimensional arrays and leave A[a][b] for ragged arrays as we have them now. Of course, the compiler could still optimize ragged arrays with fixed size, but that would be another matter. Array pointers, of course have to have a dimension known at compile time. If their range is dynamic, the type would, of course, be something like int[,] (This will, of course, look ugly for 17-dimensional arrays, but who needs these?) Ciao, Nobbi

"Norbert Nemec" <Norbert.Nemec@gmx.de> wrote in message news:c6d8br$1un2$1@digitaldaemon.com... > As far as I can see, there is no way to have rectangular arrays if you don't > know their size at compile time? That's correct. > This definitely is a limitation! Of > course, there is some overhead, if you have to store the shape of an array > and calculate column offsets at run-time, but it should still be much more > efficient than to use arrays of pointers like in C. > > My proposal, how this could be achived would be an extension of array pointers to something like: > > struct array3d(T) { > T *ptr; > size_t range0; > size_t stride0; > size_t range1; > size_t stride1; > size_t range2; > size_t stride2; > }; > > Here you would have as many rang/stride pair as the array has dimensions. If > the compiler knows stride to be 1, it can optimize it away, reducing everything to the current 1d arrays. > > The range arguments are only needed for range checking. Element access only > needs access to the ptr and the stride arguments. > > Additional benefits: > * slicing possible not only to blocks but also to arbitrary sub-grids. > * simple implementation of fortran-arrays and even reverse memory layouts > > Of course, all of this could be done in the library, but then - since we already have arrays as language elements, it would make sense to make truely multidimensional arrays part of the language as well. > > As for the syntax: I would propose to use A[a,b] as way to express rectangular multidimensional arrays and leave A[a][b] for ragged arrays as we have them now. Of course, the compiler could still optimize ragged arrays with fixed size, but that would be another matter. > > Array pointers, of course have to have a dimension known at compile time. If > their range is dynamic, the type would, of course, be something like > int[,] > > (This will, of course, look ugly for 17-dimensional arrays, but who needs > these?) Your idea is good, and will work, but at the moment you can achieve nearly the same thing by declaring a class with [] operator overloads.

Walter wrote: > Your idea is good, and will work, but at the moment you can achieve nearly the same thing by declaring a class with [] operator overloads. I'm trying just this right now. Let's see what come from it.

Norbert Nemec wrote: >Walter wrote: > > >>Your idea is good, and will work, but at the moment you can achieve nearly >>the same thing by declaring a class with [] operator overloads. >> >> > >I'm trying just this right now. Let's see what come from it. > > Could be something that goes into DTL... -- -Anderson: http://badmama.com.au/~anderson/

"J Anderson" <REMOVEanderson@badmama.com.au> wrote in message news:c6ddcb$2blh$1@digitaldaemon.com... > Norbert Nemec wrote: > > >Walter wrote: > > > > > >>Your idea is good, and will work, but at the moment you can achieve nearly the same thing by declaring a class with [] operator overloads. > >> > >> > > > >I'm trying just this right now. Let's see what come from it. > > > > > Could be something that goes into DTL... I think that should be pretty straightforward. I have two efficient (space/speed) and expressive rectangular array templates in STLSoft (http://stlsoft.org/) and I see no reason why I can't borrow them into DTL.

April 27, 2004

Re: Dynamic rectangular arrays?

Posted by Drew McCormack
in reply to Norbert Nemec

Permalink

Drew McCormack

Posted in reply to Norbert Nemec

Permalink


On 2004-04-24 10:31:48 +0200, Norbert Nemec <Norbert.Nemec@gmx.de> said:

> Hi there,
> 
> As far as I can see, there is no way to have rectangular arrays if you don't
> know their size at compile time? This definitely is a limitation! Of
> course, there is some overhead, if you have to store the shape of an array
> and calculate column offsets at run-time, but it should still be much more
> efficient than to use arrays of pointers like in C.
As someone coming from scientific programming, where Fortran is still king, I would like to chime in here. D is offering me hope that I may one day be able to leave Fortran behind, so I don't want you guys to mess it up at the last hurdle.

Fortran is a pretty rotten language, but if it gets one thing right, it is arrays. Fortran 90 that is. If D basically implements the Fortran 90 approach, leaving the extremely primitive C approach largely behind, I think it could capture the imagination of a lot of scientific developers. And be a much more powerful language to boot.

Since I assume most people here are from the C/C++ side of programming, here is some stuff you can do with fortran 90 arrays (in D-type syntax)

1) Create them on the stack, with a size not known at compile time (called automatic arrays)

void someFunc( int n ) {
	float [n] array;
}

2) Create them on the heap with a size not known at compile time
	int n, m;
 	n = 5;
	m = 10;
	float [] array = new float[n][m];

3) Query the size of any dimension
	float [10][5] array;
	array.size(1); // this is equal to 10
	array.size(2); // this is equla to 5

4) Have lower and upper bounds
	float [2..10][2..4] array;
	array.lowerbound(1); // this is 2
	array.upperbound(2); // this is 4

5) Slice arbitrary dimensions
	float [10][5] array;
	array[2..8][1..4];  // Gives a new rank 2 array

6) Include strides in array slices. These can also be negative.
	float[10][5] array;
	array[2..4:2][1..4:3]; // This is rank 2 array with indexes { { [2,1], [2,4] } { [4,1],[4,4] } }
	array[4..2:-2][1..4:3]; // This is rank 2 array with { { [4,1], [4,4] } { [2,1],[2,4] } }

7) Some powerful functions, like reductions, matrix multiplication, and 'broadcasting'

	float [10][5] matrix, resultmat;
	float [5] vector, resultvec;
	resultvec[] = matrixmult( matrix, vector );
	resultmat[][] = broadcast( vector, 1, 10 ); // broadcast expands vector into matrix by duplicating

Another way to do broadcasting is used in the python library Numpy: NewAxis. It is particularly elegant. It creates temporary arrays which are broadcasts of other arrays:

	matrixmultiply(  vector1[][newaxis], vector2[]  );
		// Turns vector into a temporay matrix, and multiplies by vector2

What I don't want, is to have to rely of my own array class and templates. That is just the C++ situation, and leads to all sorts of problems. Better to make your arrays powerful, and just put some more advanced functions (like matrixmultiply, for example) in the library.

Drew McCormack

Drew McCormack wrote: > 4) Have lower and upper bounds > float [2..10][2..4] array; > array.lowerbound(1); // this is 2 > array.upperbound(2); // this is 4 I clearly vote against this: it adds to the runtime-overhead. Even if it might be convenient in some cases, it never is necessary and may even make the code harder to read. > 7) Some powerful functions, like reductions, matrix multiplication, and 'broadcasting' Here, the question is how much has to be in the language, and how much should be put into the library instead. > What I don't want, is to have to rely of my own array class and templates. That is just the C++ situation, and leads to all sorts of problems. Better to make your arrays powerful, and just put some more advanced functions (like matrixmultiply, for example) in the library. Just my position. Arrays are just too basic to leave them to the library.

On 2004-04-24 11:56:50 +0200, Norbert Nemec <Norbert.Nemec@gmx.de> said: > Walter wrote: >> Your idea is good, and will work, but at the moment you can achieve nearly >> the same thing by declaring a class with [] operator overloads. > > I'm trying just this right now. Let's see what come from it. Did you try this Norbert? Any luck? Drew McCormack

Drew McCormack wrote: > On 2004-04-24 11:56:50 +0200, Norbert Nemec <Norbert.Nemec@gmx.de> said: > >> Walter wrote: >>> Your idea is good, and will work, but at the moment you can achieve nearly the same thing by declaring a class with [] operator overloads. >> >> I'm trying just this right now. Let's see what come from it. > > Did you try this Norbert? Any luck? I have something working, but it needs some more cleanup before I can put it up for download with a clear conscience. Maybe I'll find the time for that tonight, otherwise, it will have to wait for next week.

Forums