Thread overview
NDim-slicing
Mar 10, 2006
Norbert Nemec
Mar 10, 2006
pragma
Mar 10, 2006
pragma
Mar 12, 2006
Norbert Nemec
Mar 12, 2006
Ivan Senji
March 10, 2006
Hi there,

after two years of silence, I just found a little time to return my thoughts to D and the issues of NDim-arrays I started back then. (No, I never completely forgot about D, I just had other things that I had to give priority.)

My design proposal for NDim arrays needs a complete overhaul before discussion of details makes sense. However, there is one cucial issue that needs to be solved and I have no idea how it could be done:

Currently we have OpIndex and OpIndexAssign fit for overloading multidimensional indexing. OpSlice and OpSliceAssign, however, work only for one dimension. I've been thinking how to generalize these, but it turns out to be extremely tricky. First though is: Why not simply extend OpSlice to take N pairs of boundaries? But then, what about mixed expressions like 'a[1,4..7]' ?

Working with Python, I use expressions like that all the time, and I think it is crucial that Ndim-arrays in D allow this kind of slicing as well. For native Ndim arrays, it should be straightforward to handle such expressions in the compiler. However, how should they be overloaded?!?

Python goes the way of taking a tuple of objects as indices where '4..7' would then be translated into a "slice object". The indexing-overloading routine then goes through all the objects at run-time and decides whether each one is a slicing or an indexing operation. All of this is rather simple with dynamic typing and accepting the run-time overhead.

In D, this is not an option. Indexing/Slicing has to be type-safe and the resulting type has to be known at compile time.

One rather clumsy solution that I came up so far is to split the resolve the
expression into two consecutive function calls:
a[1,4..7]
would become
a.OpIndexPartial(0,1).OpSlice(4,7)
where the first function is defined as
OpIndexPartial(int dim,int idx)
and returns an array one rank lower.

The assignment expression:
a[1,4..7] = b[0..3]
would turn into
a.OpIndexPartial(0,1).OpSliceAssign(b[0..3],4,7)

The ugly thing about this solution is, that some kind of marshal-object is needed which is returned from OpIndexAssign.


Another possibility would be to change
OpSlice(size_t start, size_t end)
into
OpIndex(size_t[2] slc)
and interpret a complex expression like
a[1,3..5,2,6..3]
as a call to
OpIndex(size_t idx0, size_t[2] slc1, size_t idx2, size_t[2] slc3)
I don't know, however, whether D templates would allow to implement the long
list of routines in some comfortable way.

All the really clean solution that I could think of would demand for tuple-types and list-handling at compile time, both things that have been discussed before but are at best a topic for the very far future.

Any ideas?


March 10, 2006
Perhaps I'm missing the boat, or maybe its just that long since you posted the original NDim array proposal, but what were the advantages to adding NDim arrays in D?

The reason why I ask is that aside from using rectangular arrays for the purposes of reducing computational and storage overhead, for very specific tasks (bitmap manipulation and matrices come to mind), the current soluiton in D works quite well.  After all, it doesn't get any more simple, reliable and easy to implement than limiting every array to exactly one dimension.  It's also quite elegant in that by disallowing rectangular arrays, the problems with wielding all of the variations in slicing and indexing become far less bothersome: this last fact is unintentionally implied by your most recent post.

I'm genuinely curious since you've obviously put a lot of effort behind all this, and I'm very much in the dark as to why.  Could you please enlighten me? :)

Thanks,


- EricAnderton at yahoo
March 10, 2006
Open mouth, insert foot.

Norbert, all,
Please disregard my last post and critique regarding rectangular arrays.  It
seems that I need to read the docs a little more carefully next time as D does
indeed support them. :(

- EricAnderton at yahoo
March 12, 2006
pragma wrote:
> Open mouth, insert foot.
> 
> Norbert, all,
> Please disregard my last post and critique regarding rectangular arrays.  It
> seems that I need to read the docs a little more carefully next time as D does
> indeed support them. :(

I think you should take that foot out of your mouth again. The questions in the last message were not that pointless after all:

Indeed the documentation talks about something called "rectangular arrays". However this concept only exists for static arrays, i.e. for arrays, where the size of all dimensions is known at compile time.

What is seriously missing are dynamic rectangular arrays.

Your first question, why these are seriously missing can be answered in several ways

* reducing computational and storage overhead
using "dynamical arrays of dynamical arrays" means that every access to
the data takes two dereferencing operations. Furthermore all the rows of
the arrays have to be allocated separately, contributing to memory
fragmentation.

* greatly enhancing the comfort of handling NDim data
especially in the field of numerical computation -- which is a core part
not only of scientific computing, but also also of game programming,
image processing and many other fields of programming -- handling
NDim-data is the daily bread of the programmer. It may not be an issue
to every programmer, but there certainly is a huge group of potential D
users who will consider NDim-array handling a crucial language feature.

* preparing the path towards vectorized expressions
this goes beyond "reducing overhead" towards "enabling tremendous
optimization potential".

The first two points might be solvable by an array library, but I believe that NDim arrays are such a core feature that they deserve native support. When we last discussed this issue two years ago, most people -- including Walter -- agreed that this should eventually become part of the language, even though it did not get highest priority.

The "Future plans" page of the D homepage contains the point "Array literal expressions" (which is, what I generalized under the term "vectorized expressions") -- I believe that the issue of multidimensional arrays should be solved first, before touching the issue of array expressions. Both depend strongly no each other. Doing one step without thinking of the others will make it a lot more difficult in the end.

> 
> - EricAnderton at yahoo

pragma wrote:
> Perhaps I'm missing the boat, or maybe its just that long since you
posted the
> original NDim array proposal, but what were the advantages to adding
NDim arrays
> in D?
>
> The reason why I ask is that aside from using rectangular arrays for the purposes of reducing computational and storage overhead, for very
specific tasks
> (bitmap manipulation and matrices come to mind), the current soluiton
in D works
> quite well.  After all, it doesn't get any more simple, reliable and
easy to
> implement than limiting every array to exactly one dimension.  It's
also quite
> elegant in that by disallowing rectangular arrays, the problems with
wielding
> all of the variations in slicing and indexing become far less
bothersome: this
> last fact is unintentionally implied by your most recent post.
>
> I'm genuinely curious since you've obviously put a lot of effort
behind all
> this, and I'm very much in the dark as to why.  Could you please
enlighten me?
> :)
>
> Thanks,
>
>
> - EricAnderton at yahoo
March 12, 2006
Norbert Nemec wrote:
> pragma wrote:
> 
>>Open mouth, insert foot.
>>
>>Norbert, all, Please disregard my last post and critique regarding rectangular arrays.  It
>>seems that I need to read the docs a little more carefully next time as D does
>>indeed support them. :(
> 
> 
> I think you should take that foot out of your mouth again. The questions
> in the last message were not that pointless after all:
> 
> Indeed the documentation talks about something called "rectangular
> arrays". However this concept only exists for static arrays, i.e. for
> arrays, where the size of all dimensions is known at compile time.
> 
> What is seriously missing are dynamic rectangular arrays.

I am badly missing this. A feature too fundamental to be missing (imo).