Negative array indices? (page 3)

May 11, 2004

Re: Negative array indices?

Posted by Harvey Stroud
in reply to Norbert Nemec

----- Original Message ----- 
From: "Norbert Nemec" <Norbert.Nemec@gmx.de>
Newsgroups: digitalmars.D
Sent: Monday, May 10, 2004 10:48 AM
Subject: Re: Negative array indices?

> Harvey Stroud wrote:
>
> > It seems to me that supporting the negative array index of C for the sake
> > of backward compatibility goes against the design philosophy for D...
>
> For pointers, negative indices actually make sense. If you allow indexing of
> raw pointers (which I think is a good idea) then prohibiting negative
> indices would be strange. For arrays, negative indices are, of course,
> caught by the range checking mechanism.

I think I should have read the language spec more before posting, as I was
assuming from the following that -ve indices were valid for arrays:

  "The reason it's a very bad idea is that array subscripting
  in C and C++ and D can be done with signed integers because it is legal
  _and meaningful_ to pass a -ve subscript to mean prior to the given base
  (pointer and/or array)."

Of course, this isn't quite the case with arrays, as runtime bounds checking won't allow it, although I'm not sure whether switching off that mechanism via a compiler switch would circumvent this.

I can see why giving -ve indices a different behaviour would impose a (slight) overhead at runtime. That overhead is already present with bounds checking, but at least the latter is optional and may be compiled out. With -ve indexing implying different semantics, the check would always have to remain, unless it were only allowed for (compile-time detectable) literals, which is bad as it wouldn't be orthogonal.
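To make the cost argument concrete, here is a minimal sketch in D. The helper `wrapIndex` is hypothetical (it is not a language or library feature); it only models the "negative means counted from the end" reading discussed above, and assumes the index is in range after wrapping.

```d
import std.stdio : writeln;

// Hypothetical helper, for illustration only: treats a negative index as
// an offset from the end of the array. The sign test is the extra work
// that would have to stay even if bounds checking were compiled out.
int wrapIndex(const(int)[] a, ptrdiff_t i)
{
    immutable size_t j = i < 0 ? a.length + i : i;
    return a[j]; // ordinary bounds check, which a compiler switch can remove
}

void main()
{
    int[] a = [10, 20, 30, 40, 50];

    // Raw pointer: a negative subscript simply means "before the base".
    int* p = &a[2];
    assert(p[-1] == 20);
    assert(p[-2] == 10);

    // Array with the hypothetical wrap-around semantics.
    assert(wrapIndex(a, 1) == 20);
    assert(wrapIndex(a, -1) == 50);
    writeln("ok");
}
```

With a literal index the sign is known at compile time, so the branch could be folded away; with a variable index it cannot, which is the non-orthogonality mentioned above.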

>
> Raw pointers, of course, are error prone. Anyway it's the philosophy of D to
> give the developer all the tools to shoot himself in the foot, but make it
> clear what the dangerous tools are, and encourage him to avoid these tools
> completely.

Yup, I completely agree. If the programmer still wants the raw power of pointers then let them have it. Btw, I wasn't suggesting that -ve indexing for such pointers should be prohibited - that would just be wacky.

>
> > Introducing a special operator ($) to denote the length strikes me as
> > ungainly, making the code more perl-like, but perhaps that's just my
> > dislike of non-C symbols.
>
> That's just personal taste. $ has no meaning in D so far, and it is a plain
> ASCII character. Why not put it to use?
>

Agreed, just my preference. I think what I don't like about it is that it's an arbitrary symbol denoting some magic value. To the uninitiated it looks odd. Ok, it wouldn't take long to get used to, but still, it seems a step in the direction Perl has taken in using such arbitrary symbols, and look how unreadable that is. Probably a very minor point though.
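For readers comparing notations: in D as it stands today, `$` inside index or slice brackets denotes the length of the array being indexed. A minimal sketch of how that reads (the function names `lastOf` and `tailOf` are made up for illustration, and both assume a non-empty array):

```d
import std.stdio : writeln;

// Last element: `$` stands for a.length inside the brackets.
int lastOf(const(int)[] a)
{
    return a[$ - 1];
}

// Everything but the first element, as a slice of the same memory.
const(int)[] tailOf(const(int)[] a)
{
    return a[1 .. $];
}

void main()
{
    int[] a = [10, 20, 30, 40];
    assert(lastOf(a) == 40);
    assert(tailOf(a) == [20, 30, 40]);
    assert(a[$ - 1] == a[a.length - 1]); // the two spellings are equivalent
    writeln("ok");
}
```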

> B.t.w: in the suggested meaning, $ would not be a normal operator at all, but something special that does not exist in D so far: a "zero-ary operator" or however you want to call it.
>
> > Has anybody given any thought to an [optional] stride value:
> >
> > int[] x = a[1..10 : 2];    // Gets every other element of the array
>
> See my multidimension array proposal at
>
> http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
>
> it contains strided slices and much more.
>

Wow, there's a lot to chew over in that doc!  I've only had chance to skim it so far but it looks like a lot of good thought's gone into it. I really like the notation of the indices being within the same set of brackets a[m,n] for rectangular arrays as this suggests a tighter coupling of the array elements than the a[n][m] notation for dynamic arrays; both notations are appropriate to reflect the underlying nature of the data types. I look forward to seeing your next draft.
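As a point of comparison for the stride idea, one possible reading of `a[1..10 : 2]` can be spelled with library code. This is only a sketch: `stridedCopy` is a hypothetical helper, it uses `stride` from `std.range`, and unlike a true strided slice it copies the selected elements into a new array.

```d
import std.array : array;
import std.range : stride;
import std.stdio : writeln;

// Hypothetical helper: copies every `step`-th element of a[lo .. hi]
// into a freshly allocated array. `step` must be at least 1.
int[] stridedCopy(int[] a, size_t lo, size_t hi, size_t step)
{
    return a[lo .. hi].stride(step).array;
}

void main()
{
    int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // Indices 1, 3, 5, 7, 9: the slice bounds are half-open.
    int[] x = stridedCopy(a, 1, 10, 2);
    assert(x == [1, 3, 5, 7, 9]);

    // The result is a copy, so writing to it leaves `a` untouched.
    x[0] = 100;
    assert(a[1] == 1);
    writeln("ok");
}
```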

Cheers,
Harvey.

Harvey Stroud wrote:

> ----- Original Message -----
> From: "Norbert Nemec" <Norbert.Nemec@gmx.de>
>>
>> See my multidimension array proposal at
>>
>> http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
>>
>> it contains strided slices and much more.
>
> Wow, there's a lot to chew over in that doc! I've only had chance to skim
> it so far but it looks like a lot of good thought's gone into it. I really
> like the notation of the indices being within the same set of brackets
> a[m,n] for rectangular arrays as this suggests a tighter coupling of the
> array elements than the a[n][m] notation for dynamic arrays; both notations
> are appropriate to reflect the underlying nature of the data types. I look
> forward to seeing your next draft.

Thanks. The basic idea is still rather simple, but explaining it in detail really took more effort than I myself would have expected.

I would suggest waiting for the next version of the proposal before reading it in detail. I have a number of changes to make already, and running it through a spellchecker might also improve readability...

Norbert Nemec wrote:

> Stewart Gordon wrote:
<snip>
> The current situation is: dynamic arrays actually are references to the
> heap. Two arrays may reference the same portion of the heap, so changing
> one will change the other. Anyhow, the language does everything to obscure
> this fact and make it rather hard to predict when it happens. Unless you
> really know the details, you will often call .dup without need,

If you want to guarantee that it's a separate copy, of course you'd call dup. Of course, a decent compiler would coalesce two statements

    int[] qwert = yuiop.dup;
    qwert.length = asdfg;

into a single allocation operation.

> and, in the other way, you will have trouble if you trust that two arrays
> refer to the same space.

To which someone might say, "Don't do that then!"

At the moment I can see little use for wanting to access one array by what's effectively another, longer array.

<snip>
> Conversion to Fortran arrays is already trivially possible. (A few
> convenience functions might make it even more comfortable.) Everything else
> should be easy to implement.
<snip>

True, if the strides remain those for a new array. But if you've been playing with strided/block/diagonal slicing, then unless Fortran arrays support striding on this level, you'd need to do some rearrangement.

Stewart.

--
My e-mail is valid but not my primary mailbox, aside from its being the unfortunate victim of intensive mail-bombing at the moment. Please keep replies on the 'group where everyone may benefit.
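The sharing behaviour under discussion can be sketched as follows. The helper `overlaps` is hypothetical and assumes non-empty arrays; the variable names follow the example in the post.

```d
import std.stdio : writeln;

// Hypothetical helper: true when the two arrays refer to overlapping
// memory, judged by comparing their address ranges.
bool overlaps(const(int)[] a, const(int)[] b)
{
    return a.ptr < b.ptr + b.length && b.ptr < a.ptr + a.length;
}

void main()
{
    int[] yuiop = [1, 2, 3, 4];

    // A slice is a second reference to the same heap memory.
    int[] view = yuiop[1 .. 3];
    assert(overlaps(yuiop, view));
    view[0] = 99;
    assert(yuiop[1] == 99); // the change is visible through both

    // .dup guarantees a separate copy, which may then be resized freely.
    int[] qwert = yuiop.dup;
    qwert.length = 6;
    assert(!overlaps(yuiop, qwert));
    qwert[0] = -1;
    assert(yuiop[0] == 1);
    writeln("ok");
}
```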

Stewart Gordon wrote:

> Norbert Nemec wrote:
>
>> and, in the other way, you will have trouble if you trust that two arrays
>> refer to the same space.
>
> To which someone might say, "Don't do that then!"
>
> At the moment I can see little use for wanting to access one array by
> what's effectively another, longer array.

For strings, it might not be very useful. For arrays in general, though, there are many cases where it really is extremely useful. Imagine a 1GB array in memory, maybe representing a huge multidimensional matrix or whatever. You would really want to be able to handle multiple references to portions of that data in a comfortable way without the risk of suddenly getting a copy unintentionally.

> <snip>
>> Conversion to Fortran arrays is already trivially possible. (A few
>> convenience functions might make it even more comfortable.) Everything
>> else should be easy to implement.
> <snip>
>
> True, if the strides remain those for a new array. But if you've been
> playing with strided/block/diagonal slicing, then unless Fortran arrays
> support striding on this level, you'd need to do some rearrangement.

Of course. If a given Fortran routine expects data aligned in memory in a given way, you might need to copy the data to that alignment before passing a reference to the Fortran routine.

Anyhow: if D is able to handle arrays in arbitrary alignment and striding, you may often be able to handle the data in Fortran alignment for a long time without any conversion being necessary.

Example: get an array from Fortran, use a D-library function on it, pass it back to Fortran. No conversion is necessary, because the D library can easily handle the array no matter how it is aligned in memory: the alignment information is fully enclosed in the array reference, with minimal (with good optimization: negligible) overhead in terms of access time.

Furthermore: when writing a wrapper for a Fortran library, the wrapper can do all necessary conversions automatically, without doing any unnecessary conversions back and forth.
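A rough sketch of what "alignment information enclosed in the array reference" could look like, written as an ordinary D struct. `StridedView` and `rowOf` are hypothetical names invented here, not part of the proposal or of any library; the example views one row of a matrix stored in Fortran (column-major) order without copying it.

```d
import std.stdio : writeln;

// Hypothetical type, for illustration: a one-dimensional view that
// carries its own stride, so the data never has to be rearranged.
struct StridedView
{
    double* base;  // first element of the view
    size_t length; // number of elements visible through the view
    size_t stride; // distance between consecutive elements, in elements

    ref double opIndex(size_t i)
    {
        assert(i < length, "index out of range");
        return base[i * stride];
    }
}

// Row `r` of a rows x cols matrix stored column-major (Fortran order),
// where element (r, c) lives at data[r + c * rows].
StridedView rowOf(double[] data, size_t rows, size_t cols, size_t r)
{
    assert(data.length == rows * cols && r < rows);
    return StridedView(&data[r], cols, rows);
}

void main()
{
    // The 2 x 3 matrix [[1, 2, 3], [4, 5, 6]] in column-major order.
    double[] m = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0];

    auto row1 = rowOf(m, 2, 3, 1);
    assert(row1[0] == 4.0 && row1[1] == 5.0 && row1[2] == 6.0);

    row1[2] = 60.0; // writes through to the original storage
    assert(m[5] == 60.0);
    writeln("ok");
}
```

The per-access cost is one multiplication by the stride, which is the small overhead referred to above.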

Norbert Nemec wrote:
<snip>
> For strings, it might not be very useful. For arrays in general, though,
> there are many cases where it really is extremely useful. Imagine a 1GB
> array in memory, maybe representing a huge multidimensional matrix or
> whatever. You would really want to be able to handle multiple references to
> portions of that data in a comfortable way without the risk of suddenly
> getting a copy unintentionally.
<snip>

You can, if you allocate the matrix first and then start creating windows of it. Slice references don't unintentionally turn into copies. (Of course, I'm not sure what happens if you increase the length of a slice reference, but if that's an issue you'd avoid it anyway for this kind of work.) As long as the matrix doesn't grow, you're safe.

If the matrix wants to be variable in size, you can still treat it as being one size (a reasonable maximum, whatever that may be) for allocation purposes. Of course, if no maximum is reasonable, or you bump into an unreasonable circumstance, you'd need to deal with reallocation whether the .length property is there and assignable or not.

Stewart.

--
My e-mail is valid but not my primary mailbox, aside from its being the unfortunate victim of intensive mail-bombing at the moment. Please keep replies on the 'group where everyone may benefit.

Stewart Gordon wrote:

> Norbert Nemec wrote:
>
> <snip>
>> For strings, it might not be very useful. For arrays in general, though,
>> there are many cases where it really is extremely useful. Imagine a 1GB
>> array in memory, maybe representing a huge multidimensional matrix or
>> whatever. You would really want to be able to handle multiple references
>> to portions of that data in a comfortable way without the risk of suddenly
>> getting a copy unintentionally.
> <snip>
>
> You can, if you allocate the matrix first and then start creating
> windows of it. Slice references don't unintentionally turn into copies.
> (Of course, I'm not sure what happens if you increase the length of a
> slice reference, but if that's an issue you'd avoid it anyway for this
> kind of work.) As long as the matrix doesn't grow, you're safe.

I guess it is just a question of documenting clearly what happens. It should be absolutely clear which operations might copy data. By now, I have even been convinced to cut the paragraph about making .length read-only. Anyhow: it definitely has to be documented in which way it works, what exactly .dup does, etc.

> If the matrix wants to be variable in size, you can still treat it as being
> one size (a reasonable maximum, whatever that may be) for allocation
> purposes. Of course, if no maximum is reasonable, or you bump into an
> unreasonable circumstance, you'd need to deal with reallocation whether the
> .length property is there and assignable or not.

OK. I guess I'll just accept that the behaviour of upsizing by assigning to .length is not predictable if you don't know where the array reference came from.

B.t.w.: assigning to range[] in my multidimensional arrays is even more tricky, since you have to consider the full shape to see whether upsizing in place might be possible. I'm still not sure whether it might be best to allow assignment to length in one-dimensional arrays, but leave the range property read-only.
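The open question about growing a slice can be probed with a small experiment. This sketch assumes a present-day D compiler and runtime; compilers from the time of this thread may have behaved differently. `sameStart` is a hypothetical helper.

```d
import std.stdio : writeln;

// Hypothetical helper: true when both arrays begin at the same address.
bool sameStart(const(int)[] a, const(int)[] b)
{
    return a.ptr is b.ptr;
}

void main()
{
    int[] matrix = new int[8];

    // A window onto the first half: no copy is made.
    int[] window = matrix[0 .. 4];
    assert(sameStart(matrix, window));

    // Growing the window: it ends before the end of the data in use, so
    // extending it in place would overlap matrix[4 .. 8]. The runtime
    // allocates new storage and copies the elements instead.
    window.length = 6;
    assert(!sameStart(matrix, window));

    // From here on the two no longer share data.
    window[0] = 7;
    assert(matrix[0] == 0);
    writeln("ok");
}
```

A slice that ends exactly where the used part of its memory block ends may instead be extended in place when spare capacity is available, so whether a copy happens does depend on where the reference came from.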
