Arrays, Slices, Cases

February 22, 2002

Posted by Karl Bochert

Having too much time on my hands, I submit the following summary of my viewpoint.  At the very least it shows how I will rationalize the method that D uses.

I would welcome any other rationale that would make D's choices seem more natural.

(After 20 yrs of C, I still have fencepost problems! :-)

Concerning Arrays, slices, and cases.

1) Ordinal Arrays
     Arrays are ordered sets of elements accessed by their position
     in the set.
     An array with N elements has a first element and an
     N'th element.
     A slice of the entire array is arr[1..N].
     A slice excluding the ends is arr[2..N-1].

  end-inclusion:
      a slice can be thought of as the result of a procedure
      that (somehow) extracts the range, similar to:
          result = slice (&arr[start], &arr[end]);
      Obviously, end is included.

      a slice can be thought of as the result of a loop that
      'extracts' elements of the array, similar to:
         for (i = start; i <= end; ++i)  result[1+i-start] = arr[i];
      Obviously, like the for loop,  'end' is included.
      Cases are also end-inclusive:
      case [1]:
      case [2 to 4]:
      case [5]:

  end-exclusion:
      Why would anyone do that?

-----------

2) Cardinal Arrays
    Arrays are chunks of storage accessed by the offset from
    their start.
    An array with N elements has a zero'th element and an N-1'th
    element.
    A slice of the entire array is arr[0..N-1] (end-inclusive) or arr[0..N] (end-exclusive).

    end-inclusion:
      A slice excluding the ends is arr[1..N-2].
      a slice can be thought of as the result of a procedure
      that (somehow) extracts the range, similar to:
          result = slice (&arr[start], &arr[end]);
      Obviously, end is included.
      Cases are also end-inclusive:
      case [1]:
      case [2 to 4]:
      case [5]:


    end-exclusion:
      A slice excluding the ends is arr[1..N-1].
      a slice can be thought of as the result of a loop that
      'extracts' elements of the array, similar to:
         for (i = start; i < end; ++i) result[i-start] = arr[i];
      Obviously, like the for loop, end is excluded.
      Cases are also end-exclusive:
         case [1]:
         case [2 to 5]:
         case [5]:
      Or cases stay end-inclusive (different from slices):
         case [1]:
         case [2 to 4]:
         case [5]:


D is:
  Cardinal arrays, end-exclusive, case-inclusive?

I am for:
  Ordinal arrays, end-inclusive, case-inclusive.
  Simpler and more consistent.
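
The three conventions above can be compared side by side. What follows is only a minimal sketch in Python, used here as a stand-in because its built-in slices are zero-based and end-exclusive; the three helper names are hypothetical and do not belong to D or to any library.

```python
def slice_ordinal_inclusive(arr, start, end):
    """1-based positions, 'end' included: arr[1..N] is the whole array."""
    return [arr[i - 1] for i in range(start, end + 1)]


def slice_cardinal_inclusive(arr, start, end):
    """0-based offsets, 'end' included: arr[0..N-1] is the whole array."""
    return [arr[i] for i in range(start, end + 1)]


def slice_cardinal_exclusive(arr, start, end):
    """0-based offsets, 'end' excluded: arr[0..N] is the whole array."""
    return [arr[i] for i in range(start, end)]


letters = ["a", "b", "c", "d", "e"]
n = len(letters)

# The whole array, written under each convention.
whole = [
    slice_ordinal_inclusive(letters, 1, n),
    slice_cardinal_inclusive(letters, 0, n - 1),
    slice_cardinal_exclusive(letters, 0, n),
]

# The array without its two end elements, written under each convention.
interior = [
    slice_ordinal_inclusive(letters, 2, n - 1),
    slice_cardinal_inclusive(letters, 1, n - 2),
    slice_cardinal_exclusive(letters, 1, n - 1),
]
```

All three entries of `whole` are the full list and all three entries of `interior` are the middle three letters; only the bounds the programmer has to write differ.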

Karl Bochert

February 22, 2002

Re: Arrays, Slices, Cases

Posted by not here
in reply to Karl Bochert

On Fri, 22 Feb 2002 00:43:28 GMT, Karl Bochert <kbochert@ix.netcom.com> wrote:
> 
> Having too much time on my hands, I submit the following summary of my viewpoint.  At the very least it shows how I will rationalize the method that D uses.
> 

[big snip]

> 
> D is:
>   Cardinal arrays, end-exclusive, case-inclusive?
> 
> I am for:
>   Ordinal arrays, end-inclusive, case-inclusive.
>   Simpler and more consistent.

Hi Karl,
Whether one uses an index or an offset to reference an array element is often influenced by what we have already been exposed to. However, I'd like to approach the issue in a different manner.

To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and ask them to number them, the person would normally start with the number 1.

An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.

I believe that programming languages have a primary aim of helping people describe their algorithms. In other words, programming languages are for people and not computers - that's why we have compilers. So, I would hold that 1-based array referencing is the normal way for people to describe what they are trying to do.

Furthermore, an index has the connotation that the entire element is being referenced, whereas an offset is better thought of as referencing the start of an element. Thus a slice reference of, say, [2..4] seems to say to me that the slice encompasses element#2, element#3, and element#4. That is, the whole of each of these elements. The fact that the length of this slice is 3 is obvious because all of the elements are being referenced.

If we were using offsets in slice notation, then [2..4] would be saying that the slice starts from the start of element #3 and ends at the start of element #5. This represents all of element#3 and all of element#4, but not any of element#5, thus has a length of 2. But this is not how people normally view the world.

I vote with Karl on this one.  Besides, calculating the length of an index notation slice is not beyond us, especially if we can do "  myArray[x..y].length "
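
The two readings of [2..4] and their lengths can be checked mechanically. This is a small sketch in Python, whose native slices follow the offset reading; the function names `length_index_notation` and `length_offset_notation` are made up for illustration.

```python
def length_index_notation(first, last):
    """Length of a 1-based, end-inclusive slice [first..last]."""
    return last - first + 1


def length_offset_notation(start, end):
    """Length of a 0-based, end-exclusive slice [start..end]."""
    return end - start


items = ["e1", "e2", "e3", "e4", "e5"]

# Index reading of [2..4]: element #2, element #3 and element #4.
by_index = items[2 - 1:4]

# Offset reading of [2..4]: from the start of element #3
# up to (but not including) the start of element #5.
by_offset = items[2:4]
```

Here `by_index` holds three elements and `by_offset` holds two, matching the lengths of 3 and 2 described above.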

Now consider the way we might remove an element from a dynamic array.

Given that 'pos' references the element to be removed...

using Index Notation
   A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length]

using Offset Notation
   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]

Not a lot of difference really. Personally, I find that the index notation is more clearly telling the reader that I am trying to exclude the 'pos' element but include everything else.

-------
cheers.

February 22, 2002

Re: Arrays, Slices, Cases

Posted by Karl Bochert
in reply to not here

> Hi Karl,
> ...
> To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you
> go up to somebody with a list of items and asked them to number them, the person would normally
> start with the number 1.
> 
> An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.
> 
Exactly - it's an issue of computer arrays vs. human arrays.

> 
> Furthermore, an index has the connotation that the entire element is being
> referenced, whereas an  offset is better thought of referencing the
> start of an element.
> 
Hadn't occurred to me, but you are right.

> ...
>using Index Notation
>   A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length]
>
>using Offset Notation
>   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]

Didn't you get that wrong?

using Offset Notation (exclusive)
    A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

using Offset Notation (inclusive)
    A1 = A1[0..pos-1] ~ A1[pos+1 .. A1.length-1]

(I think that's right -- I had to draw little boxes
on a sheet of graph paper)
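
For anyone who would rather not draw boxes, the removal expressions can be tried out directly. This is a hedged sketch in Python: the `+` operator plays the role of `~`, the inclusive and 1-based slices are emulated by the hypothetical helpers `slice_inclusive` and `slice_index`, and none of it is D code.

```python
def remove_offset_exclusive(a, pos):
    """A1[0..pos] ~ A1[pos+1 .. A1.length], 0-based, end excluded."""
    return a[0:pos] + a[pos + 1:len(a)]


def slice_inclusive(a, first, last):
    """Emulated 0-based slice that includes 'last'."""
    return a[first:last + 1]


def remove_offset_inclusive(a, pos):
    """A1[0..pos-1] ~ A1[pos+1 .. A1.length-1], 0-based, end included."""
    return slice_inclusive(a, 0, pos - 1) + slice_inclusive(a, pos + 1, len(a) - 1)


def slice_index(a, first, last):
    """Emulated 1-based slice that includes 'last'."""
    return a[first - 1:last]


def remove_index_inclusive(a, pos):
    """A1[1..pos-1] ~ A1[pos+1 .. A1.length], 1-based, end included."""
    return slice_index(a, 1, pos - 1) + slice_index(a, pos + 1, len(a))


boxes = [10, 20, 30, 40, 50]

# Removing the value 30: it sits at offset 2, which is index 3.
without_30 = [
    remove_offset_exclusive(boxes, 2),
    remove_offset_inclusive(boxes, 2),
    remove_index_inclusive(boxes, 3),
]
```

All three calls produce the same four-element list, so the conventions differ only in which bounds look natural to the person writing them.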

I have this sneaky feeling that the reason D uses offsets instead of
indexes is to be backward-compatible with C, and therefore
more familiar.


Karl Bochert

February 22, 2002

Re: Arrays, Slices, Cases

Posted by not here
in reply to Karl Bochert

On Fri, 22 Feb 2002 05:49:50 GMT, Karl Bochert <kbochert@ix.netcom.com> wrote:
> > Hi Karl,
> > ...
> > To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and asked them to number them, the person would normally start with the number 1.
> > 
> > An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.
> > 
> Exactly - it's an issue of computer arrays vs. human arrays.
> 

I really think that too many language designers forget that it's people who actually have to use their languages, not computers. The "user-interface" for most programming languages is sub-optimal. Often the language encourages hard-to-comprehend syntax, thus making it easier for people to make mistakes.

> > ...
> >using Index Notation
> >   A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length]
> >
> >using Offset Notation
> >   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]
> 
> Didn't you get that wrong?
> 
> using Offset Notation (exclusive)
>     A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]
> 
> using Offset Notation (inclusive)
>     A1 = A1[0..pos-1] ~ A1[pos+1 .. A1.length-1]
> 
> (I think that's right -- I had to draw little boxes
> on a sheet of graph paper)

Ooops. You are right. I did get the 'offset' code wrong. That might be an example of its inherently non-user-friendly interface ;-)

    A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

is what I should have coded. To the average person, knowing that 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!

>  I have this sneaky feeling that the reason D uses offsets instead of
> indexes is to be backward-compatible with C, and therefore
> more familiar.
> 

More familiar to whom? C/C++ coders? One would have hoped that D might be used as a replacement for C/C++, so that newbies can learn a "better" language that does not have to be backward compatible. Also, reading the D Overview, we find under the things to drop from C/C++:

"C source code compatibility. Extensions to C that maintain source compatibility have already been
done (C++ and ObjectiveC). Further work in this area is hampered by so much legacy code it is
unlikely that significant improvements can be made."

----
cheers.

February 22, 2002

Re: Arrays, Slices, Cases

Posted by Sean L. Palmer
in reply to not here

"not here" <not.known@this.address.com> wrote in message news:1103_1014364009@news.digitalmars.com...
> On Fri, 22 Feb 2002 05:49:50 GMT, Karl Bochert <kbochert@ix.netcom.com> wrote:
> > > To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and asked them to number them, the person would normally start with the number 1.
> > >
> > > An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.
> > >
> > Exactly - it's an issue of computer arrays vs. human arrays.

> I really think that too many language designers forget that it's people that have to actually use them, and not computers. The "user-interface" for most programming languages is sub-optimal. Often the language encourages hard-to-comprehend syntax thus making it easier for people to make mistakes.

I don't know what planet you guys are from... go use BASIC or something if you want arrays that start at position 1 instead of 0.

Computer arrays start at 0.  Every programmer needs to learn this right away.  It's very fundamental, and trying to "humanize" it just results in a language that requires suboptimal code generation.

I personally think they should teach people about zero earlier on in school, then we wouldn't have this problem.  How would you like that?  ;)

Sean

February 22, 2002

Re: Arrays, Slices, Cases

Posted by Pavel Minayev
in reply to not here

>     A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]
>
> is what I should have coded. To the average person, knowing the 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!

To the average C/C++/C#/Java programmer, it looks just as it should.

> More familiar to whom? C/C++ coders? One would have hoped that D might be used as a replacement for C/C++ and thus newbies can learn a "better" language and not have to be backward compatible. Also, reading the D Overview we find under the things to drop from C/C++

I believe Walter said that D is not a language for beginners. BASIC, or even Pascal, would be better for this purpose. D is a practical language for practical programmers, and I don't think it's the best idea to sacrifice speed to gain such a subtle simplicity, IMO...

> "C source code compatibility. Extensions to C that maintain source compatibility have already been done (C++ and ObjectiveC). Further work in this area is hampered by so much legacy code it is unlikely that significant improvements can be made."

"Compatibility" means the ability to compile code from that language. That is not what D is for. But there are many programmers who know only C (or C++, or C#, or Java - the same language family), and those people expect to find a familiar environment in which to start coding quickly, without having to learn everything from scratch. Arrays are indexed from 0; every C programmer should remember that better than his own name, so why disappoint them? 0-based indexing is a tradition too old to change. It's better to live with it, especially since it's not hard to get used to.

February 22, 2002

Re: Arrays, Slices, Cases

Posted by Pavel Minayev
in reply to Sean L. Palmer

"Sean L. Palmer" <spalmer@iname.com> wrote in message news:a552bk$17v$1@digitaldaemon.com...

> I personally think they should teach people about zero earlier on in school,
> then we wouldn't have this problem.  How would you like that?  ;)

Great idea! So, I have two cars, the zeroth is red and the first is blue =)

February 22, 2002

Re: Arrays, Slices, Cases

Posted by not here
in reply to Sean L. Palmer

On Fri, 22 Feb 2002 01:18:54 -0800, "Sean L. Palmer" <spalmer@iname.com> wrote:
> 
> I don't know what planet you guys are from... go use BASIC or something if you want arrays that start at position 1 instead of 0.
> 
> Computer arrays start at 0.  Every programmer needs to learn this right away.  It's very fundamental, and trying to "humanize" it just results in a language that requires suboptimal code generation.
> 
> I personally think they should teach people about zero earlier on in school, then we wouldn't have this problem.  How would you like that?  ;)

This sounds a lot like "Well thank you, Ma'am, but quite frankly, that's not how we do things around these parts".

I would have thought that with D, we have a chance to break free of the computer-centric way of doing things and instead design a language that makes life easier for coders at every possible chance. If people all around the world, in all cultures (except, it seems, veteran coders), count off things starting with one, why should we have to "retrain" them to start thinking as if they were a computer?

Yes, I know that computer arrays start at 0. Just like my high school ruler also started at zero. But that first inch is still inch #1 and not inch #0.

If one were truly concerned with suboptimal code generation, we would all still be writing hand-crafted assembler (or even machine code) programs. All we are talking about here is sometimes generating a "subtract one" opcode or similar, and today's computers, let alone tomorrow's, are very, very fast.

Isn't a compiler a tool? A tool for people to use? To make our lives easier? So let our compilers take what is normal for people and convert it for computer usage, rather than having the language make people convert what is normal for them into computer-ese.

-------
cheers.

February 22, 2002

Re: Arrays, Slices, Cases

Posted by not here
in reply to Pavel Minayev

On Fri, 22 Feb 2002 16:09:36 +0300, "Pavel Minayev" <evilone@omen.ru> wrote:
> >     A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]
> >
> > is what I should have coded. To the average person, knowing the 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!
> 
> To the average C/C++/C#/Java programmer, it looks just as it should.

Should I infer, then, that the average D programmer is always going to be an average C/C++/C#/Java programmer too? Is this a short-sighted attitude for the future of D? Can we not expect COBOL coders to come over the fence? If not, why not?

> > More familiar to whom? C/C++ coders? One would have hoped that D might be used as a replacement for C/C++ and thus newbies can learn a "better" language and not have to be backward compatible. Also, reading the D Overview we find under the things to drop from C/C++
> 
> I believe Walter said that D is not a language for the beginners. BASIC, or even Pascal would be better for this purpose.

I assume that Walter is referring to people who are just learning to program. I would have thought that the fewer new things people have to learn, the sooner they can become productive. If this is so, then it would appear that a design goal for D is to assume newcomers to D will be existing C/etc coders, so they don't have to learn too many new things. Oh well, maybe we are condemned to repeat history.

> D is a practical language
> for practical programmers,

...meaning that Basic and Pascal are NOT practical languages, and their users are NOT practical? Ummm. Sounds a little xenophobic to me.

> and I don't think it's the best idea to
> sacrifice speed to gain such a subtle simplicity, IMO...

Why is it that we spend hours of coding time to optimise a few micro-seconds out of a program? We no longer live in the age when computer time is more expensive than people time. It seems you are willing to sacrifice coders' time rather than computer time. I don't think it's the best idea to sacrifice coding speed, IMO.

> > "C source code compatibility. Extensions to C that maintain source compatibility have already been done (C++ and ObjectiveC). Further work in this area is hampered by so much legacy code it is unlikely that significant improvements can be made."
> 
> "compatibility" means ability to compile code from that language. This is what D is not for. But there are many programmers that know only C (or C++, or C#, or Java - the same language family) - and those people expect to find a common environment to start coding quick, without having to learn everything from scratch. Arrays are indexed from 0, every C programmer should remember that better than his own name - why disappoint them?

Heaven forbid that we should try to retrain C coders! Everyone knows that we are sacrosanct and must be protected.
Every good C programmer knows how useful the macro preprocessor is (oops, that's not in D is it?)
Every good C++ programmer knows how useful multiple inheritance can be (ooops, that's not in D is it?)
Every good C++/Java/C# programmer knows how useful namespaces can be (ooops, that's not in D is it?)
Every C programmer can type #include files in their sleep (ooops, that's not in D is it?)

Yes, I know these are a little unfair. But what I'm trying to get across is that D will already force C coders to learn/unlearn things. So why not have 1-based indexes, just like we use in the real world?

> 0-based indexing is a tradition
> too old to change it - it's better to live with it, especially
> since it's not hard to get used to it.

Hey we got tradition! You can't mess that baby. Sure it makes things a bit harder but you'll soon get used to that.

Is this the same as saying "We can't do that new thing because it's not what we currently do"?

I could just as equally say "1-based indexing is not hard to get used to, seeing you already do it everywhere else except when you are thinking like a computer."

-----
cheers

February 22, 2002

Re: Arrays, Slices, Cases

Posted by Roberto Mariottini
in reply to Pavel Minayev

"Pavel Minayev" <evilone@omen.ru> wrote in message news:a55g08$7e8$1@digitaldaemon.com...
> >     A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]
> >
> > is what I should have coded. To the average person, knowing the 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!
>
> To the average C/C++/C#/Java programmer, it looks just as it should.

So I must not be an "average" C/C++/Java programmer, having only used them
since 1991/93/96.
I know how slicing currently works in D, but I had to double-check to
understand it.

[...]
> But there are many programmers that know
> only C (or C++, or C#, or Java - the same language family) - and
> those people expect to find a common environment to start coding
> quick, without having to learn everything from scratch. Arrays are
> indexed from 0, every C programmer should remember that better than
> his own name - why disappoint them? 0-based indexing is a tradition
> too old to change it - it's better to live with it, especially
> since it's not hard to get used to it.

I always wondered why C and its derivatives don't have a way to define a start
index like Pascal does. To me it seems better to leave to the compiler the
task of subtracting the start index from the actual index. In C you write:

int occurrencies['Z'-'A'+1] = {0};  /* one counter per letter 'A'..'Z' */
for (i = 0; i < size; ++i)
{
    ++occurrencies[s[i]-'A'];  /* assumes s[i] is an uppercase letter */
}

Here the task of subtracting 'A' on every index operation is left to the programmer.

Maybe the compiler could live with an optional initial index to subtract
every time the array is accessed.
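
The idea of an array that carries its own start index can be sketched as a small wrapper. This is an illustration in Python, not a proposal for D syntax; `BasedArray` and `count_letters` are hypothetical names, and the bounds check is an added assumption rather than something the post asks for.

```python
class BasedArray:
    """Fixed-size array whose first valid index is 'low' instead of 0."""

    def __init__(self, low, high, fill=0):
        self.low = low
        self.high = high
        self._items = [fill] * (high - low + 1)

    def _offset(self, index):
        # The single place where the start index is subtracted.
        if not self.low <= index <= self.high:
            raise IndexError(f"index {index} outside {self.low}..{self.high}")
        return index - self.low

    def __getitem__(self, index):
        return self._items[self._offset(index)]

    def __setitem__(self, index, value):
        self._items[self._offset(index)] = value


def count_letters(text):
    """Count uppercase letters, indexing the counters by character code."""
    occurrences = BasedArray(ord("A"), ord("Z"))
    for ch in text:
        if "A" <= ch <= "Z":
            occurrences[ord(ch)] += 1
    return occurrences


counts = count_letters("BANANA")
```

With this wrapper `counts[ord("A")]` reads the counter for 'A' directly, and the subtraction of the start index happens in one place instead of at every use.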

Ciao
