Thread overview
Char[] confusing
Mar 02, 2009  Qian Xu
Mar 02, 2009  Lutger
Mar 02, 2009  Qian Xu
Mar 03, 2009  grauzone
Mar 03, 2009  Rainer Deyke
Mar 03, 2009  Robert Fraser
Mar 03, 2009  bearophile
Mar 06, 2009  Stewart Gordon
Mar 02, 2009  BCS
Mar 02, 2009  Denis Koroskin
March 02, 2009
Hi,

I am confused about getting a substring of a char[].

------------------------- code ---------------------------------
module main;

import tango.io.Console;
import tango.text.convert.Integer;

void main()
{
  char[] s = "ABCDE"; // 5 chars
  int len = s.length;
  Cout("s='" ~ s ~ "', length=" ~ toString(len)).newline;
  Cout("s[" ~ toString(len-1) ~ "]= " ~ s[len-1]).newline;
  Cout("s[0 .. " ~ toString(len-1) ~ "]= " ~ s[0 .. len-1]).newline;
  Cout("s[0 .. " ~ toString(len) ~ "]= " ~ s[0 .. len]).newline;
  Cout("s[1 .. " ~ toString(len-1) ~ "]= " ~ s[1 .. len-1]).newline;
  Cout("s[1 .. " ~ toString(len) ~ "]= " ~ s[1 .. len]).newline;
}
------------------------- code ---------------------------------

The result (dmd on Windows XP) is:

s='ABCDE', length=5
s[4]= E
s[0 .. 4]= ABCD
s[0 .. 5]= ABCDE
s[1 .. 4]= BCD
s[1 .. 5]= BCDE

-------------------------------------------------------

My question is: why is s[4]=E, but s[0..4]=ABCD (without the E)?

-- 
Xu, Qian (stanleyxu)
 http://stanleyxu2005.blogspot.com
March 02, 2009
Qian Xu wrote:

> Hi,
> 
> I am confused about getting a substring of a char[].
> 
> [..]
> 
> My question is: why is s[4]=E, but s[0..4]=ABCD (without the E)?

s[4] means the fifth element of s[]
s[0..4] is a slice from the first element up to, but not including, the
fifth element. The upper bound of a slice is always one past the last
element in that slice.
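
To make the difference between indexing and slicing concrete, here is a minimal sketch. It assumes D2 with Phobos (the original post uses D1 with Tango), so `string` and `writeln` are substitutions, not the poster's code.

```d
import std.stdio : writeln;

void main()
{
    string s = "ABCDE"; // 5 chars, valid indexes are 0, 1, 2, 3, 4

    assert(s[4] == 'E');           // indexing: the element AT index 4
    assert(s[0 .. 4] == "ABCD");   // slicing: indexes 0, 1, 2, 3; index 4 is excluded
    assert(s[0 .. 5] == "ABCDE");  // the upper bound may equal s.length
    assert(s[0 .. 4].length == 4); // upper bound minus lower bound

    writeln(s[0 .. 4]); // prints ABCD
}
```

So `s[4]` names an element, while the `4` in `s[0 .. 4]` names the point where the slice stops.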

March 02, 2009
Reply to Qian,

> Hi,
> 
> I am confused about getting a substring of a char[].
> 
[..]
> 
> My question is: why is s[4]=E, but s[0..4]=ABCD (without the E)?
> 

Having the first number be included and the second excluded works better than the other options.

Consider what would have to change to make these work under the other options:

arr[0 .. n] and arr[n .. arr.length] cover the full array

arr[0 .. 0] is empty

arr[0 .. arr.length] is the full array
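
The three properties above can be checked directly. This is a minimal sketch assuming D2 (the array contents are made up for illustration). With an inclusive upper bound, each line would need a `- 1` somewhere, and the empty slice would need an upper bound below the lower one.

```d
void main()
{
    int[] arr = [10, 20, 30, 40, 50];

    // arr[0 .. n] and arr[n .. arr.length] meet at n with no gap and no
    // overlap, for every n from 0 up to and including arr.length.
    foreach (n; 0 .. arr.length + 1)
        assert(arr[0 .. n] ~ arr[n .. arr.length] == arr);

    assert(arr[0 .. 0].length == 0);     // empty slice, no special case
    assert(arr[0 .. arr.length] == arr); // full array, no "- 1" needed
}
```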


March 02, 2009
Lutger wrote:
> s[4] means the fifth element of s[]
> s[0..4] is a slice from the first element up to, but not including, the fifth element. The upper bound of a slice is always one past the last element in that slice.
> 

Thank you both.
I have to do the math in my head to keep my code correct ;-)


IMO, it does not make any sense.
s[start..end]

  end - is neither the count of characters in the array slice,
        nor the index of the last character in the slice.
        It is just the index after the last character.

-- 
Xu, Qian (stanleyxu)
 http://stanleyxu2005.blogspot.com
March 02, 2009
On Tue, 03 Mar 2009 02:03:23 +0300, BCS <ao@pathlink.com> wrote:

> Reply to Qian,
>
>> Hi,
>>  I am confused about getting a substring of a char[].
>>
> [..]
>>  My question is: why is s[4]=E, but s[0..4]=ABCD (without the E)?
>>
>
> Having the first number be included and the second excluded works better than the other options.
>
> Consider what would have to change to make these work under the other options:
>
> arr[0 .. n] and arr[n .. arr.length] cover the full array
>
> arr[0 .. 0] is empty
>
> arr[0 .. arr.length] is the full array
>
>

That, and also:

assert(arr[a..b].length == (b - a)); // always true for valid bounds (a <= b <= arr.length)
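
As a rough check of that claim, the sketch below tries every valid pair of bounds on a small array. It assumes D2; the array contents are arbitrary.

```d
void main()
{
    int[] arr = [10, 20, 30, 40, 50];

    // Every valid pair satisfies 0 <= a <= b <= arr.length.
    foreach (a; 0 .. arr.length + 1)
        foreach (b; a .. arr.length + 1)
            assert(arr[a .. b].length == b - a);
}
```

With an inclusive upper bound the length would be `b - a + 1` instead.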

March 03, 2009
Qian Xu wrote:
> Lutger wrote:
>> s[4] means the fifth element of s[]
>> s[0..4] is a slice from the first element up to, but not including, the fifth element. The upper bound of a slice is always one past the last element in that slice.
> 
> Thank you both.
> I have to do the math in my head to keep my code correct ;-)
> 
> 
> IMO, it does not make any sense.
> s[start..end]
> 
>   end - is neither the count of characters in the array slice,
>         nor the index of the last character in the slice.
>         It is just the index after the last character.
> 

Think of it as "everything in the string before this."
March 03, 2009
grauzone wrote:
> Think of it as "everything in the string before this."

I tend to think of indexes as referring to the positions between the characters instead of the characters themselves.

"ABCD" -> 0 'A' 1 'B' 2 'C' 3 'D' 4

's[a..b]' = the elements between positions 'a' and 'b'. 's[a]' = the element to the right of position 'a'.
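
The picture above can be reproduced with a small helper. This is only a sketch, assuming D2 with Phobos; `withBoundaries` is a hypothetical name invented here, and it works on UTF-8 code units, so it is meant for ASCII strings like the one in the example.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper: writes the boundary position before the first
// character and after every character.
string withBoundaries(string s)
{
    string result = "0";
    foreach (i, c; s)
        result ~= " '" ~ c ~ "' " ~ to!string(i + 1);
    return result;
}

void main()
{
    string s = "ABCD";
    writeln(withBoundaries(s)); // prints 0 'A' 1 'B' 2 'C' 3 'D' 4

    assert(s[1 .. 3] == "BC"); // the elements between positions 1 and 3
    assert(s[1] == 'B');       // the element to the right of position 1
}
```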


-- 
Rainer Deyke - rainerd@eldwood.com
March 03, 2009
Qian Xu wrote:
> Lutger wrote:
>> s[4] means the fifth element of s[]
>> s[0..4] is a slice from the first element up to, but not including, the fifth element. The upper bound of a slice is always one past the last element in that slice.
> 
> Thank you both.
> I have to do the math in my head to keep my code correct ;-)
> 
> 
> IMO, it does not make any sense.
> s[start..end]
> 
>   end - is neither the count of characters in the array slice,
>         nor the index of the last character in the slice.
>         It is just the index after the last character.

It's the same way it is in Python, etc. Also, it makes a lot of things easier, e.g.:

s[0..x] is x characters long
s[x..$] is everything from index x to the end of the string
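
A short sketch of those two idioms, assuming D2 with Phobos. The helper names `firstN` and `fromN` are hypothetical and exist only for this example.

```d
import std.stdio : writeln;

// Hypothetical helpers, named here only for illustration.
string firstN(string s, size_t x) { return s[0 .. x]; } // the first x chars
string fromN(string s, size_t x)  { return s[x .. $]; } // index x to the end

void main()
{
    string s = "ABCDE";
    size_t x = 2;

    assert(firstN(s, x).length == x);      // s[0..x] is x characters long
    assert(fromN(s, x) == "CDE");          // s[x..$] starts at index x
    assert(s[0 .. $] == s[0 .. s.length]); // inside [], $ means s.length

    writeln(firstN(s, x), " | ", fromN(s, x)); // prints AB | CDE
}
```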
March 03, 2009
Qian Xu:
> I have to do the math in my head to keep my code correct ;-) IMO, it does not make any sense.

At the beginning you have to think a bit about it, but you quickly learn it, and you find it's the best way to design it :-)
Several languages use this same convention.

It allows you to split an array into two parts with very little trouble: s[0 .. $] == s[0 .. n] ~ s[n .. $]

It's especially good when the whole language uses this idea. For example, in D2 this loops ten times, and x never becomes 10:
foreach (x; 0 .. 10) {...}
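
A minimal sketch of that loop with counters added, assuming D2. The variable names are made up for illustration. It also shows the same pair of bounds driving both a loop and a slice.

```d
void main()
{
    int count = 0;
    int last = -1;
    foreach (x; 0 .. 10)
    {
        ++count;
        last = x;
    }
    assert(count == 10); // ten iterations
    assert(last == 9);   // x stops at 9 and never becomes 10

    // The bounds 0 .. s.length visit exactly the elements of s[0 .. $].
    string s = "ABCDE";
    string copy;
    foreach (i; 0 .. s.length)
        copy ~= s[i];
    assert(copy == s[0 .. $]);
}
```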

Bye,
bearophile
March 06, 2009
Qian Xu wrote:
<snip>
> IMO, it does not make any sense.
> s[start..end]
> 
>   end - is neither the count of characters in the array slice,
>         nor the index of the last character in the slice.
>         It is just the index after the last character.

It might help to think of the indexes in a slice as numbering the boundaries between the array elements, rather than the elements themselves.

Stewart.