July 23, 2007
Bill Baxter wrote:
> Don Clugston wrote:
>> I think the convention "first_element .. last_element+1" cannot be extended to negative and floating-point numbers without creating an inconsistency. Which is quite unfortunate.

I'm not sure what the problem with negative integers is. Even for negative integers x, the identity still holds: the following two expressions are equivalent:

     a <= x
     a <  x+1

But the floating point issue is a bummer. And it's also a bit silly for chars. To test whether c is a digit, you would have to write:

   c in ['0'..'9'+1]

which looks a little silly.
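(To check both claims concretely — a quick sketch in Python, since its floats are IEEE doubles; math.nextafter plays the role of a hypothetical nextUp:)

```python
import math

# For integers (negative ones included), the half-open rewrite is exact:
for x in range(-10, 10):
    for a in range(-10, 10):
        assert (a <= x) == (a < x + 1)

# For floating point it is not: "last_element + 1" is far too generous.
lo, hi = 0.0, 0.9
x = 1.5
assert not (lo <= x <= hi)   # 1.5 is outside [0.0, 0.9]
assert lo <= x < hi + 1      # ...but inside the naive half-open 0.0 .. 1.9

# The correct exclusive upper bound is the next representable float:
assert (lo <= x <= hi) == (lo <= x < math.nextafter(hi, math.inf))
```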

> 
> In Python the last_element..first_element inclusive case is handled by omissions.
>   10:0:-1  --- 10 downto 1, inclusive -- doesn't include 0
>   10::-1 --- 10 downto last one - includes 0
> 
> For D I guess that might become
>    10..$:-1
> but $ would have to become something context sensitive, rather than just a synonym for .length.  Which I guess is the same as saying you'd have to introduce an inconsistency, or at least a less strict form of consistency, to the interpretation of $.
> 

To me, it isn't obvious that $==0 in your example. But I think the real value of $ is in multi-dimensional arrays, because without it you would get something like:

  int[,,] a = ...;
  int[,,] my_slice = a[1..$, 1..$, 1..$];
  int[,,] my_slice_ugly = a[1..a.length[0], 1..a.length[1], 1..a.length[2]];

To support that, I would use Andrei's suggested grammar, but instead of $ translating into a.length, the compiler should first try a.length(0) or a.length(1), etc, where the parameter is the parameter number where the $ occurs. (It's a hack, I know, but I think it's better than $ generating a delegate...)
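(To make the proposed rewrite concrete: inside the i-th index expression, $ would lower to a.length(i). A toy model in Python — the NDArray class and its length method are stand-ins of mine, not existing API:)

```python
class NDArray:
    """Toy n-dimensional array: shape bookkeeping only, no storage."""
    def __init__(self, shape):
        self.shape = shape

    def length(self, dim):
        """Per-dimension length -- what $ would lower to in dimension `dim`."""
        return self.shape[dim]

a = NDArray((4, 5, 6))

# What the compiler would emit for  a[1..$, 1..$, 1..$]:
slices = [range(1, a.length(d)) for d in range(len(a.shape))]
assert [(s.start, s.stop) for s in slices] == [(1, 4), (1, 5), (1, 6)]
```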


  -- Reiner
July 23, 2007
Bill Baxter wrote:
> Reiner Pope wrote:
>> Bill Baxter wrote:
>>> Jarrett Billingsley wrote:
>>>> "Xinok" <xnknet@gmail.com> wrote in message news:f80qof$2n0l$1@digitalmars.com...
>>>>
>>>>> foreach(i; 0..100)
>>>>
>>>> This is almost identical to the syntax in MiniD:
>>>>
>>>> for(i: 0 .. 100)
>>>>
>>>> It could be done with for or foreach; I just chose for because normally you use for loops to iterate over ranges of integers.
>>>>
>>>> You can also come up with a pretty simple short-term solution that'll be fairly efficient (though not as efficient as if the compiler were aware of this kind of loop intrinsically) by making a struct 'range' which has a static opCall to construct a range and an opApply to iterate over the values, so that it'd look like:
>>>>
>>>> foreach(i; range(100))
>>>>
>>>> Which isn't terrible at all. 
>>>
>>> And it has the advantage of being more extensible.  And for allowing ranges to be treated as first class entities that can be passed around and manipulated.  But no, instead we get another one-trick pony.
>>>
>>> --bb
>> That was my first thought, too.
>>
>> In the "Array Slice Ranges" thread, several people mentioned first-class ranges:
>>
>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43865 
>>
>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43904 
>>
>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43905 
>>
>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43954 
>>
>>
>> Your implementation, Bill, seems to be just right, and gives you foreach over ranges for free.
> 
> Thanks.  I think Oskar Linde had a nice version too.  I seem to remember thinking there were a few things in his that I should borrow.
> 
>> What's wrong with adding that to the language, but templated and with nice syntax?
>>
>> type name                                 literal
>> int..int  (range of int)                  1..5
>> int..double   (range of int to double)    1..5.0
>> int..int:int  (stepped range)             5..1:-1
>>
>> (I'm not sure of the use of mixed-type ranges, but this seems the most intuitive syntax. Since most ranges are probably of one type, how about allowing a symbol to denote "same type again". Any of the following could mean int..int:   int..#,   int.._, int..$)
> 
> Having two different types for it seems odd.  Just plain int.. would make more sense to me.  I really like that 5..1:-1 syntax though!  Was that mentioned before?  Something about all the colons in Pythons range literals always makes me uneasy.  a:b:c -- is that a to c stepping by b?  Or a to b stepping by c?  In Python it's the latter.  In Matlab I think it's the former.  Which is probably why I always feel a little uneasy when I see it.  But a..b:c is much clearer!  Obviously it's from a to b, so c must be a step.  And the colon looking like the two dots stood on end -- lovely.

I never knew about the Python or Matlab syntax. 5..1:-1 is from Norbert Nemec's multidimensional array proposal, and it makes so much sense. :)

But I don't know about the declaration syntax of the type. The most obvious and the nicest-looking is definitely 'int..int'. But using that suggests that 'int..double' should also be allowed, which doesn't really make much sense, given that operations on ranges will probably be mostly indexing, iterating through the range, and testing whether an element is contained in it, each of which requires one characteristic type.

So the characteristic type of the range should only be stated once. But I don't like int.. because of what it implies:

  int..:
  int:  // a stepped range from here to infinity; but it looks like case:
  ..int: // I dunno: reverse iteration?

You really need something to hold the number's place, but nothing comes to mind other than (the ugly) #:
   int..#
   int..#:#
   ..int:#

Mind you, I think it allows a nice syntax for what I was grasping at in a different post with the wacky question-mark syntax (int..int:int?). You need to be able to specify which promotions may be done implicitly, and with what default values. I think the easiest way is to specify the default values as part of the type:

   int..#:1   (a range from lo to hi, with step 1 unless specified)
   int..5:#
   int=3..#   (lo has a default value of 3)

One range, A, could only be implicitly converted to another range, B, if every field in A is included in B (so we don't lose information) and all the fields in B missing from A have default values (so it isn't implicitly converted by mistake).
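(One possible reading of that conversion rule, sketched in Python; the dict encoding and field names are mine, not part of the proposal:)

```python
def converts(src: dict, dst_defaults: dict) -> bool:
    """Range A (the fields actually present) implicitly converts to range
    type B (a map from field name to default value, None = 'no default')
    iff every field of A exists in B and B's remaining fields have defaults."""
    if not set(src) <= set(dst_defaults):
        return False                      # would lose information
    missing = set(dst_defaults) - set(src)
    return all(dst_defaults[f] is not None for f in missing)

# 1..5 (no step)  ->  int..#:1  : ok, step defaults to 1
assert converts({"lo": 1, "hi": 5}, {"lo": None, "hi": None, "step": 1})
# 1..5 (no step)  ->  int..#:#  : rejected, step has no default
assert not converts({"lo": 1, "hi": 5}, {"lo": None, "hi": None, "step": None})
# a stepped range cannot convert to a stepless one (information loss)
assert not converts({"lo": 1, "hi": 5, "step": 2}, {"lo": None, "hi": None})
```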


Just my train of thought,

   Reiner
July 23, 2007
Reiner Pope skrev:

> To me, it isn't obvious that $==0 in your example. But I think the real value of $ is in multi-dimensional arrays, because without it you would get something like:
> 
>   int[,,] a = ...;
>   int[,,] my_slice = a[1..$, 1..$, 1..$];
>   int[,,] my_slice_ugly = a[1..a.length[0], 1..a.length[1], 1..a.length[2]];
> 
> To support that, I would use Andrei's suggested grammar, but instead of $ translating into a.length, the compiler should first try a.length(0) or a.length(1), etc, where the parameter is the parameter number where the $ occurs. (It's a hack, I know, but I think it's better than $ generating a delegate...)

The way I have handled multidimensional slices is to make ranges including $ distinct types, like:

http://www.csc.kth.se/~ol/indextypes.d

All those distinct types might be overkill, but they save some unnecessary parameter passing and calls to .length(x).

If $ in index expressions could behave the way "end" does in that sample, it would be great.

Having $ translate into a.length would mean range expressions containing $ could never become first class citizens. With the types in indextypes.d one can write:

auto a = range(0, end-1);
auto b = range(end-10, end);
auto c = 7;

auto B = A[a,b,c];

It would be neat to have something at least close to this with built in ranges.
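(The trick in indextypes.d — letting end-relative offsets flow through arithmetic until a length is known — can be sketched like this; Python, with the End/Range names being mine:)

```python
class End:
    """Symbolic end-of-dimension marker, like `$` or indextypes.d's `end`."""
    def __init__(self, offset=0):
        self.offset = offset
    def __sub__(self, n):
        return End(self.offset - n)
    def resolve(self, length):
        return length + self.offset

end = End()

class Range:
    """A first-class range whose bounds may be end-relative."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def resolve(self, length):
        fix = lambda v: v.resolve(length) if isinstance(v, End) else v
        return range(fix(self.lo), fix(self.hi))

# The ranges exist before any array is in sight...
a = Range(0, end - 1)
b = Range(end - 10, end)

# ...and are resolved per dimension at indexing time:
assert list(a.resolve(10)) == list(range(0, 9))
assert list(b.resolve(100)) == list(range(90, 100))
```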

-- 
Oskar
July 23, 2007
Bill Baxter wrote:
> Don Clugston wrote:
>> Reiner Pope wrote:
>>> Bill Baxter wrote:
>>>> Jarrett Billingsley wrote:
>>>>> "Xinok" <xnknet@gmail.com> wrote in message news:f80qof$2n0l$1@digitalmars.com...
>>>>>
>>>>>> foreach(i; 0..100)
>>>>>
>>>>> This is almost identical to the syntax in MiniD:
>>>>>
>>>>> for(i: 0 .. 100)
>>>>>
>>>>> It could be done with for or foreach; I just chose for because normally you use for loops to iterate over ranges of integers.
>>>>>
>>>>> You can also come up with a pretty simple short-term solution that'll be fairly efficient (though not as efficient as if the compiler were aware of this kind of loop intrinsically) by making a struct 'range' which has a static opCall to construct a range and an opApply to iterate over the values, so that it'd look like:
>>>>>
>>>>> foreach(i; range(100))
>>>>>
>>>>> Which isn't terrible at all. 
>>>>
>>>> And it has the advantage of being more extensible.  And for allowing ranges to be treated as first class entities that can be passed around and manipulated.  But no, instead we get another one-trick pony.
>>>>
>>>> --bb
>>> That was my first thought, too.
>>>
>>> In the "Array Slice Ranges" thread, several people mentioned first-class ranges:
>>>
>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43865 
>>>
>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43904 
>>>
>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43905 
>>>
>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43954 
>>>
>>>
>>> Your implementation, Bill, seems to be just right, and gives you foreach over ranges for free.
>>>
>>> What's wrong with adding that to the language, but templated and with nice syntax?
>>>
>>> type name                                 literal
>>> int..int  (range of int)                  1..5
>>> int..double   (range of int to double)    1..5.0
>>> int..int:int  (stepped range)             5..1:-1
>>>
>>> (I'm not sure of the use of mixed-type ranges, but this seems the most intuitive syntax. Since most ranges are probably of one type, how about allowing a symbol to denote "same type again". Any of the following could mean int..int:   int..#,   int.._, int..$)
>>
>> I don't think it makes sense to have mixed-type ranges. The normal promotion rules should apply. However...
>>
>> Floating-point ranges are tricky. Should they be open-ended, or closed-ended? 
> 
> Both Matlab and Numpy have floating point ranges.   Matlab ranges are always inclusive, so 1:2.1:7.3 gives you 1.0, 3.1, 5.2, 7.3.  Python ranges are always non-inclusive, so it gives you 1.0,3.1,5.2.
> 
>> Consider
>> -real.infinity..real.infinity
>> Are the infinities part of the range? If not, how do you specify a range which includes infinity?
> 
> Does it matter that much?  I suppose it would be cool if it did something really consistent, but Numpy just craps out and gives you an empty list, and Matlab raises an error "Maximum variable size allowed by the program is exceeded".

I think that if you can't specify a range including an infinity, then floating point ranges don't make sense. Especially, I really don't like the idea that -real.infinity..real.infinity would include -infinity, but not +infinity.

I've had a use for floating-point ranges: specifying domain and range of functions, where infinity is fairly common. When else would you use them?

Besides, "first_element .. last_element-1" doesn't work for (say)
0.00001 .. 0.00003;
it has to be first_element..nextDown(lastElement).
The MatLab method (closed ranges) is a nicer fit to IEEE arithmetic.
In fact, I'd even say that half-open ranges are only ideal for unsigned numbers.

But we probably don't want 1.0..5.0 to contain 5.0 when 1..5 doesn't contain 5.
July 23, 2007
Don Clugston wrote:
> Bill Baxter wrote:
>> Don Clugston wrote:
>>
>>> Consider
>>> -real.infinity..real.infinity
>>> Are the infinities part of the range? If not, how do you specify a range which includes infinity?
>>
>> Does it matter that much?  I suppose it would be cool if it did something really consistent, but Numpy just craps out and gives you an empty list, and Matlab raises an error "Maximum variable size allowed by the program is exceeded".
> 
> I think that if you can't specify a range including an infinity, then floating point ranges don't make sense. Especially, I really don't like the idea that -real.infinity..real.infinity would include -infinity, but not +infinity.

Hm... what if I wanted a range that included ulong.max?  Is there any way to do that either?  I don't suppose ulong.max+1 works in that case?  I know it would be a tad weird because infinity+1 == infinity, but perhaps this is one case where the semantics should just be consistent with everything else.


Sean
July 23, 2007
Reiner Pope wrote:
> Bill Baxter wrote:
>> Don Clugston wrote:
>>> I think the convention "first_element .. last_element+1" cannot be extended to negative and floating-point numbers without creating an inconsistency. Which is quite unfortunate.
> 
> I'm not sure what the problem with negative integers is. Even for negative integers x, the identity still holds: the following two expressions are equivalent:
> 
>      a <= x
>      a <  x+1
> 
> But the floating point issue is a bummer. And it's also a bit silly for chars. To test whether c is a digit, you would have to write:
> 
>    c in ['0'..'9'+1]
> 
> which looks a little silly.

Perhaps there should be an operator for inclusive vs. exclusive ranges.  Something like:

c in ['0' -> '9']

Not ideal, I know.


Sean
July 23, 2007
Sean Kelly wrote:
> Don Clugston wrote:
>> Bill Baxter wrote:
>>> Don Clugston wrote:
>>>
>>>> Consider
>>>> -real.infinity..real.infinity
>>>> Are the infinities part of the range? If not, how do you specify a range which includes infinity?
>>>
>>> Does it matter that much?  I suppose it would be cool if it did something really consistent, but Numpy just craps out and gives you an empty list, and Matlab raises an error "Maximum variable size allowed by the program is exceeded".
>>
>> I think that if you can't specify a range including an infinity, then floating point ranges don't make sense. Especially, I really don't like the idea that -real.infinity..real.infinity would include -infinity, but not +infinity.
> 
> Hm... what if I wanted a range that included ulong.max?  Is there any way to do that either?

Probably not. The [a..b) definition of a range is great as long as you only use ranges for array slicing, but it doesn't generalise well to other use cases.

To say x must be between -5.0 and +5.0, inclusive (mathematically [-5.0, 5.0]), using the existing semantics, you'd have to say:

if (x in -5.0 .. nextUp(5.0)) ...

and -5.0 to 5.0 exclusive (mathematically (-5.0, 5.0)) is:
if (x in nextUp(-5.0) .. 5.0) ...

Both of these cases are going to be far more common than [-5.0, 5.0), which should be about as common as (-5.0, 5.0], which requires the monstrosity:
if (x in nextUp(-5.0)..nextUp(5.0)) ...

(and nextUp isn't even in Phobos - you have to use Tango <g>).

> I don't suppose ulong.max+1 works in that case?

No.
>   I know it would be a tad weird because infinity+1 == infinity, but perhaps this is one case where the semantics should just be consistent with everything else.

How can you store infinity + 1?

Also, it won't work for 0.0001 .. 0.0003. You don't actually want to add 1.
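(The three encodings above are mechanical once a nextUp/nextDown pair exists; a quick Python check using math.nextafter, standing in for the functions missing from Phobos:)

```python
import math

def next_up(x):   return math.nextafter(x, math.inf)
def next_down(x): return math.nextafter(x, -math.inf)

def in_range(x, lo, hi):
    """D's half-open  x in lo .. hi."""
    return lo <= x < hi

# [-5.0, 5.0]  closed  ->  -5.0 .. nextUp(5.0)
assert in_range(5.0, -5.0, next_up(5.0))
assert in_range(-5.0, -5.0, next_up(5.0))
# (-5.0, 5.0)  open    ->  nextUp(-5.0) .. 5.0
assert not in_range(-5.0, next_up(-5.0), 5.0)
assert not in_range(5.0, next_up(-5.0), 5.0)
# (-5.0, 5.0]          ->  nextUp(-5.0) .. nextUp(5.0)
assert in_range(5.0, next_up(-5.0), next_up(5.0))
assert not in_range(-5.0, next_up(-5.0), next_up(5.0))

# nextUp and nextDown are exact inverses on finite values:
assert next_down(next_up(5.0)) == 5.0
```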
July 23, 2007
Don Clugston wrote:
> Sean Kelly wrote:
>> Don Clugston wrote:
>>> Bill Baxter wrote:
>>>> Don Clugston wrote:
>>>>
>>>>> Consider
>>>>> -real.infinity..real.infinity
>>>>> Are the infinities part of the range? If not, how do you specify a range which includes infinity?
>>>>
>>>> Does it matter that much?  I suppose it would be cool if it did something really consistent, but Numpy just craps out and gives you an empty list, and Matlab raises an error "Maximum variable size allowed by the program is exceeded".
>>>
>>> I think that if you can't specify a range including an infinity, then floating point ranges don't make sense. Especially, I really don't like the idea that -real.infinity..real.infinity would include -infinity, but not +infinity.
>>
>> Hm... what if I wanted a range that included ulong.max?  Is there any way to do that either?
> 
> Probably not. The [a..b) definition of a range is great as long as you only use ranges for array slicing, but it doesn't generalise well to other use cases.
> 
> To say x must be between -5.0 and +5.0, inclusive (mathematically [-5.0, 5.0]), using the existing semantics, you'd have to say:
> 
> if (x in -5.0 .. nextUp(5.0)) ...
> 
> and -5.0 to 5.0 exclusive (mathematically (-5.0, 5.0)) is:
> if (x in nextUp(-5.0) .. 5.0) ...
> 
> Both of these cases are going to be far more common than [-5.0, 5.0), which should be about as common as (-5.0, 5.0], which requires the monstrosity:
> if (x in nextUp(-5.0)..nextUp(5.0)) ...

Makes me feel like we were better off without foreachable ranges in the first place.  It's obviously possible to do:

foreach( f; inclusive( -float.infinity, float.infinity ) ) {}

And it is potentially more meaningful as well, given that we can't use the mathematical notation [] vs [), etc.  Also, just like the new foreachable ranges, the above syntax evaluates both the begin and end arguments only once and doesn't require the user to explicitly specify a type.
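(Such an inclusive(...) is indeed only a few lines of library code; a minimal integer-only sketch in Python — the D version would be a struct with opApply, per Jarrett's earlier suggestion:)

```python
def inclusive(lo, hi, step=1):
    """Closed range: yields lo, lo+step, ..., including hi when it is hit."""
    v = lo
    while v <= hi:
        yield v
        v += step

assert list(inclusive(1, 5)) == [1, 2, 3, 4, 5]
# This naive version doesn't handle descending ranges; a negative
# step would need the comparison reversed:
assert list(inclusive(5, 1, -1)) == []
```

(Termination is the caller's problem, of course: inclusive(-inf, inf) over floats would never end.)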

> (and nextUp isn't even in Phobos - you have to use Tango <g>).
> 
>   I don't suppose ulong.max+1 works in that case?
> No.
>>   I know it would be a tad weird because infinity+1 == infinity, but perhaps this is one case where the semantics should just be consistent with everything else.
> 
> How can you store infinity + 1?

You can't :-)  For some reason I thought it would overflow to infinity and "just work", but you're right.  The value has to be stored.

> Also, it won't work for 0.0001 .. 0.0003. You don't actually want to add 1.

Yup.  Personally, I'd much rather have a separate, explicitly defined range syntax than this new foreach feature, or just leave things as-is.  But then I'm not terribly fond of basically any new features introduced in 2.0, so I suppose this is just par for the course.


Sean
July 23, 2007
Don Clugston wrote:
> Bill Baxter wrote:
>> Don Clugston wrote:
>>> Reiner Pope wrote:
>>>> Bill Baxter wrote:
>>>>> Jarrett Billingsley wrote:
>>>>>> "Xinok" <xnknet@gmail.com> wrote in message news:f80qof$2n0l$1@digitalmars.com...
>>>>>>
>>>>>>> foreach(i; 0..100)
>>>>>>
>>>>>> This is almost identical to the syntax in MiniD:
>>>>>>
>>>>>> for(i: 0 .. 100)
>>>>>>
>>>>>> It could be done with for or foreach; I just chose for because normally you use for loops to iterate over ranges of integers.
>>>>>>
>>>>>> You can also come up with a pretty simple short-term solution that'll be fairly efficient (though not as efficient as if the compiler were aware of this kind of loop intrinsically) by making a struct 'range' which has a static opCall to construct a range and an opApply to iterate over the values, so that it'd look like:
>>>>>>
>>>>>> foreach(i; range(100))
>>>>>>
>>>>>> Which isn't terrible at all. 
>>>>>
>>>>> And it has the advantage of being more extensible.  And for allowing ranges to be treated as first class entities that can be passed around and manipulated.  But no, instead we get another one-trick pony.
>>>>>
>>>>> --bb
>>>> That was my first thought, too.
>>>>
>>>> In the "Array Slice Ranges" thread, several people mentioned first-class ranges:
>>>>
>>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43865 
>>>>
>>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43904 
>>>>
>>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43905 
>>>>
>>>> http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=43954 
>>>>
>>>>
>>>> Your implementation, Bill, seems to be just right, and gives you foreach over ranges for free.
>>>>
>>>> What's wrong with adding that to the language, but templated and with nice syntax?
>>>>
>>>> type name                                 literal
>>>> int..int  (range of int)                  1..5
>>>> int..double   (range of int to double)    1..5.0
>>>> int..int:int  (stepped range)             5..1:-1
>>>>
>>>> (I'm not sure of the use of mixed-type ranges, but this seems the most intuitive syntax. Since most ranges are probably of one type, how about allowing a symbol to denote "same type again". Any of the following could mean int..int:   int..#,   int.._, int..$)
>>>
>>> I don't think it makes sense to have mixed-type ranges. The normal promotion rules should apply. However...
>>>
>>> Floating-point ranges are tricky. Should they be open-ended, or closed-ended? 
>>
>> Both Matlab and Numpy have floating point ranges.   Matlab ranges are always inclusive, so 1:2.1:7.3 gives you 1.0, 3.1, 5.2, 7.3.  Python ranges are always non-inclusive, so it gives you 1.0,3.1,5.2.
>>
>>> Consider
>>> -real.infinity..real.infinity
>>> Are the infinities part of the range? If not, how do you specify a range which includes infinity?
>>
>> Does it matter that much?  I suppose it would be cool if it did something really consistent, but Numpy just craps out and gives you an empty list, and Matlab raises an error "Maximum variable size allowed by the program is exceeded".
> 
> I think that if you can't specify a range including an infinity, then floating point ranges don't make sense. Especially, I really don't like the idea that -real.infinity..real.infinity would include -infinity, but not +infinity.
> I've had a use for floating-point ranges: specifying domain and range of
> functions, where infinity is fairly common. When else would you use them?
>

It sounds like maybe you're talking about "intervals" rather than "ranges".  Yes, intervals should definitely be able to handle infinities correctly.  But a range (a la Python or Matlab) is a shortcut for a sequence of values, with equal-size steps between beginning and end.  Having the beginning or end be infinite is asking for trouble; for instance, in Matlab it tries to allocate an infinite-sized array of numbers.

> Besides, "first_element .. last_element-1" doesn't work for (say)
> 0.00001 .. 0.00003;

Sure it does.  It's just a set containing only 0.00001.  I don't know what you mean by -1 there.  Just think of it as a do-while loop that generates numbers:

begin..end:step basically generates this:

float[] a;
float v = begin;
do {
    a ~= v;
    v += step;
} while (v < end);

> it has to be first_element..nextDown(lastElement).
> The MatLab method (closed ranges) is a nicer fit to IEEE arithmetic.
> In fact, I'd even say that half-open ranges are only ideal for unsigned numbers.
> 
> But we probably don't want 1.0..5.0 to contain 5.0 when 1..5 doesn't contain 5.

Right.  Numpy had the same problem.  Python itself uses the same non-inclusive rule as D. But Python only handles integers in things like the "range(start,end,step)" function.  The Numpy folks wanted to extend that to work for floating point types as well.  But actually, in both matlab and numpy, if you want an evenly spaced set of numbers, you usually use the 'linspace' function, which has the signature linspace(begin,end,numvals).  This creates an inclusive array of numbers.
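(The linspace behaviour described above is easy to model in pure Python, no NumPy needed — forcing the endpoint in exactly is what sidesteps the whole open/closed question:)

```python
def linspace(begin, end, numvals):
    """`numvals` evenly spaced values from begin to end, both inclusive --
    the Matlab/NumPy convention."""
    if numvals == 1:
        return [begin]
    step = (end - begin) / (numvals - 1)
    vals = [begin + i * step for i in range(numvals - 1)]
    vals.append(end)   # force exact inclusion of the endpoint
    return vals

assert linspace(0.0, 1.0, 5) == [0.0, 0.25, 0.5, 0.75, 1.0]
assert linspace(1.0, 7.3, 4)[-1] == 7.3   # endpoint always included
```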

I think one source of confusion is that ranges and slices are very similar things, but not quite the same.

* A range is just a sequence of numbers.  It can exist and be interpreted independently.  Here allowing floating point numbers makes sense.  Allowing for infinity may make sense, but practically it's very niche.  Iterating over infinite things usually takes either too much time or too much memory.

* A slice needs an object to operate on for interpretation of object-relative things like $.  Generally speaking, only integers make sense in a slice.  Infinity doesn't really make sense because you can't generally have things that are both slice-able and infinite on a computer.

(* An interval just represents two points on a numberline, plus maybe an indication of the inclusivity of the endpoints.  Infinity -- ok. Floating point -- ok.)

It may be possible to combine the concepts into one type, but they *are* slightly different, and may benefit from being treated as so.
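(The interval case in particular is tiny to type separately — a sketch in Python, names mine; infinities are unproblematic here precisely because nothing is iterated:)

```python
import math
from dataclasses import dataclass

@dataclass
class Interval:
    """Two endpoints plus their inclusivity -- no iteration implied."""
    lo: float
    hi: float
    lo_closed: bool = True
    hi_closed: bool = True

    def __contains__(self, x):
        above = (x >= self.lo) if self.lo_closed else (x > self.lo)
        below = (x <= self.hi) if self.hi_closed else (x < self.hi)
        return above and below

assert 5.0 in Interval(-5.0, 5.0)                       # [-5, 5]
assert 5.0 not in Interval(-5.0, 5.0, hi_closed=False)  # [-5, 5)
assert math.inf in Interval(-math.inf, math.inf)        # infinity: ok
```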

--bb
July 23, 2007
Sean Kelly wrote:
> Reiner Pope wrote:
>> Bill Baxter wrote:
>>> Don Clugston wrote:
>>>> I think the convention "first_element .. last_element+1" cannot be extended to negative and floating-point numbers without creating an inconsistency. Which is quite unfortunate.
>>
>> I'm not sure what the problem with negative integers is. Even for negative integers x, the identity still holds: the following two expressions are equivalent:
>>
>>      a <= x
>>      a <  x+1
>>
>> But the floating point issue is a bummer. And it's also a bit silly for chars. To test whether c is a digit, you would have to write:
>>
>>    c in ['0'..'9'+1]
>>
>> which looks a little silly.
> 
> Perhaps there should be an operator for inclusive vs. exclusive ranges.  Something like:
> 
> c in ['0' -> '9']
> 
> Not ideal, I know.
> 
> 
> Sean

In previous discussion it was mentioned that Ruby has a..b and a...b as inclusive and exclusive ranges, respectively.  The previous thread also threw around a lot of possible alternative syntaxes.

--bb