RFC on range design for D2 (page 15) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » Announce » RFC on range design for D2 (page 15)

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Bill Baxter

Andrei Alexandrescu

Posted in reply to Bill Baxter

Bill Baxter wrote:
> But I think you and I are in agreement that it would be easier and
> more natural to think of ranges as iterators augmented with
> information about bounds, as opposed to a contiguous block of things
> from A to B.

I like that you are bringing this point up, it is interesting. Note that my API never assumes or requires that there's an actual contiguous block of things underneath. Au contraire, in the I/O case, there's only "the current element" underneath.

But a better example is generators. Consider a function generate that takes a string expression using a[0], a[1],... a[k] (the state) and returns a[k+1]. The generate function also takes the initial state. Then generate returns a range that returns in turn each element of the series.

Generate is easy to implement, but I don't want to get into that now, only into usage. Simplest use is:

auto boring = generate!("a[0]"(42);

This guy will generate the series 42 42 42 42 42 42 42... forever and ever. Now to use it at all we'd have to temper it. So we use function called "take", which accepts a maximum size. And then:

writeln(take(10, boring));

This guy will print "42 42 42 42".

Let's generate a more interesting series. How about an iota:

writeln(take(4, generate!("a[0] + 2")(5)));

That guy prints "5 7 9 11". Or Newton's square root approximations:

writeln(take(4, generate!("(a[0] + 2/a[0])/2")(1.0)));

which prints "1 1.5 1.4167 1.4142". All of these are ranges, some bounded and some unbounded, but do not have blocks of elements underneath them.

>> well these are the operations that you can do on basically all iterators
>> (and with wich you can define new iterators).
>> The one you propose need an underlying total order that can be efficiently
>> checked, for example iterators on trees do not have necessarily this
>> property, and then getting your kind of intersection can be difficult (and
>> not faster than the operation using atSamePlace.
> 
> I don't think that's correct.  Andrei's system does not need a total
> order any more than yours does.  The unions and diffs just create new
> ranges by combining the components of existing ranges.  They don't
> need to know anything about what happens in between those points or
> how you get from one to the other.  Just take the "begin" of this guy
> and put it together with the "end" of that guy, for example.  It
> doesn't require knowing how to get from anywhere to anywhere to create
> that new range.

Yes, that's exactly right.

Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Fawzi Mohamed
in reply to Bill Baxter

Fawzi Mohamed

Posted in reply to Bill Baxter

On 2008-09-10 14:35:29 +0200, "Bill Baxter" <wbaxter@gmail.com> said:

> On Wed, Sep 10, 2008 at 7:47 AM, Fawzi Mohamed <fmohamed@mac.com> wrote:
> 
>>>> 2) All the methods with intersection of iterator in my opinion are
>>>> difficult to memorize, and rarely used, I would scrap them.
>>>> Instead I would add the comparison operation .atSamePlace(iterator!(T)y)
>>>> that would say if two iterators are at the same place. With it one gets back
>>>> all the power of pointers, and with a syntax and use that are
>>>> understandable.
>>> 
>>> But that comparison operation is not enough to implement anything of
>>> substance. Try your hand at a few classic algorithms and you'll see.
>> 
>> are you sure? then a range is *exactly* equivalent to a STL iterator, only
>> that it cannot go out of bounds:
>> // left1-left2:
>> while((!i1.isEmpty) && (!i1.atSamePlace(i2))){
>> i1.next;
>> }
>> // left2-left1:
>> while((!i2.isEmpty) && (!i1.atSamePlace(i2))){
>> i1.next;
>> }
>> // union 1-2
>> while((!i1.isEmpty) && (!(i1.atSamePlace(i2))){
>> i1.next;
>> }
>> while(!i2.isEmpty){
>> i2.next;
>> }
>> // union 2-1
>> ...
>> // lower triangle
>> i1=c.all;
>> while(!i1.isEmpty){
>> i2=c.all;
>> while(!i2.isEmpty && !i2.atSamePlace(i1)){
>> i2.next;
>> }
> 
> Your code shows that you can successfully iterate over the same
> elements described by Andrei's various unions and differences, but
> they do not show how you would, say, pass that new range another
> function to do that job.  Such as you would want to do in say, a
> recursive sort.  Since in this design you can't set or access the
> individual iterator-like components of a range directly, being able to
> copy the begin or end iterator from one range over to another is
> necessary, I think.

yes you are right this operation on the simplest iterators cannot be preformed recursevely without overhead (you can do it once, but then you need to store i1 & i2 in the new iterator, to do it again will add more and more overhead.
Range union... can be used efficiently and safely only if the iterator has an order that can be easily checked, this is a useful abstraction, but not the basic one.

> But I think you and I are in agreement that it would be easier and
> more natural to think of ranges as iterators augmented with
> information about bounds, as opposed to a contiguous block of things
> from A to B.
> 
>> well these are the operations that you can do on basically all iterators
>> (and with wich you can define new iterators).
>> The one you propose need an underlying total order that can be efficiently
>> checked, for example iterators on trees do not have necessarily this
>> property, and then getting your kind of intersection can be difficult (and
>> not faster than the operation using atSamePlace.
> 
> I don't think that's correct.  Andrei's system does not need a total
> order any more than yours does.  The unions and diffs just create new
> ranges by combining the components of existing ranges.  They don't
> need to know anything about what happens in between those points or
> how you get from one to the other.  Just take the "begin" of this guy
> and put it together with the "end" of that guy, for example.  It
> doesn't require knowing how to get from anywhere to anywhere to create
> that new range.

well if you don't have a total order that you can easily check then this might be very unsafe
think to i1.begin..i2.begin if i2.begin<i1.begin, you might miss that it is empty, and iterate forever...

Fawzi

September 10, 2008

Re: RFC on range design for D2

Posted by Bill Baxter
in reply to Andrei Alexandrescu

Bill Baxter

Posted in reply to Andrei Alexandrescu

On Wed, Sep 10, 2008 at 10:07 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Bill Baxter wrote:
>>
>> But I think you and I are in agreement that it would be easier and more natural to think of ranges as iterators augmented with information about bounds, as opposed to a contiguous block of things from A to B.
>
> I like that you are bringing this point up, it is interesting. Note that my API never assumes or requires that there's an actual contiguous block of things underneath. Au contraire, in the I/O case, there's only "the current element" underneath.

Yes, I see that and think it's great.  But the point I've been trying to make is that the nomenclature you are using seems to emphasize the contiguous block interpretation, rather than the interpretation as a cursor plus a sentinel.  The contiguous block terminology makes good sense for slices, but less for things like trees and unbounded generators and HMMs.

And ok, I do think your incredible shrinking bidirectional range is borked.  But other than that, I'm just talking about terminology.

Did you read my posts over on DigtialMars.D?  I'm not into the "massive thread on d.announce" thing -- makes it too hard to find sub-threads later -- so I started some new sub-threads over there.

--bb

September 10, 2008

Re: RFC on range design for D2

Posted by Sergey Gromov
in reply to Andrei Alexandrescu

Sergey Gromov

Posted in reply to Andrei Alexandrescu

Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Sergey Gromov wrote:
> > - the union operations look... weird.  Unobvious.  I'm too sleepy now to propose anything better but I'll definitely give it a try.  The rest of the interface seems very natural.
> 
> I agree I hadn't known what primitives would be needed when I sat down. Clearly there was a need for some since individual iterators are not available anymore. New ideas would be great; I suggest you validate them by implementing some nontrivial algorithms in std.algorithm with your, um, computational basis of choice :o).

r.before(s)
r.after(s)
r.begin
r.end

Here r.before(s) is everything from the r's first element (inclusive) to the first s's element (exclusive); r.after(s) is from last s's element (exclusive) to the last element of r (inclusive); r.begin is an empty range at the beginning of a parent range; and r.end is an empty range at the end of a parent range.  Therefore, according to your diagram:

r.toBegin(s) => r.before(s)
s.toEnd(r) => s.before(r.end)
s.fromEnd(r) => s.after(r)

September 10, 2008

Re: RFC on range design for D2

Posted by Leandro Lucarella
in reply to Andrei Alexandrescu

Leandro Lucarella

Posted in reply to Andrei Alexandrescu

Andrei Alexandrescu, el  9 de septiembre a las 18:13 me escribiste:
> Derek Parnell wrote:
> >On Tue, 09 Sep 2008 10:30:58 -0500, Andrei Alexandrescu wrote:
> >>I'd like to go with:
> >>
> >>r.first
> >>r.last
> >>r.next
> >>r.pop
> >LOL ... I was just thinking to myself ... "what's wrong with First and
> >Last? I should suggest them." then I read this post.
> >"next" is fine, but "pop"? Isn't the pair of "next" called "prev(ious)" and
> >the pair of "pop" called "push". So please, either have next/prev or
> >push/pop, and in that case push/pop looks quite silly.
> 
> Previous is confusing as it suggest I'm moving back where I came from. In reality I shrink the range from the other end. So we need:
> 
> "Shrink the range from the left end"
> "Shrink the range from the right end"

You mean from begining and the end I guess ;)

> The first will be used much more often than the second.

shrink(int n = 1)?
if n > 0, shrinks from the begining, if n < 0, shrinks if shrinks from the
end (0 is no-op). This way you can skip some elements too. Even when it
could be a little cryptic, I think it plays well with slicing.

Another posibility is shrink(int begin = 1, end = 0), to shrink both ends
at the time, for example (calling shrink1 to the first proposal and
shrink2 to the second):
r.pop == r.shrink1(-1) == r.shrink2(0, 1)
r.pop; r.pop == r.shrink(-2) == r.shrink2(0, 2)
r.shrink1() == r.shrink2()
r.shrink1(3) == r.shrink2(3)
r.shrink1(3); r.shrink1(-2) == r.shrink2(3, 2)


-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
<o_O> parakenotengobarraespaciadora
<o_O> aver
<o_O> estoyarreglandolabarraporkeserompiounapatita

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Sergey Gromov

Andrei Alexandrescu

Posted in reply to Sergey Gromov

Sergey Gromov wrote:
> Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> Sergey Gromov wrote:
>>> - the union operations look... weird.  Unobvious.  I'm too sleepy now to propose anything better but I'll definitely give it a try.  The rest of the interface seems very natural.
>> I agree I hadn't known what primitives would be needed when I sat down. Clearly there was a need for some since individual iterators are not available anymore. New ideas would be great; I suggest you validate them by implementing some nontrivial algorithms in std.algorithm with your, um, computational basis of choice :o).
> 
> r.before(s)
> r.after(s)
> r.begin
> r.end
> 
> Here r.before(s) is everything from the r's first element (inclusive) to the first s's element (exclusive); r.after(s) is from last s's element (exclusive) to the last element of r (inclusive); r.begin is an empty range at the beginning of a parent range; and r.end is an empty range at the end of a parent range.  Therefore, according to your diagram: 
> 
> r.toBegin(s) => r.before(s)
> s.toEnd(r) => s.before(r.end)
> s.fromEnd(r) => s.after(r)

Cool! I was thinking of something along the same lines through the night, and actually with the same names before and after, but the begin and end did not occur to me. As soon as I'll have another chunk of time, I'll make another pass through algorithm2 to see how these work. But you may as well want to take std.algorithm and make it work with your primitives.


Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Bill Baxter

Andrei Alexandrescu

Posted in reply to Bill Baxter

Bill Baxter wrote:
> On Wed, Sep 10, 2008 at 10:07 PM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> Bill Baxter wrote:
>>> But I think you and I are in agreement that it would be easier and
>>> more natural to think of ranges as iterators augmented with
>>> information about bounds, as opposed to a contiguous block of things
>>> from A to B.
>> I like that you are bringing this point up, it is interesting. Note that my
>> API never assumes or requires that there's an actual contiguous block of
>> things underneath. Au contraire, in the I/O case, there's only "the current
>> element" underneath.
> 
> Yes, I see that and think it's great.  But the point I've been trying
> to make is that the nomenclature you are using seems to emphasize the
> contiguous block interpretation, rather than the interpretation as a
> cursor plus a sentinel.  The contiguous block terminology makes good
> sense for slices, but less for things like trees and unbounded
> generators and HMMs.

I disagree that isEmpty, first, and next suggest anything near contiguous block. It's just list terminology. Is the list empty? Give me the first element of the list. Advance to the next element in the list.

Names for the before and after range operations are still in the air...

Are you referring to the "range" name itself?

> And ok, I do think your incredible shrinking bidirectional range is
> borked.  But other than that, I'm just talking about terminology.

How is it borked?


Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Bill Baxter
in reply to Andrei Alexandrescu

Bill Baxter

Posted in reply to Andrei Alexandrescu

On Wed, Sep 10, 2008 at 11:57 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Bill Baxter wrote:
>>
>> On Wed, Sep 10, 2008 at 10:07 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>>
>>> Bill Baxter wrote:
>>>>
>>>> But I think you and I are in agreement that it would be easier and more natural to think of ranges as iterators augmented with information about bounds, as opposed to a contiguous block of things from A to B.
>>>
>>> I like that you are bringing this point up, it is interesting. Note that
>>> my
>>> API never assumes or requires that there's an actual contiguous block of
>>> things underneath. Au contraire, in the I/O case, there's only "the
>>> current
>>> element" underneath.
>>
>> Yes, I see that and think it's great.  But the point I've been trying to make is that the nomenclature you are using seems to emphasize the contiguous block interpretation, rather than the interpretation as a cursor plus a sentinel.  The contiguous block terminology makes good sense for slices, but less for things like trees and unbounded generators and HMMs.
>
> I disagree that isEmpty, first, and next suggest anything near contiguous block. It's just list terminology. Is the list empty? Give me the first element of the list. Advance to the next element in the list.

However a range isn't, generally speaking, a list.  It's a way to traverse or access data that may or may not be a list.  For something like an unbounded generator, it is odd to speak of the "first".  Such an object has a current value and a "next", but the value you can look at right now is only the "first" by a bit of a terminology stretch.

I think using list terminology unnecessarily confuses the iterating
construct that does the accessing with the container being accessed.
The range is not the container.  The range consists of a place where
you are, and a termination condition.  The range is not "empty" or
"full" because it does not actually contain elements.  Sure, if you're
dead set on it, you can say that by "empty" we mean that the set of
things you would get if you called .next repeatedly is empty, but why?
 The terminology is just encouraging one to think of a range as a
container, when in fact it is not -- it is more like two goal posts.
Call it atEnd() or similar and you'll naturally encourage people to
think of ranges as references rather than containers.

Similarly, using list terminology led you to "pop".  But pop on a range does not actually remove any content.  Pop just moves the goal post on one end.

And then there's the various union/diff stuff, which everyone seems to find confusing.  I think much of that confusion and mental overhead just goes away if you think of a range as a good old iterator plus a stopping condition.

> Names for the before and after range operations are still in the air...
>
> Are you referring to the "range" name itself?

That could be part of the reason for this tendency to try to assign list-like names to the parts.  If it were called a "bounded iterator" I think that would better describe the perspective I'm pushing, and naturally lead to choices like "atEnd" instead of "isEmpty".

>> And ok, I do think your incredible shrinking bidirectional range is borked.  But other than that, I'm just talking about terminology.
>
> How is it borked?

See my post to Digitalmars.D.

But upon further reflection I think it may be that it's just not what
I would call a bidirectional range.  By that I mean it's not good at
solving the problems that a bidirectional iterator in C++ is good for.
 Your bidir range may be useful (though I'm not really convinced that
very many algorithms need what it provides) --  but I think one also
needs an iterator that's good at what C++'s bidir iterators are good
at, i.e. moving the active cursor backwards or forwards.  I would call
your construct more of a "double-headed" range than a bidirectional
one.

--bb

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Bill Baxter

Andrei Alexandrescu

Posted in reply to Bill Baxter

Bill Baxter wrote:
> On Wed, Sep 10, 2008 at 11:57 PM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> Bill Baxter wrote:
>>> On Wed, Sep 10, 2008 at 10:07 PM, Andrei Alexandrescu
>>> <SeeWebsiteForEmail@erdani.org> wrote:
>>>> Bill Baxter wrote:
>>>>> But I think you and I are in agreement that it would be easier and
>>>>> more natural to think of ranges as iterators augmented with
>>>>> information about bounds, as opposed to a contiguous block of things
>>>>> from A to B.
>>>> I like that you are bringing this point up, it is interesting. Note that
>>>> my
>>>> API never assumes or requires that there's an actual contiguous block of
>>>> things underneath. Au contraire, in the I/O case, there's only "the
>>>> current
>>>> element" underneath.
>>> Yes, I see that and think it's great.  But the point I've been trying
>>> to make is that the nomenclature you are using seems to emphasize the
>>> contiguous block interpretation, rather than the interpretation as a
>>> cursor plus a sentinel.  The contiguous block terminology makes good
>>> sense for slices, but less for things like trees and unbounded
>>> generators and HMMs.
>> I disagree that isEmpty, first, and next suggest anything near contiguous
>> block. It's just list terminology. Is the list empty? Give me the first
>> element of the list. Advance to the next element in the list.
> 
> However a range isn't, generally speaking, a list.  It's a way to
> traverse or access data that may or may not be a list.  For something
> like an unbounded generator, it is odd to speak of the "first".  Such
> an object has a current value and a "next", but the value you can look
> at right now is only the "first" by a bit of a terminology stretch.

Agreed. The problem with "current" instead of "first" is that there's no clear correspondent for "the last that the current will be". First and last are obvious. Current and last are... well, not bad either :o).

> I think using list terminology unnecessarily confuses the iterating
> construct that does the accessing with the container being accessed.
> The range is not the container.  The range consists of a place where
> you are, and a termination condition.

No. A bidirectional range also knows the last place you'll ever be, and is able to manipulate it.

>  The range is not "empty" or
> "full" because it does not actually contain elements.

It is because a range is a view. The view can reduce to nothing. In math, an interval can be "empty". That doesn't mean it made all real numbers disappear :o).

> Sure, if you're
> dead set on it, you can say that by "empty" we mean that the set of
> things you would get if you called .next repeatedly is empty, but why?
>  The terminology is just encouraging one to think of a range as a
> container, when in fact it is not -- it is more like two goal posts.
> Call it atEnd() or similar and you'll naturally encourage people to
> think of ranges as references rather than containers.
> 
> Similarly, using list terminology led you to "pop".  But pop on a
> range does not actually remove any content.  Pop just moves the goal
> post on one end.

Correct. Then how would you name'em?

> And then there's the various union/diff stuff, which everyone seems to
> find confusing.  I think much of that confusion and mental overhead
> just goes away if you think of a range as a good old iterator plus a
> stopping condition.

I like before and after. Besides, the challenge is that you come with something that's not confusing.

>> Names for the before and after range operations are still in the air...
>>
>> Are you referring to the "range" name itself?
> 
> That could be part of the reason for this tendency to try to assign
> list-like names to the parts.  If it were called a "bounded iterator"
> I think that would better describe the perspective I'm pushing, and
> naturally lead to choices like "atEnd" instead of "isEmpty".

Words are powerful. Phrases are less powerful. I'll never ever settle on anything longer than ONE word for the concept. Ranges came to mind because boost uses them with a similar meaning.


Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Sergey Gromov
in reply to Leandro Lucarella

Sergey Gromov

Posted in reply to Leandro Lucarella

Leandro Lucarella <llucax@gmail.com> wrote:
> Andrei Alexandrescu, el  9 de septiembre a las 18:13 me escribiste:
> > "Shrink the range from the left end"
> > "Shrink the range from the right end"
> >
> > The first will be used much more often than the second.
> 
> shrink(int n = 1)?
> if n > 0, shrinks from the begining, if n < 0, shrinks if shrinks from the
> end (0 is no-op). This way you can skip some elements too. Even when it
> could be a little cryptic, I think it plays well with slicing.
> 
> Another posibility is shrink(int begin = 1, end = 0), to shrink both ends
> at the time, for example (calling shrink1 to the first proposal and
> shrink2 to the second):
> r.pop == r.shrink1(-1) == r.shrink2(0, 1)
> r.pop; r.pop == r.shrink(-2) == r.shrink2(0, 2)
> r.shrink1() == r.shrink2()
> r.shrink1(3) == r.shrink2(3)
> r.shrink1(3); r.shrink1(-2) == r.shrink2(3, 2)

I remember that shift() was a method to remove first element from an array.  Some Basic perhaps...  So it could be push, pop, shift, er... unshift?

But now that I think about it...  What's the use case for these operations?  It's clear to me that getNext/putNext are a generator/constructor pair, in the broad sense.  But push/whatever?

Andrei have told already, if I remember correctly, that ranges are views of data, not manipulators.  This means that they cannot be used to extend a collection.  Therefore ranges cannot have any extension methods whatsoever.

If you draw a parallel between a range and a stack of paper, the shrink methods would probably be pop/snap...  I'd also propose next() for moving the start and prev() for moving the end.  It sounds a bit misleading but, on the other hand, it closely resembles forward and backward iteration with the opposite end of a range representing the iteration limit.  Or maybe forward()/backward(), or fwd()/back()?

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation