Iterators recap (page 3) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » Learn » Iterators recap (page 3)

December 15, 2006

Iterators recap

Posted by Bill Baxter
in reply to Sean Kelly

Bill Baxter

Posted in reply to Sean Kelly

Sean Kelly wrote:
> Bill Baxter wrote:
>>
>> Too bad the iterator discussions fizzled out without anything getting decided.
> 
> I think the design was pretty well settled when things ended.  The next step would be a sample implementation so folks could wrangle over syntax.  It's on my "to do" list, but I'm a bit short on free time at the moment.

Huh.  So what is your understanding of what the consensus was?

--bb

December 15, 2006

Re: Iterators recap

Posted by Sean Kelly
in reply to Bill Baxter

Sean Kelly

Posted in reply to Bill Baxter

Bill Baxter wrote:
> Sean Kelly wrote:
>> Bill Baxter wrote:
>>>
>>> Too bad the iterator discussions fizzled out without anything getting decided.
>>
>> I think the design was pretty well settled when things ended.  The next step would be a sample implementation so folks could wrangle over syntax.  It's on my "to do" list, but I'm a bit short on free time at the moment.
> 
> Huh.  So what is your understanding of what the consensus was?

Java-style, with random access iterators overloading array operations. I can't remember if there was any clear preference for the hasNext/getNext vs. the atEnd/getVal approach, but I tend to favor the latter.

Sean

December 15, 2006

Re: Iterators recap

Posted by Bill Baxter
in reply to Sean Kelly

Bill Baxter

Posted in reply to Sean Kelly

Sean Kelly wrote:
> Bill Baxter wrote:
>> Sean Kelly wrote:
>>> Bill Baxter wrote:
>>>>
>>>> Too bad the iterator discussions fizzled out without anything getting decided.
>>>
>>> I think the design was pretty well settled when things ended.  The next step would be a sample implementation so folks could wrangle over syntax.  It's on my "to do" list, but I'm a bit short on free time at the moment.
>>
>> Huh.  So what is your understanding of what the consensus was?
> 
> Java-style, with random access iterators overloading array operations. I can't remember if there was any clear preference for the hasNext/getNext vs. the atEnd/getVal approach, but I tend to favor the latter.

I like next/value/(ptr) myself.
Makes for very succinct while loops.  ptr available where it makes sense.

while(iter.next()) {  writefln(iter.value); }

while(iter.next()) {  *iter.ptr = 17; }


--bb

December 15, 2006

Re: Iterators recap

Posted by Sean Kelly
in reply to Bill Baxter

Sean Kelly

Posted in reply to Bill Baxter

Bill Baxter wrote:
> Sean Kelly wrote:
>> Bill Baxter wrote:
>>> Sean Kelly wrote:
>>>> Bill Baxter wrote:
>>>>>
>>>>> Too bad the iterator discussions fizzled out without anything getting decided.
>>>>
>>>> I think the design was pretty well settled when things ended.  The next step would be a sample implementation so folks could wrangle over syntax.  It's on my "to do" list, but I'm a bit short on free time at the moment.
>>>
>>> Huh.  So what is your understanding of what the consensus was?
>>
>> Java-style, with random access iterators overloading array operations. I can't remember if there was any clear preference for the hasNext/getNext vs. the atEnd/getVal approach, but I tend to favor the latter.
> 
> I like next/value/(ptr) myself.
> Makes for very succinct while loops.  ptr available where it makes sense.
> 
> while(iter.next()) {  writefln(iter.value); }
> 
> while(iter.next()) {  *iter.ptr = 17; }

It's great for loops, but can be confusing when iterators stick around for a while.  Say I have some code that operates on the current value of an iterator.  With the next() approach, how do I know whether the iterator is valid?  Still, perhaps a hybrid approach is best.  Have next return true on success and offer an atEnd property as well.


Sean

December 15, 2006

Re: Iterators recap

Posted by Bill Baxter
in reply to Sean Kelly

Bill Baxter

Posted in reply to Sean Kelly

Sean Kelly wrote:
> Bill Baxter wrote:
> 
>> Sean Kelly wrote:
>>
>>> Bill Baxter wrote:
>>>
>>>> Sean Kelly wrote:
>>>>
>>>>> Bill Baxter wrote:
>>>>>
>>>>>>
>>>>>> Too bad the iterator discussions fizzled out without anything getting decided.
>>>>>
>>>>>
>>>>> I think the design was pretty well settled when things ended.  The next step would be a sample implementation so folks could wrangle over syntax.  It's on my "to do" list, but I'm a bit short on free time at the moment.
>>>>
>>>>
>>>> Huh.  So what is your understanding of what the consensus was?
>>>
>>>
>>> Java-style, with random access iterators overloading array operations. I can't remember if there was any clear preference for the hasNext/getNext vs. the atEnd/getVal approach, but I tend to favor the latter.
>>
>>
>> I like next/value/(ptr) myself.
>> Makes for very succinct while loops.  ptr available where it makes sense.
>>
>> while(iter.next()) {  writefln(iter.value); }
>>
>> while(iter.next()) {  *iter.ptr = 17; }
> 
> 
> It's great for loops, but can be confusing when iterators stick around for a while.  Say I have some code that operates on the current value of an iterator.  With the next() approach, how do I know whether the iterator is valid?  Still, perhaps a hybrid approach is best.  Have next return true on success and offer an atEnd property as well.

Good point.  There's another problem related to iters sitting around a while, too, that I hadn't considered.  If you have to call next() to get the first item then you'll also need some way to query if the iterator is in it's "pre-begin" state.  So the next/atend approach only really works when there's no 'value' and 'next' is the only way to get the value.  Ick.

So maybe a for-ish way is better than while-ish way:

for (; !iter.end; iter.next) {
    writefln(iter.value);
}

And have iterators start off valid (if not already at the end).


--bb

December 15, 2006

Re: Iterators recap

Posted by Oskar Linde
in reply to Bill Baxter

Oskar Linde

Posted in reply to Bill Baxter

Bill Baxter wrote:
> Sean Kelly wrote:
>> Bill Baxter wrote:
>>
>>> Sean Kelly wrote:
>>>
>>>> Bill Baxter wrote:
>>>>
>>>>> Sean Kelly wrote:
>>>>>
>>>>>> Bill Baxter wrote:
>>>>>>
>>>>>>>
>>>>>>> Too bad the iterator discussions fizzled out without anything getting decided.
>>>>>>
>>>>>>
>>>>>> I think the design was pretty well settled when things ended.  The next step would be a sample implementation so folks could wrangle over syntax.  It's on my "to do" list, but I'm a bit short on free time at the moment.
>>>>>
>>>>>
>>>>> Huh.  So what is your understanding of what the consensus was?
>>>>
>>>>
>>>> Java-style, with random access iterators overloading array operations. I can't remember if there was any clear preference for the hasNext/getNext vs. the atEnd/getVal approach, but I tend to favor the latter.

Yes, this was basically the consensus.

>>>
>>>
>>> I like next/value/(ptr) myself.
>>> Makes for very succinct while loops.  ptr available where it makes sense.
>>>
>>> while(iter.next()) {  writefln(iter.value); }
>>>
>>> while(iter.next()) {  *iter.ptr = 17; }
>>
>>
>> It's great for loops, but can be confusing when iterators stick around for a while.  Say I have some code that operates on the current value of an iterator.  With the next() approach, how do I know whether the iterator is valid?  Still, perhaps a hybrid approach is best.  Have next return true on success and offer an atEnd property as well.
> 
> Good point.  There's another problem related to iters sitting around a while, too, that I hadn't considered.  If you have to call next() to get the first item then you'll also need some way to query if the iterator is in it's "pre-begin" state.  So the next/atend approach only really works when there's no 'value' and 'next' is the only way to get the value.  Ick.
> 
> So maybe a for-ish way is better than while-ish way:
> 
> for (; !iter.end; iter.next) {
>     writefln(iter.value);
> }
> 
> And have iterators start off valid (if not already at the end).

Yes, this may be better. This was basically the point the last iterator discussion reached. I would like to make this even more fun by throwing in bidirectional iterators. :)

Just to clarify my terminology here, a forward iterator needs three things:
1(R) a way to reference the data
2(T) a way to traverse the data
3(S) a way to signal the end of the data

If one wants an interface with less than 3 methods, the above behaviors need to be combined in some way. In news://news.digitalmars.com:119/eiqltc$22c1$1@digitaldaemon.com I briefly reviewd three different ways used by three different languages (Java, C#, Python).

Java combines (R+T) through next() and (S) through hasNext()

C# combines (T+S) through MoveNext() and (R) though Current()

Python combines (R+T+S) in next() (throwing an exception for S)

There are downsides with all approaches.

C# requires the iterator to initially be in an invalid state. Java/Python disallows peek()ing without traversal.
Python's exception requires more scaffolding at the use site.

Now to bidirectional iterators.

Separating R, T, S as three different methods, here are two ways a bidirectional iterator could be implemented:

The value-style:

BidirectionalIterator1(T) {
	T* begin;
	T* end;
	T* current;

	bool hasValue() { return cursor != end && cursor != (begin-1); }
	void moveNext() { current++; }
	void movePrevious() { current--; }

	T value() { return *current; }
	T value(T v) { return *current = v; }
}

Silly usage example:

auto i = someIterator;
while(i.hasValue) {
	if (i.value > x)
		i.moveNext;
	else
		i.movePrevious;
}

I like how succinct this turns out, but it has the disadvantage of internally needing two invalid values, one at the start and one at the end. Considering pointers internally, this means a pointer pointing before the start of an array, which is very bad. The other disadvantage is hasValue that has to check for both of these two cases.

One alternative is the Java style version. Where the pointer/cursor lies conceptually between the elements instead of at an element, giving:

BidirectionalIterator2(T) {
	T* begin;
	T* end;
	T* cursor;

	bool hasNext() { return cursor != end; }
	bool hasPrevious() { return cursor != begin; }
	void moveNext() { cursor++; }
	void movePrevious() { cursor--; }

	T next() { return *cursor; }
	T next(T v) { return *cursor = v; }

	T previous() { return *(cursor-1); }
	T previous(T v) { return *(cursor-1); }
}

auto i = someIterator;
while(i.hasNext) {
	if (i.next > 3)
		i.moveNext;
	else
		i.movePrevious;
}

The disadvantage is requiring two additional methods (hasPrevious and previous), but the advantage is that it doesn't need more than 1 illegal value (pointing 1 past the end, which is well supported).

Renaming next/previous into front/back or something may give more or less confusion. Probably the former.

Apart from different method names, are there other variants?

/Oskar

December 28, 2006

Re: Iterators recap

Posted by Sean Kelly
in reply to Oskar Linde

Sean Kelly

Posted in reply to Oskar Linde

Oskar Linde wrote:
> 
> Now to bidirectional iterators.
> 
> Separating R, T, S as three different methods, here are two ways a bidirectional iterator could be implemented:
> 
> The value-style:
> 
> BidirectionalIterator1(T) {
>     T* begin;
>     T* end;
>     T* current;
> 
>     bool hasValue() { return cursor != end && cursor != (begin-1); }
>     void moveNext() { current++; }
>     void movePrevious() { current--; }
> 
>     T value() { return *current; }
>     T value(T v) { return *current = v; }
> }
> 
> Silly usage example:
> 
> auto i = someIterator;
> while(i.hasValue) {
>     if (i.value > x)
>         i.moveNext;
>     else
>         i.movePrevious;
> }
> 
> I like how succinct this turns out, but it has the disadvantage of internally needing two invalid values, one at the start and one at the end.

Not necessarily.  The dummy node could be shared by both.  This is one reason why circular linked-list implementations are so common.  Things do admittedly get a bit weird for binary trees, however.

> Considering pointers internally, this means a pointer pointing
> before the start of an array, which is very bad.

In C++, while a bidirectional iterator may be able to traverse in either direction, it still generally has some natural association as a forward or reverse iterator.  That is, when the iterator is first returned from the sequence it was likely either initialized pointing to the first or last element.  If it's the former then it's associated with forward iteration, if it's the latter then reverse iteration.  For example, the C++ vector iterator is bidirectional (actually, random), but iterators returned by the begin() method are effectively forward iterators while iterators returned by the rbegin() method are effectively reverse iterators.  Perhaps this is the correct approach to take here as well?

> The other disadvantage
> is hasValue that has to check for both of these two cases.

True enough.  But range-checking is range-checking.  If it's possible for an iterator to go out of range and it's possible to check for this condition then one should do so, at least in debug builds.

> One alternative is the Java style version. Where the pointer/cursor lies conceptually between the elements instead of at an element, giving:
> 
> BidirectionalIterator2(T) {
>     T* begin;
>     T* end;
>     T* cursor;
> 
>     bool hasNext() { return cursor != end; }
>     bool hasPrevious() { return cursor != begin; }
>     void moveNext() { cursor++; }
>     void movePrevious() { cursor--; }
> 
>     T next() { return *cursor; }
>     T next(T v) { return *cursor = v; }
> 
>     T previous() { return *(cursor-1); }
>     T previous(T v) { return *(cursor-1); }
> }
> 
> 
> auto i = someIterator;
> while(i.hasNext) {
>     if (i.next > 3)
>         i.moveNext;
>     else
>         i.movePrevious;
> }
>     The disadvantage is requiring two additional methods (hasPrevious and previous), but the advantage is that it doesn't need more than 1 illegal value (pointing 1 past the end, which is well supported).

The other disadvantage is that this returns to the original problem where it's impossible to determine whether an iterator points to a valid location without moving.

> Apart from different method names, are there other variants?

See above.  The C++ approach seems to make more sense to me, though I'm still mulling it over.  There is a definite appeal to having iterators that are truly direction-agnostic, but as you've demonstrated above it does result in slightly weird implementation requirements.  Since D is like C++ in that the one-past-the-end location is valid but one-past-the-beginning is not, it seems reasonable to preserve that behavior here, even if a more Java-like iterator design is used.

Sean

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation