RFC on range design for D2 (page 16) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » Announce » RFC on range design for D2 (page 16)

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Bill Baxter

Andrei Alexandrescu

Posted in reply to Bill Baxter

Bill Baxter wrote:
> But upon further reflection I think it may be that it's just not what
> I would call a bidirectional range.  By that I mean it's not good at
> solving the problems that a bidirectional iterator in C++ is good for.

It's good. I proved that constructively for std.algorithm, which of course doesn't stand. But I've also proved it theoretically informally to myself. Please imagine an algorithm that bidir iterators do and bidir ranges don't.

>  Your bidir range may be useful (though I'm not really convinced that
> very many algorithms need what it provides) --  but I think one also
> needs an iterator that's good at what C++'s bidir iterators are good
> at, i.e. moving the active cursor backwards or forwards.  I would call
> your construct more of a "double-headed" range than a bidirectional
> one.

Oh, one more thing. If you study any algorithm that uses bidirectional iterators (such as reverse or Stepanov's partition), you'll notice that ALWAYS WITHOUT EXCEPTION there's two iterators involved. One moves up, the other moves down. This is absolutely essential because it tells that a bidirectional range models all a bidirectional iterator could ever do. If you can move some bidirectional iterator down, then definitely you know its former boundary so you can model that move with a bidirectional range.

This is fundamental. Ranges NEVER grow. They ALWAYS shrink. Why? Simple: because a range has no idea what's outside of itself. It starts life with information of its limits from the container, and knows nothing about what's outside those limits. Consequently it ALWAYS WITHOUT EXCEPTION shrinks.

Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Bill Baxter
in reply to Andrei Alexandrescu

Bill Baxter

Posted in reply to Andrei Alexandrescu

On Thu, Sep 11, 2008 at 1:30 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Bill Baxter wrote:
>>
>> However a range isn't, generally speaking, a list.  It's a way to traverse or access data that may or may not be a list.  For something like an unbounded generator, it is odd to speak of the "first".  Such an object has a current value and a "next", but the value you can look at right now is only the "first" by a bit of a terminology stretch.
>
> Agreed. The problem with "current" instead of "first" is that there's no clear correspondent for "the last that the current will be". First and last are obvious. Current and last are... well, not bad either :o).
>
>> I think using list terminology unnecessarily confuses the iterating construct that does the accessing with the container being accessed. The range is not the container.  The range consists of a place where you are, and a termination condition.
>
> No. A bidirectional range also knows the last place you'll ever be, and is able to manipulate it.

That's just a mutable termination condition.  Still fits my description.

>>  The range is not "empty" or
>> "full" because it does not actually contain elements.
>
> It is because a range is a view. The view can reduce to nothing. In math, an interval can be "empty". That doesn't mean it made all real numbers disappear :o).

The other problem with empty is that it doesn't generalize to what I happen think a bidirectional range should be, one with .next .prev, .hasNext and .hasPrev.

Your bidir iterator in C++ parlance is a forward iterator and a reverse iterator operating on the same sequence.  I can't really think of any algorithms other than the one you showed that use such a pair.

On the other hand my bidir is useful in all the places a C++ bidir iterator is useful.  Any time you need to scan a cursor back and forth.  It basically maps directly onto the operation a doubly-linked list is good at.  But could be used in traversing any tree-like data structure too, I think.

>> Similarly, using list terminology led you to "pop".  But pop on a range does not actually remove any content.  Pop just moves the goal post on one end.
>
> Correct. Then how would you name'em?

I made one proposal on digitalmars.D and I'm still waiting for comments.

>> And then there's the various union/diff stuff, which everyone seems to find confusing.  I think much of that confusion and mental overhead just goes away if you think of a range as a good old iterator plus a stopping condition.
>
> I like before and after. Besides, the challenge is that you come with something that's not confusing.

Yeh, before and after aren't too bad.

>>> Names for the before and after range operations are still in the air...
>>>
>>> Are you referring to the "range" name itself?
>>
>> That could be part of the reason for this tendency to try to assign list-like names to the parts.  If it were called a "bounded iterator" I think that would better describe the perspective I'm pushing, and naturally lead to choices like "atEnd" instead of "isEmpty".
>
> Words are powerful. Phrases are less powerful. I'll never ever settle on anything longer than ONE word for the concept. Ranges came to mind because boost uses them with a similar meaning.

Yeh, I don't really have a problem with calling them ranges, as long as people keep in mind they're really bounded iterators.  :-)

--bb

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Bill Baxter

Andrei Alexandrescu

Posted in reply to Bill Baxter

Bill Baxter wrote:
> The other problem with empty is that it doesn't generalize to what I
> happen think a bidirectional range should be, one with .next .prev,
> .hasNext and .hasPrev.

hasNext and hasPrev are not orthogonal and add unnecessarily complicated. Is there a range that has next but not prev, or vice versa? No, Sir. There is an "there's still meat on the plate" condition and that's all needed.

> Your bidir iterator in C++ parlance is a forward iterator and a
> reverse iterator operating on the same sequence.  I can't really think
> of any algorithms other than the one you showed that use such a pair.
>
> On the other hand my bidir is useful in all the places a C++ bidir
> iterator is useful.  Any time you need to scan a cursor back and
> forth.  It basically maps directly onto the operation a doubly-linked
> list is good at.  But could be used in traversing any tree-like data
> structure too, I think.

--it is easily done with range primitives if you've saved the initial position of it.

>>> Similarly, using list terminology led you to "pop".  But pop on a
>>> range does not actually remove any content.  Pop just moves the goal
>>> post on one end.
>> Correct. Then how would you name'em?

r.atEnd
r.value
r.next
r.moveTo(s)
r.moveToEndOf(s)
r.last
r.pop
r.moveEndToEndOf(s) / moveEndTo(s)

I see in another post:

r.atStart

which I think is a design faux pas. Aside from that, things seem workable. But honestly I don't see how they bring a world of difference, nor had I a slap on my forehead moment when seeing the primitive names (as I did with before and after).

Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by David Gileadi
in reply to Sergey Gromov

David Gileadi

Posted in reply to Sergey Gromov

Sergey Gromov wrote:
> If you draw a parallel between a range and a stack of paper, the shrink methods would probably be pop/snap...  I'd also propose next() for moving the start and prev() for moving the end.  It sounds a bit misleading but, on the other hand, it closely resembles forward and backward iteration with the opposite end of a range representing the iteration limit.  Or maybe forward()/backward(), or fwd()/back()?

Perhaps reduce() instead of pop() for moving the end?

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to David Gileadi

Andrei Alexandrescu

Posted in reply to David Gileadi

David Gileadi wrote:
> Sergey Gromov wrote:
>> If you draw a parallel between a range and a stack of paper, the shrink methods would probably be pop/snap...  I'd also propose next() for moving the start and prev() for moving the end.  It sounds a bit misleading but, on the other hand, it closely resembles forward and backward iteration with the opposite end of a range representing the iteration limit.  Or maybe forward()/backward(), or fwd()/back()?
> 
> Perhaps reduce() instead of pop() for moving the end?

I love reduce! Thought of it as well. Unfortunately the term is loaded with reduction of a binary operation over a range, as e.g. in std.algorithm.

I think shrink() is reasonable. next() moves to the next thing. shrink() shrinks the set of dudes I can reach.

Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu

Steven Schveighoffer

Posted in reply to Andrei Alexandrescu

"Andrei Alexandrescu" wrote
> Bill Baxter wrote:
>> But upon further reflection I think it may be that it's just not what I would call a bidirectional range.  By that I mean it's not good at solving the problems that a bidirectional iterator in C++ is good for.
>
> It's good. I proved that constructively for std.algorithm, which of course doesn't stand. But I've also proved it theoretically informally to myself. Please imagine an algorithm that bidir iterators do and bidir ranges don't.

Any iterative algorithm where the search might go up or down might be a candidate.  Although I think you have a hard time showing one that needs strictly bidirectional iterators and not random access iterators.  Perhaps a stream represented as a linked list?  Imagine a video stream coming in, where the player buffers 10 seconds of data for decoding, and keeps 10 seconds of data buffered behind the current spot.  If the user pauses the video, then wants to play backwards for 5 seconds, what kind of structure would you use to represent the 'current point in time'?  A bidir range doesn't cut it, because it can only move one direction at a time.  You would need 2 bidir ranges, but since you can't 'grow' the ranges, you can't add stuff as it is consumed from the forward range to the backwards range, or am I wrong about that?  So how do you continually update your backwards iterator?  I suppose you could simply 'generate' the backwards iterator when needed by diff'ing with the all range, but it seems unnecessarily cumbersome.  In fact, you'd need to regenerate both ranges as data is removed from the front and added to the back (because the ends are continually changing).  Perhaps a meta-range which has 2 bidir ranges in it can be provided.  It would be simple enough to implement using existing ranges, but might have unnecessary performance issues.

My belief is that ranges should be the form of input to algorithms, but iterators should be provided for using containers as general data structures.  Similar to how strings are represented by arrays/slices, but iterators (pointers) exist if you need them.

I'll probably move forward with this model in dcollections, I really like the range idea, and in general the view on how ranges are akin to slices. But I also like having access to iterators for other functions.

-Steve

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Steven Schveighoffer

Andrei Alexandrescu

Posted in reply to Steven Schveighoffer

Steven Schveighoffer wrote:
> "Andrei Alexandrescu" wrote
>> Bill Baxter wrote:
>>> But upon further reflection I think it may be that it's just not what
>>> I would call a bidirectional range.  By that I mean it's not good at
>>> solving the problems that a bidirectional iterator in C++ is good for.
>> It's good. I proved that constructively for std.algorithm, which of course doesn't stand. But I've also proved it theoretically informally to myself. Please imagine an algorithm that bidir iterators do and bidir ranges don't.
> 
> Any iterative algorithm where the search might go up or down might be a candidate.  Although I think you have a hard time showing one that needs strictly bidirectional iterators and not random access iterators.  Perhaps a stream represented as a linked list?  Imagine a video stream coming in, where the player buffers 10 seconds of data for decoding, and keeps 10 seconds of data buffered behind the current spot.  If the user pauses the video, then wants to play backwards for 5 seconds, what kind of structure would you use to represent the 'current point in time'?  A bidir range doesn't cut it, because it can only move one direction at a time.

Of course it does. You just remember the leftmost point in time you need to remember. Then you use range primitives to get to where you want. Maybe a better abstraction for all that is a sliding window though.

> You would need 2 bidir ranges, but since you can't 'grow' the ranges, you can't add stuff as it is consumed from the forward range to the backwards range, or am I wrong about that?  So how do you continually update your backwards iterator?  I suppose you could simply 'generate' the backwards iterator when needed by diff'ing with the all range, but it seems unnecessarily cumbersome.  In fact, you'd need to regenerate both ranges as data is removed from the front and added to the back (because the ends are continually changing).  Perhaps a meta-range which has 2 bidir ranges in it can be provided.  It would be simple enough to implement using existing ranges, but might have unnecessary performance issues.

You don't need a meta range, though it's a good idea to have it as a convenience structure. All you need is store the two ranges and do range operations on them.

Notice that "a range can't grow" is different from "a range can't be assigned from a larger range". In particular, a range operation can return a range larger than both input ranges. But not larger than their union :o).

> My belief is that ranges should be the form of input to algorithms, but iterators should be provided for using containers as general data structures.  Similar to how strings are represented by arrays/slices, but iterators (pointers) exist if you need them.

If we agree it's better without iterators if not needed, we'd need a strong case to add them. Right now I have a strong case against them.

> I'll probably move forward with this model in dcollections, I really like the range idea, and in general the view on how ranges are akin to slices. But I also like having access to iterators for other functions.

Which functions?


Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Bill Baxter
in reply to Andrei Alexandrescu

Bill Baxter

Posted in reply to Andrei Alexandrescu

On Thu, Sep 11, 2008 at 1:45 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Bill Baxter wrote:
>>
>> But upon further reflection I think it may be that it's just not what I would call a bidirectional range.  By that I mean it's not good at solving the problems that a bidirectional iterator in C++ is good for.
>
> It's good. I proved that constructively for std.algorithm, which of course doesn't stand. But I've also proved it theoretically informally to myself. Please imagine an algorithm that bidir iterators do and bidir ranges don't.

Here's one from DinkumWare's <algorithm>:

template<class _BidIt1,
	class _BidIt2,
	class _BidIt3> inline
	_BidIt3 _Merge_backward(_BidIt1 _First1, _BidIt1 _Last1,
		_BidIt2 _First2, _BidIt2 _Last2, _BidIt3 _Dest, _Range_checked_iterator_tag)
	{	// merge backwards to _Dest, using operator<
	for (; ; )
		if (_First1 == _Last1)
			return (_STDEXT unchecked_copy_backward(_First2, _Last2, _Dest));
		else if (_First2 == _Last2)
			return (_STDEXT unchecked_copy_backward(_First1, _Last1, _Dest));
		else if (_DEBUG_LT(*--_Last2, *--_Last1))
			*--_Dest = *_Last1, ++_Last2;
		else
			*--_Dest = *_Last2, ++_Last1;
	}

You can probably work around it some way, but basically it's using the ability to ++ and -- on the same end as a sort of "peek next".

Here's another, an insertion sort:

template<class _BidIt,
	class _Ty> inline
	void _Insertion_sort1(_BidIt _First, _BidIt _Last, _Ty *)
	{	// insertion sort [_First, _Last), using operator<
	if (_First != _Last)
		for (_BidIt _Next = _First; ++_Next != _Last; )
			{	// order next element
			_BidIt _Next1 = _Next;
			_Ty _Val = *_Next;

			if (_DEBUG_LT(_Val, *_First))
				{	// found new earliest element, move to front
				_STDEXT unchecked_copy_backward(_First, _Next, ++_Next1);
				*_First = _Val;
				}
			else
				{	// look for insertion point after first
				for (_BidIt _First1 = _Next1;
					_DEBUG_LT(_Val, *--_First1);
					_Next1 = _First1)
					*_Next1 = *_First1;	// move hole down
				*_Next1 = _Val;	// insert element in hole
				}
			}
	}

I /think/ that's taking advantage of going both ways on the same iterator (or at least copies of the same iterator), but the code is a little hard to read.

Part of my argument here is that it's more natural and requires less cognitive load to think of things in terms of moving a cursor back and forth.  So you won't convince me by constructing clever range unions and differences to achieve the same thing as a simple ++ and -- can do. :-)

Also a cursor that can go forward and backwards inbetween two limits is exactly what is easy to do with a doubly linked list.  If you know how to use a doubly linked list you know how to use my version of bidir ranges.  That's true in all cases where you are using a doubly-linked list.  For yours you have to think about how to map what you want to do onto the operations that are actually available.  To me that's clearly a greater cognitive load.

Another example is a function that is supposed to put a value back into its proper sorted place.  Say you had a sorted list and now the value of one node has been modified.  Write the function that puts that value back in its rightful place.  The natural way to do it is with a range that has a cursor pointing to the modified node that can be moved either back or forward.

Also I see the function "std::advance" is used quite a lot in this implementation of std::algorithm.  That moves the cursor forwards or backwards N steps depending on the sign of N.

>>  Your bidir range may be useful (though I'm not really convinced that
>> very many algorithms need what it provides) --  but I think one also
>> needs an iterator that's good at what C++'s bidir iterators are good
>> at, i.e. moving the active cursor backwards or forwards.  I would call
>> your construct more of a "double-headed" range than a bidirectional
>> one.
>
> Oh, one more thing. If you study any algorithm that uses bidirectional iterators (such as reverse or Stepanov's partition), you'll notice that ALWAYS WITHOUT EXCEPTION there's two iterators involved. One moves up, the other moves down. This is absolutely essential because it tells that a bidirectional range models all a bidirectional iterator could ever do. If you can move some bidirectional iterator down, then definitely you know its former boundary so you can model that move with a bidirectional range.

This does seem to be true of a many of the algorithms that use bidirs in std::algorithm, which did surprise me.  Actually seems to me that these types of algorithms are only using bidirectional iterators for a technicality -- because you can't compare a forward iterator and a reverse iterator.  The bidirectionality of the iterator is not really material.  One only needs the ++ op for one and the -- op for the other.  That says to me the name of the range that does these two things should be something other than "bidirectional", because bidirectionality is not really the key property.  "Two-headed range" or "squeeze range" or "pinch range" might be good names.

But anyway, I am convinced that your shrinking range type is useful.

> This is fundamental. Ranges NEVER grow. They ALWAYS shrink. Why? Simple: because a range has no idea what's outside of itself. It starts life with information of its limits from the container, and knows nothing about what's outside those limits. Consequently it ALWAYS WITHOUT EXCEPTION shrinks.

Doesn't seem to be quite so absolute from my perusal of std::algorithm.

--bb

September 10, 2008

Re: RFC on range design for D2

Posted by Andrei Alexandrescu
in reply to Bill Baxter

Andrei Alexandrescu

Posted in reply to Bill Baxter

Bill Baxter wrote:
> Part of my argument here is that it's more natural and requires less
> cognitive load to think of things in terms of moving a cursor back and
> forth.  So you won't convince me by constructing clever range unions
> and differences to achieve the same thing as a simple ++ and -- can
> do. :-)

I agree, and I agreed in the draft on ranges, that code using ranges can on occasion be more awkward than code using iterators. I think their advantages do outweigh this disadvantage.

>> This is fundamental. Ranges NEVER grow. They ALWAYS shrink. Why? Simple:
>> because a range has no idea what's outside of itself. It starts life with
>> information of its limits from the container, and knows nothing about what's
>> outside those limits. Consequently it ALWAYS WITHOUT EXCEPTION shrinks.
> 
> Doesn't seem to be quite so absolute from my perusal of std::algorithm.

Code using iterators will naturally avail itself of all of their advantages. Code using ranges will do the same. From my experience with rewriting std.algorithm, the working style is a bit different. On occasion iterators are indeed more flexible. But overall my code has reduced in size and became safer because ranges are a higher-level abstraction. Also often code using ranges is easier to follow because there are fewer variables with more apparent meaning, and the progress of the algorithm is easier to follow by tracking range shrinking.

Andrei

September 10, 2008

Re: RFC on range design for D2

Posted by Bill Baxter
in reply to Andrei Alexandrescu

Bill Baxter

Posted in reply to Andrei Alexandrescu

On Thu, Sep 11, 2008 at 2:44 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Steven Schveighoffer wrote:
>>
>> "Andrei Alexandrescu" wrote
>>>
>>> Bill Baxter wrote:
>>>>
>>>> But upon further reflection I think it may be that it's just not what I would call a bidirectional range.  By that I mean it's not good at solving the problems that a bidirectional iterator in C++ is good for.
>>>
>>> It's good. I proved that constructively for std.algorithm, which of
>>> course doesn't stand. But I've also proved it theoretically informally to
>>> myself. Please imagine an algorithm that bidir iterators do and bidir ranges
>>> don't.
>>
>> Any iterative algorithm where the search might go up or down might be a candidate.  Although I think you have a hard time showing one that needs strictly bidirectional iterators and not random access iterators.  Perhaps a stream represented as a linked list?  Imagine a video stream coming in, where the player buffers 10 seconds of data for decoding, and keeps 10 seconds of data buffered behind the current spot.  If the user pauses the video, then wants to play backwards for 5 seconds, what kind of structure would you use to represent the 'current point in time'?  A bidir range doesn't cut it, because it can only move one direction at a time.
>
> Of course it does. You just remember the leftmost point in time you need to remember. Then you use range primitives to get to where you want. Maybe a better abstraction for all that is a sliding window though.

Cognitive load...
What if I want to write a nice standalone function that takes a
pointer to where we are and manipulates it?  I have to pass that
function two iterators I suppose?  One is (begin,current) the other
(current,end), and as I iterate I have to move both the second of the
first and the first of second?  All just to do something that should
be trivial with a linked list.

I agree that your pinch range is needed, but I also see a need for something that maps more directly onto the features of a doubly linked list.

--bb

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation