September 09, 2008
Steven Schveighoffer wrote:
> "Andrei Alexandrescu" wrote
> <snip>
> 
> Excellent ideas!  I think the best part is about how you almost never need individual iterators, only ever ranges.  Perfectly explained!
> 
> One issue that you might not have considered is using a container as a data structure, and not using it for algorithms.  For example, how would one specify that one wants to remove a specific element, not a range of elements.  Having a construct that points to a specific element but not any specific end element might be a requirement for non-algorithmic reasons.

I'm sure you know and imply this, but just to clarify for everybody: Modifying the topology of the container is a task carried by the primitives of the container. Ranges can "look" at the topology and change elements sitting in it, but not alter the topology.

Much like in STL, there's a container primitive for removing a range. It will return a range too, namely the range starting at the deleted position. Removing an element is really removing a range of one element - just a particular case.

> Also, some ranges might become invalid later on whereas the iterators would not.  Take for example a Hash container.  If you have to rehash the table, a range now might not make any sense, as the 'end' element may have moved to be before the 'begin' element.  But an iterator that points to a given element will still be valid (and could be used for removal).  In fact, I don't think ranges make a lot of sense for things like Hash where there isn't any defined order.  But you still should have a 'pointer' type to support O(1) removal.

Iterators can be easily defined over hashtables, but indeed they are easily invalidated if implemented efficiently.

> One doesn't see any of these problems with arrays, because with arrays, you are guaranteed to have contiguous memory.
> 
> What I'm trying to say is there may be a reason to have pointers for certain containers, even though they might be unsafe.

A pointer to an element can be taken as &(r.first). The range may or may not allow that, it's up to it.

> My personal pet peeve of many container implementations is not being able to remove elements from a container while iterating.  For example, I have a linked list of open file descriptors, iterate over the list, closing and removing those that are done (which should be O(1) per removal).  In many container implementations, iteration implies immutable, which means you have to add references to the elements to remove to another list to then remove afterwards (somtimes at the cost of O(n) per removal.  grrrr.)  I hope ranges will support removal while traversing.

In STL removing from a list while iterating is easy and efficient, albeit verbose as always:

list<Filedesc> lst;
for (list<Filedesc>::iterator i = lst.begin(); i != lst.end(); )
{
    if (should_remove) i = lst.erase(i);
    else ++i;
}

In Phobos things will be something like:

List!(Filedesc) lst;
for (auto r = lst.all; !r.isEmpty; )
{
    if (should_remove) r = lst.erase(take(1, r));
    else r.next;
}


Andrei
September 09, 2008
Andrei Alexandrescu wrote:
> Manfred_Nowak wrote:
>> Benji Smith wrote:
>>
>>> Given the use of "getNext", I think "hasNext" is a more natural
>>> choice 
>>
>> clapping hands.
> 
> Walter would love that.
> 
> for (R r = getR(); r.hasNext; r.next) { ... }
> 
> Look ma, no negation! Oops, I just materialized one with the exclamation sign.

I just discovered a problem with that. hasNext implies I'm supposed to call next to get the thingie. It should be hasFirst, which is less appealing.

Andrei
September 09, 2008
On Wed, Sep 10, 2008 at 12:30 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>
> I've updated http://ssli.ee.washington.edu/~aalexand/d/tmp/std_range.html with a drawing (right at the beginning) that illustrates what primitives we need. Maybe this will help us choose even better names.

The text says that s should be a subrange of r, but the drawing shows s extending beyond r.  Does it actually need to be a subrange?

--bb
September 09, 2008
On Tue, 09 Sep 2008 10:30:58 -0500, Andrei Alexandrescu wrote:


> I'd like to go with:
> 
> r.first
> r.last
> r.next
> r.pop

LOL ... I was just thinking to myself ... "what's wrong with First and Last? I should suggest them." then I read this post.

"next" is fine, but "pop"? Isn't the pair of "next" called "prev(ious)" and the pair of "pop" called "push". So please, either have next/prev or push/pop, and in that case push/pop looks quite silly.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
September 09, 2008
Andrei Alexandrescu wrote:
> Andrei Alexandrescu wrote:
>> Manfred_Nowak wrote:
>>> Benji Smith wrote:
>>>
>>>> Given the use of "getNext", I think "hasNext" is a more natural
>>>> choice 
>>>
>>> clapping hands.
>>
>> Walter would love that.
>>
>> for (R r = getR(); r.hasNext; r.next) { ... }
>>
>> Look ma, no negation! Oops, I just materialized one with the exclamation sign.
> 
> I just discovered a problem with that. hasNext implies I'm supposed to call next to get the thingie. It should be hasFirst, which is less appealing.
> 
> Andrei

I see where you're coming from, because the range shrinks itself element-by-element as it iterates, eventually disappearing into a zero-element range.

But for me as a consumer of the container, it's a little weird.

When I visualize iteration (beware: silly metaphor ahead), it looks like a frog hopping from one lilly-pad to another. He might have a well-defined start and end point, but the lilly-pads don't evaporate after the frog hops away.

When I see "isEmpty" in the implementation of an iteration routine, it makes me think of a producer/consumer work queue. The list is only empty if the consumer has consumed all the work items from the queue.

I get what you're saying about the iteration-range being empty because it shrinks itself during each step of the iteration. But to me, iteration is an idempotent operation, so something non-empty before the iteration should not be empty after the iteration.

--benji
September 09, 2008
On Tue, 09 Sep 2008 10:45:53 -0500, Andrei Alexandrescu wrote:


>          ref int left() { ... }

Is "left" a "movement in a specific direction" as in "go left at the next lights" or the "amount of stuff left over"? It is a bit ambiguous. Even if it is a direction, is it moving towards the first or the last item? It is not self-evident. As a user of the Latin alphabet I'd assume it was going towards the first item but a Hebrew or Arabic users might assume it was heading towards the end.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
September 09, 2008
Andrei Alexandrescu wrote:
> I put together a short document for the range design. I definitely missed about a million things and have been imprecise about another million, so feedback would be highly appreciated. See:
> 
> http://ssli.ee.washington.edu/~aalexand/d/tmp/std_range.html

Just thinking off the top of my head...

How well would the proposal support a producer/consumer work queue, or a signal/slot implementation?

A work-queue consumer would view the queue as an infinite range with no end, but the producer would view that same queue as an infinite range with no beginning. And, conceivably, you could have "conduits" between the producer and consumer that would view that same queue as having neither a beginning nor an end.

I'm making no judgment about whether the proposal supports or doesn't support that kind of model. I'm just putting the idea out there for consideration.

--benji
September 09, 2008
> I disagree with that as I think that would be one perfect place to simplify the language thus fulfilling bearophile's and many others' wish.

I agree with this. It is a good idea, and it is explained in a very good way. You have my support Mr. Alexandrescu.
September 09, 2008
"Andrei Alexandrescu" wrote
> Steven Schveighoffer wrote:
>> "Andrei Alexandrescu" wrote
>> <snip>
>>
>> Excellent ideas!  I think the best part is about how you almost never need individual iterators, only ever ranges.  Perfectly explained!
>>
>> One issue that you might not have considered is using a container as a data structure, and not using it for algorithms.  For example, how would one specify that one wants to remove a specific element, not a range of elements.  Having a construct that points to a specific element but not any specific end element might be a requirement for non-algorithmic reasons.
>
> I'm sure you know and imply this, but just to clarify for everybody: Modifying the topology of the container is a task carried by the primitives of the container. Ranges can "look" at the topology and change elements sitting in it, but not alter the topology.

I agree.  There are cases where just an iterator is necessary.  For example, a linked-list implementation where the length is calculated instead of stored.  But that is the exception, the rule should be that you always ask the container to alter the topology.

>
> Much like in STL, there's a container primitive for removing a range. It will return a range too, namely the range starting at the deleted position. Removing an element is really removing a range of one element - just a particular case.

Yes, but the problem I see is how do you specify a range of one element.  In the case of an array, it is easy because you always know that no matter what happens to a container, the end of '1' element is a pointer to the next element in memory.  In the case of other containers, which could have changed since you obtained the range, you cannot be sure the 'range of 1' hasn't changed.  For instance, I would assume that a linked list range has two pointers, one to the first element, and one to the element just past the last element in the range.  But what if an element is inserted inbetween? Then your range suddenly got bigger.

What I'm trying to say is, maybe it would be desirable to have a pointer to exactly one element, instead of a range that could possibly change. Operations on that pointer type would be supported just like the operations on the ranges, but is more specific.

>
>> Also, some ranges might become invalid later on whereas the iterators would not.  Take for example a Hash container.  If you have to rehash the table, a range now might not make any sense, as the 'end' element may have moved to be before the 'begin' element.  But an iterator that points to a given element will still be valid (and could be used for removal). In fact, I don't think ranges make a lot of sense for things like Hash where there isn't any defined order.  But you still should have a 'pointer' type to support O(1) removal.
>
> Iterators can be easily defined over hashtables, but indeed they are easily invalidated if implemented efficiently.

Iterators don't have to be invalidated, but ranges would.

>
>> One doesn't see any of these problems with arrays, because with arrays, you are guaranteed to have contiguous memory.
>>
>> What I'm trying to say is there may be a reason to have pointers for certain containers, even though they might be unsafe.
>
> A pointer to an element can be taken as &(r.first). The range may or may not allow that, it's up to it.

I thought you stated that 'pointers' shouldn't be allowed, only ranges?  In general, I agree with that, but I think the ability to use a pointer type instead of ranges has advantages in some cases.

>
>> My personal pet peeve of many container implementations is not being able to remove elements from a container while iterating.  For example, I have a linked list of open file descriptors, iterate over the list, closing and removing those that are done (which should be O(1) per removal).  In many container implementations, iteration implies immutable, which means you have to add references to the elements to remove to another list to then remove afterwards (somtimes at the cost of O(n) per removal. grrrr.)  I hope ranges will support removal while traversing.
>
> In STL removing from a list while iterating is easy and efficient, albeit verbose as always:
>
> list<Filedesc> lst;
> for (list<Filedesc>::iterator i = lst.begin(); i != lst.end(); )
> {
>     if (should_remove) i = lst.erase(i);
>     else ++i;
> }
>
> In Phobos things will be something like:
>
> List!(Filedesc) lst;
> for (auto r = lst.all; !r.isEmpty; )
> {
>     if (should_remove) r = lst.erase(take(1, r));
>     else r.next;
> }

I prefer the dcollections syntax, but I did write it, so that's to be expected :) :

foreach(ref doPurge, fd; lst.purger)
   doPurge = shouldIRemove(fd);

Or if you prefer iterator-style syntax:

for(auto i = lst.begin; i != lst.end;)
{
    if shouldIRemove(i.value) i = lst.remove(i);
    else ++i;
}

But yes, I see that ranges are used for removal, and that should be supported in ordered containers.  But the notion of storing a reference to a single element as a 'range of 1' in certain containers is troublesome I think.

-Steve


September 09, 2008
Andrei Alexandrescu:
> In Phobos things will be something like:

List!(Filedesc) lst;
for (auto r = lst.all; !r.isEmpty; ) {
     if (should_remove)
          r = lst.erase(take(1, r));
     else
          r.next;
}

It may be better to invent and add some sugar to that, and foreach helps, maybe something like:

List!(Filedesc) lst;
foreach (ref r; lst.all) {
     if (predicate(lst.item(r)))
          r = lst.erase(r);
     else
          r = r.next();
}

I think that code of mine isn't good yet :-)

Bye,
bearophile
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18