February 22, 2011
On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
> Jonathan M Davis wrote:
> > Okay, removing elements from a container sucks right now. You can do stuff like removeAny (generally pretty useless IMHO) or removeFront just fine, but removing an arbitrary range from a container just plain sucks.
> > 
> > remove takes a range type which is the range type for the container that it's on. That makes sense. It's obviously not going to be able to a take an arbitrary range and remove that from itself. How would it know which elements in that range corresponded to which elements in itself - especially when it could be a range which skips elements or something similar? So, _that_ part makes sense.
> 
> I don't understand why not. Given, as you said, that a lot of functions return a new type it would make sense. My answer as to how would be something like:
> 
> while range not empty
>     container.remove range.front
>     range.popFront
> 
> Is there a problem with that?

It's horribly inefficient for one. Also, I'm not sure that the range is valid any more after the remove, since it's front was removed. popFront may not work correctly.

> > But have you actually tried to get a range of the appropriate type to remove from a container? It seems like almost ever function in std.range and std.algorithm returns a new range type, making them completely useless for processing a range to be removed from a container.
> > 
> > I was looking to remove a single element for a RedBlackTree. The best function that I could think to get the proper range was findSplit. The middle portion of the return value would be the portion that I was looking for. I'd take that and pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its implementation and the first two portions of the result of findSplit aren't the right range type.
> > 
> > So, what do I do? The _only_ thing that I can think of at the moment is to use find to locate the beginning of the range that I want, take the length of the range with walkLength and then use popBackN to pop of the elements I don't want. e.g.
> 
> The table in the docs mention stableRemoveAny(v) which says "Same as
> c.removeAny(v), but is guaranteed to not invalidate any iterators."
> 
> Though c.removeAny(v) itself is not listed in the table nor implemented in RedBlackTree, isn't this the right function for the job? (I take it v stands for a value of the container, rather than a range). builtin aa's also implement this function.

removeAny doesn't solve the problem. For starters, the table appears to be wrong in that stableRemoveAny takes a value and removeAny doesn't. So, either stableRemoveAny is supposed to not take a value or there's supposed to be another version of removeAny which takes a value, and it's missing. So, I don't know what exactly is intended with stableRemoveAny.

Regardless of that, however, remove still needs to work. It's perfectly valid to want to remove a range of elements rather than just one.

- Jonathan M Davis
February 22, 2011
Jonathan M Davis wrote:

> On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
>> Jonathan M Davis wrote:
>> > Okay, removing elements from a container sucks right now. You can do stuff like removeAny (generally pretty useless IMHO) or removeFront just fine, but removing an arbitrary range from a container just plain sucks.
>> > 
>> > remove takes a range type which is the range type for the container that it's on. That makes sense. It's obviously not going to be able to a take an arbitrary range and remove that from itself. How would it know which elements in that range corresponded to which elements in itself - especially when it could be a range which skips elements or something similar? So, _that_ part makes sense.
>> 
>> I don't understand why not. Given, as you said, that a lot of functions return a new type it would make sense. My answer as to how would be something like:
>> 
>> while range not empty
>>     container.remove range.front
>>     range.popFront
>> 
>> Is there a problem with that?
> 
> It's horribly inefficient for one. Also, I'm not sure that the range is valid any more after the remove, since it's front was removed. popFront may not work correctly.

It's a matter of range (in)validation right? I admit at this point I'm not sure how it is supposed to work. remove + popFront is a bad idea if the range is a view into the contained from where stuff is removed, but in c++ this works for map: some_map.erase(i++) where i is an iterator. So it depends on the type of container what is supported I think. But perhaps this falls under Walter's category of useless wankery, since it is easy to write yourself.

>> > But have you actually tried to get a range of the appropriate type to remove from a container? It seems like almost ever function in std.range and std.algorithm returns a new range type, making them completely useless for processing a range to be removed from a container.
>> > 
>> > I was looking to remove a single element for a RedBlackTree. The best function that I could think to get the proper range was findSplit. The middle portion of the return value would be the portion that I was looking for. I'd take that and pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its implementation and the first two portions of the result of findSplit aren't the right range type.
>> > 
>> > So, what do I do? The _only_ thing that I can think of at the moment is to use find to locate the beginning of the range that I want, take the length of the range with walkLength and then use popBackN to pop of the elements I don't want. e.g.
>> 
>> The table in the docs mention stableRemoveAny(v) which says "Same as
>> c.removeAny(v), but is guaranteed to not invalidate any iterators."
>> 
>> Though c.removeAny(v) itself is not listed in the table nor implemented in RedBlackTree, isn't this the right function for the job? (I take it v stands for a value of the container, rather than a range). builtin aa's also implement this function.
> 
> removeAny doesn't solve the problem. For starters, the table appears to be wrong in that stableRemoveAny takes a value and removeAny doesn't. So, either stableRemoveAny is supposed to not take a value or there's supposed to be another version of removeAny which takes a value, and it's missing. So, I don't know what exactly is intended with stableRemoveAny.
> 
> Regardless of that, however, remove still needs to work. It's perfectly valid to want to remove a range of elements rather than just one.
> 
> - Jonathan M Davis

(stable)removeAny does not solve the problem of removing ranges, but there must be *something* that solves the problem of removing one element. You found one way to do it for RedBlackTree (i gave up), but if I am not mistaken it doesn't have the right complexity and efficient removal is a key property of this container.
February 22, 2011
On Tuesday 22 February 2011 03:52:34 Lutger Blijdestijn wrote:
> Jonathan M Davis wrote:
> > On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
> >> Jonathan M Davis wrote:
> >> > Okay, removing elements from a container sucks right now. You can do stuff like removeAny (generally pretty useless IMHO) or removeFront just fine, but removing an arbitrary range from a container just plain sucks.
> >> > 
> >> > remove takes a range type which is the range type for the container that it's on. That makes sense. It's obviously not going to be able to a take an arbitrary range and remove that from itself. How would it know which elements in that range corresponded to which elements in itself - especially when it could be a range which skips elements or something similar? So, _that_ part makes sense.
> >> 
> >> I don't understand why not. Given, as you said, that a lot of functions return a new type it would make sense. My answer as to how would be something like:
> >> 
> >> while range not empty
> >> 
> >>     container.remove range.front
> >>     range.popFront
> >> 
> >> Is there a problem with that?
> > 
> > It's horribly inefficient for one. Also, I'm not sure that the range is valid any more after the remove, since it's front was removed. popFront may not work correctly.
> 
> It's a matter of range (in)validation right? I admit at this point I'm not sure how it is supposed to work. remove + popFront is a bad idea if the range is a view into the contained from where stuff is removed, but in c++ this works for map: some_map.erase(i++) where i is an iterator. So it depends on the type of container what is supported I think. But perhaps this falls under Walter's category of useless wankery, since it is easy to write yourself.

It depends on the implementation. It's possible that iterating an iterator or popping the front of a range where that iterator no longer points to a valid element actually works. It's also possible that, because it no longer points to a valid element, it _doesn't_ work. It wouldn't surprise me at all that some_map.erase(i++) isn't guaranteed to work in C++ at all but that you've just gotten lucky with the STL implementation that you've been using. I'd have to research it to be sure. But my guess would be that you've been lucky.

> >> > But have you actually tried to get a range of the appropriate type to remove from a container? It seems like almost ever function in std.range and std.algorithm returns a new range type, making them completely useless for processing a range to be removed from a container.
> >> > 
> >> > I was looking to remove a single element for a RedBlackTree. The best function that I could think to get the proper range was findSplit. The middle portion of the return value would be the portion that I was looking for. I'd take that and pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its implementation and the first two portions of the result of findSplit aren't the right range type.
> >> > 
> >> > So, what do I do? The _only_ thing that I can think of at the moment is to use find to locate the beginning of the range that I want, take the length of the range with walkLength and then use popBackN to pop of the elements I don't want. e.g.
> >> 
> >> The table in the docs mention stableRemoveAny(v) which says "Same as
> >> c.removeAny(v), but is guaranteed to not invalidate any iterators."
> >> 
> >> Though c.removeAny(v) itself is not listed in the table nor implemented in RedBlackTree, isn't this the right function for the job? (I take it v stands for a value of the container, rather than a range). builtin aa's also implement this function.
> > 
> > removeAny doesn't solve the problem. For starters, the table appears to be wrong in that stableRemoveAny takes a value and removeAny doesn't. So, either stableRemoveAny is supposed to not take a value or there's supposed to be another version of removeAny which takes a value, and it's missing. So, I don't know what exactly is intended with stableRemoveAny.
> > 
> > Regardless of that, however, remove still needs to work. It's perfectly valid to want to remove a range of elements rather than just one.
> > 
> > - Jonathan M Davis
> 
> (stable)removeAny does not solve the problem of removing ranges, but there must be *something* that solves the problem of removing one element. You found one way to do it for RedBlackTree (i gave up), but if I am not mistaken it doesn't have the right complexity and efficient removal is a key property of this container.

If you could just do take or takeExactly on the range you got from find and pass it to remove, then it wouldn't matter whether you wanted to remove 1 element or many. The only difference is the number passed to take. The problem is that that doesn't work, because take returns the wrong range type. It has no way to get n elements from the front of the range and have them be in a range which is the same type as the orignal. So, I don't think that one element vs many is really the problem here. It's the inability to get the first n elements of a forward range and still retain the same range type.

Regardless, you're right. the solution I found most definitely does _not_ have the right complexity. Because you have to use walkLength on the range that you get from find, it becomes O(n) instead of O(log n), and O(log n) is what it's _supposed_ to be for a red black tree. As I said, I don't believe that the current situation with remove is tenable, and the only solution that I see at the moment is to change forward range so that it has a function which returns the first n elements of that range with the same range type as that range.

- Jonathan M Davis
February 22, 2011
On 2/21/11 8:55 PM, Jonathan M Davis wrote:
> Okay, removing elements from a container sucks right now. You can do stuff like
> removeAny (generally pretty useless IMHO) or removeFront just fine, but removing
> an arbitrary range from a container just plain sucks.
>
> remove takes a range type which is the range type for the container that it's
> on. That makes sense. It's obviously not going to be able to a take an arbitrary
> range and remove that from itself. How would it know which elements in that
> range corresponded to which elements in itself - especially when it could be a
> range which skips elements or something similar? So, _that_ part makes sense.
>
> But have you actually tried to get a range of the appropriate type to remove
> from a container? It seems like almost ever function in std.range and
> std.algorithm returns a new range type, making them completely useless for
> processing a range to be removed from a container.
>
> I was looking to remove a single element for a RedBlackTree. The best function
> that I could think to get the proper range was findSplit. The middle portion of
> the return value would be the portion that I was looking for. I'd take that and
> pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its
> implementation and the first two portions of the result of findSplit aren't the
> right range type.

It's good that positioned remove does not work with findSplit. It would be inefficient to use the wonderfully structured tree for linear search. The algorithm should be simple - find the key to delete in O(log n), then remove it.

> So, what do I do? The _only_ thing that I can think of at the moment is to use
> find to locate the beginning of the range that I want, take the length of the
> range with walkLength and then use popBackN to pop of the elements I don't want.
[snip]

What we need to do is add to RedBlackTree a function that accepts the result of take() - the same way SList does. It has only recently dawned on me that certain generic ranges are pivotal to algorithms and containers, and so far I identified take() and takeExactly() to be two such important ranges.

One other thing that we can add to RedBlackTree is a simple removeKey() convenience method that does the find-and-remove thing. Any Phobos developer (including JMD of course :o)), please feel free to implement all of the above and submit it for pulling. And thanks Jonathan for using std.container - I am sure there are a few other wrinkles that will be revealed and then ironed out following consistent usage.


Andrei
February 22, 2011
On 2/21/11 11:27 PM, Jonathan M Davis wrote:
> The typical way to remove an element in the STL is to use find to find an element
> and then erase to remove it. remove in std.container is doing the same thing.
> The problem is that you can't give the result of find to remove, because instead
> of a single iterator, find gives you a whole range, and you probably don't want
> to remove that whole range. You generally either want to remove the first element
> or some set of elements at the front of the range.

This is the exact insight that made me recognize take() is an essential range type.

Andrei
February 22, 2011
On 2/22/11 3:04 AM, Lutger Blijdestijn wrote:
> The table in the docs mention stableRemoveAny(v) which says "Same as
> c.removeAny(v), but is guaranteed to not invalidate any iterators."
>
> Though c.removeAny(v) itself is not listed in the table nor implemented in
> RedBlackTree, isn't this the right function for the job? (I take it v stands
> for a value of the container, rather than a range). builtin aa's also
> implement this function.

I don't think removeAny() works for his case because he wants to remove a specific element from a container. removeAny() is useful when you want to express "pick one element from that container in the easiest way possible".

Andrei
February 22, 2011
Jonathan M Davis wrote:

> On Tuesday 22 February 2011 03:52:34 Lutger Blijdestijn wrote:
>> Jonathan M Davis wrote:
>> > On Tuesday 22 February 2011 01:04:48 Lutger Blijdestijn wrote:
>> >> Jonathan M Davis wrote:
>> >> > Okay, removing elements from a container sucks right now. You can do stuff like removeAny (generally pretty useless IMHO) or removeFront just fine, but removing an arbitrary range from a container just plain sucks.
>> >> > 
>> >> > remove takes a range type which is the range type for the container that it's on. That makes sense. It's obviously not going to be able to a take an arbitrary range and remove that from itself. How would it know which elements in that range corresponded to which elements in itself - especially when it could be a range which skips elements or something similar? So, _that_ part makes sense.
>> >> 
>> >> I don't understand why not. Given, as you said, that a lot of functions return a new type it would make sense. My answer as to how would be something like:
>> >> 
>> >> while range not empty
>> >> 
>> >>     container.remove range.front
>> >>     range.popFront
>> >> 
>> >> Is there a problem with that?
>> > 
>> > It's horribly inefficient for one. Also, I'm not sure that the range is valid any more after the remove, since it's front was removed. popFront may not work correctly.
>> 
>> It's a matter of range (in)validation right? I admit at this point I'm not sure how it is supposed to work. remove + popFront is a bad idea if the range is a view into the contained from where stuff is removed, but in c++ this works for map: some_map.erase(i++) where i is an iterator. So it depends on the type of container what is supported I think. But perhaps this falls under Walter's category of useless wankery, since it is easy to write yourself.
> 
> It depends on the implementation. It's possible that iterating an iterator or popping the front of a range where that iterator no longer points to a valid element actually works. It's also possible that, because it no longer points to a valid element, it _doesn't_ work. It wouldn't surprise me at all that some_map.erase(i++) isn't guaranteed to work in C++ at all but that you've just gotten lucky with the STL implementation that you've been using. I'd have to research it to be sure. But my guess would be that you've been lucky.

I'm fairly certain this is guaranteed, but specifically tied to std::map.
The iterator returns the old value and is incremented before erasure, which
doesn't invalidate iterators except the one that got removed.

>> >> > But have you actually tried to get a range of the appropriate type to remove from a container? It seems like almost ever function in std.range and std.algorithm returns a new range type, making them completely useless for processing a range to be removed from a container.
>> >> > 
>> >> > I was looking to remove a single element for a RedBlackTree. The best function that I could think to get the proper range was findSplit. The middle portion of the return value would be the portion that I was looking for. I'd take that and pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its implementation and the first two portions of the result of findSplit aren't the right range type.
>> >> > 
>> >> > So, what do I do? The _only_ thing that I can think of at the moment is to use find to locate the beginning of the range that I want, take the length of the range with walkLength and then use popBackN to pop of the elements I don't want. e.g.
>> >> 
>> >> The table in the docs mention stableRemoveAny(v) which says "Same as
>> >> c.removeAny(v), but is guaranteed to not invalidate any iterators."
>> >> 
>> >> Though c.removeAny(v) itself is not listed in the table nor implemented in RedBlackTree, isn't this the right function for the job? (I take it v stands for a value of the container, rather than a range). builtin aa's also implement this function.
>> > 
>> > removeAny doesn't solve the problem. For starters, the table appears to be wrong in that stableRemoveAny takes a value and removeAny doesn't. So, either stableRemoveAny is supposed to not take a value or there's supposed to be another version of removeAny which takes a value, and it's missing. So, I don't know what exactly is intended with stableRemoveAny.
>> > 
>> > Regardless of that, however, remove still needs to work. It's perfectly valid to want to remove a range of elements rather than just one.
>> > 
>> > - Jonathan M Davis
>> 
>> (stable)removeAny does not solve the problem of removing ranges, but there must be *something* that solves the problem of removing one element. You found one way to do it for RedBlackTree (i gave up), but if I am not mistaken it doesn't have the right complexity and efficient removal is a key property of this container.
> 
> If you could just do take or takeExactly on the range you got from find and pass it to remove, then it wouldn't matter whether you wanted to remove 1 element or many. The only difference is the number passed to take. The problem is that that doesn't work, because take returns the wrong range type. It has no way to get n elements from the front of the range and have them be in a range which is the same type as the orignal. So, I don't think that one element vs many is really the problem here. It's the inability to get the first n elements of a forward range and still retain the same range type.

There are two problems I think. What if I want to remove an array of strings from a bst of strings? That should be supported, even if only by removing them one by one yourself. If all we got was remove(Range) then this scenario isn't supported at all. Another problem is the efficiency. With a bst, using find + take + remove has to locate the element twice (once for find, once for remove), whereas remove(value) would only need to do this once.

> Regardless, you're right. the solution I found most definitely does _not_ have the right complexity. Because you have to use walkLength on the range that you get from find, it becomes O(n) instead of O(log n), and O(log n) is what it's _supposed_ to be for a red black tree. As I said, I don't believe that the current situation with remove is tenable, and the only solution that I see at the moment is to change forward range so that it has a function which returns the first n elements of that range with the same range type as that range.
> 
> - Jonathan M Davis
February 22, 2011
On Tue, 22 Feb 2011 03:43:25 -0500, Denis Koroskin <2korden@gmail.com> wrote:

>
> I believe remove operation is better expressed in terms of a Cursor (iterator in C++). It also doesn't make much sense (at least of me) for a find to return a subrange of the source range. While it does generalize well, it's not really what most people expect.
>
> What I'd love to have instead is something like this:
>
> int[] arr = [1, 2, 3, 4, 5];
> auto cursor = arr.find(3); // either points to 3 or null
>
> int[] before = arr[0..cursor];
> int value = arr[cursor]; // for consistency, C++ uses *cursor
> int[] after = arr[arr.advance(cursor)..$];

This is exactly how dcollections works.

> Note that typeof(cursor) could/should actually be int*, but I don't use arr[cursor+1..$] because cursor doesn't know about range bounds, and as such it can't advance on it's own.

Here is how dcollections solves this.

A cursor in dcollections is actually a zero or one element range.  It can only reference exactly one element.  Since it has no references to other elements, it is immune to operations that use the surrounding elements.  In contrast, a red black tree range contains references to the begin and the end node.  This means operations on these nodes (i.e. removing the end node, or inserting an element between the two) can affect the range.

So essentially, your code would look like this in dcollections:

auto arr = new ArrayList!int(1,2,3,4,5);

auto cursor = arr.find(3);

int[] before = arr[0..cursor];
int value = arr[cursor];
cursor.popFront();
int[] after = arr[cursor..arr.end]; // note, $ overrides don't currently work, so we can't get the sugar here...

> I think it's also worth noting that "find" for trees makes even less sense (to me). You can't "slice" a tree arbitrarily, yet find tries to do that:
>
>>     auto rbt = RedBlackTree!int([0, 2, 5, 12, 59, 22]);
>>     assert(to!string(rbt[]) == "[0, 2, 5, 12, 22, 59]");
>>
>>     auto found = find(rbt[], 5);
>>     assert(to!string(found) == "[5, 12, 22, 59]");
>
> What is the type of "found"? It's clearly not a tree anymore. Don't know why, but it's still a range. How is it implemented? Without looking at the source I can tell it's implemented as an original tree + a cursor. And I believe the two are better be separated.

Actually, find is probably not the correct thing to use for this.  Use upperBound (or lowerBound, I can never remember which one does what) to get all the elements >= 5.

The range is not a tree, and this is intentional.  Containers are *not* ranges.  Think of a range as a 'view' into a container, but not a container itself.  You can't add/remove elements from the tree using a range, you can just traverse it.  If it helps, think of the range as a pair of iterators.

-Steve
February 22, 2011
Andrei Alexandrescu wrote:

> On 2/22/11 3:04 AM, Lutger Blijdestijn wrote:
>> The table in the docs mention stableRemoveAny(v) which says "Same as
>> c.removeAny(v), but is guaranteed to not invalidate any iterators."
>>
>> Though c.removeAny(v) itself is not listed in the table nor implemented in RedBlackTree, isn't this the right function for the job? (I take it v stands for a value of the container, rather than a range). builtin aa's also implement this function.
> 
> I don't think removeAny() works for his case because he wants to remove a specific element from a container. removeAny() is useful when you want to express "pick one element from that container in the easiest way possible".
> 
> Andrei

I was looking for remove(ElementType value) but this function is not listed anywhere in std.container. The entry for stableRemoveAny(v) references removeAny(v) - note the parameter - so I thought that indicated the intent of a function removeAny(ElementType) but it doesn't exist in the table either.

I assumed (mistakenly) that removeAny(v) would remove any value x where x == v from the container, where the choice of which would be left up to the container.


February 22, 2011
On Mon, 21 Feb 2011 21:55:20 -0500, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> Okay, removing elements from a container sucks right now. You can do stuff like
> removeAny (generally pretty useless IMHO) or removeFront just fine, but removing
> an arbitrary range from a container just plain sucks.
>
> remove takes a range type which is the range type for the container that it's
> on. That makes sense. It's obviously not going to be able to a take an arbitrary
> range and remove that from itself. How would it know which elements in that
> range corresponded to which elements in itself - especially when it could be a
> range which skips elements or something similar? So, _that_ part makes sense.
>
> But have you actually tried to get a range of the appropriate type to remove
> from a container? It seems like almost ever function in std.range and
> std.algorithm returns a new range type, making them completely useless for
> processing a range to be removed from a container.
>
> I was looking to remove a single element for a RedBlackTree. The best function
> that I could think to get the proper range was findSplit. The middle portion of
> the return value would be the portion that I was looking for. I'd take that and
> pass it to RedBlackTree's remove. Wrong. It uses takeExactly in its
> implementation and the first two portions of the result of findSplit aren't the
> right range type.

RedBlackTree supports the equalRange function, which gives you a range of elements equal to the value you give.

If your RedBlackTree does not support duplicate elements, then this should always be a 1 or zero element range, which you can then remove.

> That's disgusting. All that just to remove one element? And what if the range
> isn't bidirectional (e.g. SList)? Well, then you have no popBack, and as far as
> I can tell, you're screwed, since you can't use either take or takeExactly,
> because both of them return new range types.

take is supported as a range type for SList.  It is not for RedBlackTree because there are usually better ways to get ranges from the container.  However, I can see reasons that take could be used.  For example, in a RedBlackTree implementing a map, you may want to search for the first element that has a particular value (O(n) complexity find).  In that case, you should be able to remove just one element.  This should be added as a bugzilla report, and I'll add the ability to do take ranges to the red black tree.  RBT is overdue for some improvements anyways (bearophile found quite a few shortfalls with it).

But I see your point in that you should really be able to construct *any* type of range and have it work.

I wonder if there is a way we could generalize "give me the implementation-specific representation of the first item *reference*"

For example, in dcollections, we have the notion of cursors.  The remove functions take either a cursor or a range.  However, I could possibly extend the range removal function to allow any range type as long as the begin() member gives you the correct cursor type.

-Steve