Why is std.algorithm so complicated to use? (page 13)

Another example of how std.algorithm is so hard to use (it's almost tempting me to start swearing...): How do you remove an item from an array in place? It seems so darn simple, and yet it's not in std.algorithm (or std.range). It makes something so easy so tedious.

On Sunday, July 15, 2012 23:42:29 Mehrdad wrote: > Another example of how std.algorithm is so hard to use (it's almost tempting me to start swearing...): > > How do you remove an item from an array in place? > > It seems so darn simple, and yet it's not in std.algorithm (or std.range). It makes something so easy so tedious. Iterators have exactly the same problem for exactly the same reasons as std.algorithm.remove (e.g. take a look at C++'s erase function). The only way to do this in place would be to create a separate removeInPlace function specifically for arrays. But since all it takes is reassigning the result of std.algorithm.remove to the array that you passed in and an std.array.replaceInPlace would be doing exactly that, I don't think that adding such a function buys us much: auto arr = [10, 22, 19, 4, 6]; arr = remove(arr, 3); assert(arr == [10, 22, 19, 6]); The main problem is understanding why remove (or erase in C++) works this way, which seems to throw off a bunch of people in both D and C++, but it's something that we're pretty much stuck with. You need the actual container (not an iterator or range) if you want to actually remove the element. - Jonathan M Davis

On Sunday, 15 July 2012 at 22:03:33 UTC, Jonathan M Davis wrote: > auto arr = [10, 22, 19, 4, 6]; > arr = remove(arr, 3); > assert(arr == [10, 22, 19, 6]); Yeah, the problem is that this reallocates... (not claiming remove() itself is supposed to be efficient, but this is making it even worse) > The main problem is understanding why remove (or erase in C++) works this way, which seems to throw off a bunch of people in both D and C++, but it's something that we're pretty much stuck with. You need the actual container (not an iterator or range) if you want to actually remove the element. Well, for arrays, the "actual container" is the array, so that doesn't work either. :\ On the other hand, even C++ has vector<T>::erase()!

On 7/15/12 5:42 PM, Mehrdad wrote: > Another example of how std.algorithm is so hard to use (it's almost > tempting me to start swearing...): > > How do you remove an item from an array in place? > > It seems so darn simple, and yet it's not in std.algorithm (or > std.range). It makes something so easy so tedious. Look for "remove" here: http://dlang.org/phobos/std_algorithm.html Unfortunately the anchor called "remove" is swallowed by an unrelated enum value (we should fix that in ddoc). From the example: int[] a = [ 3, 5, 7, 8 ]; assert(remove(a, 1) == [ 3, 7, 8 ]); assert(a == [ 3, 7, 8, 8 ]); Therefore, to remove an element in place: int[] a = [ 3, 5, 7, 8 ]; a = remove(a, 1); You also have a less stable but faster version: a = remove!(SwapStrategy.unstable)(a, 1); Andrei

On 7/15/12 6:22 PM, Mehrdad wrote: > On Sunday, 15 July 2012 at 22:03:33 UTC, Jonathan M Davis wrote: >> auto arr = [10, 22, 19, 4, 6]; >> arr = remove(arr, 3); >> assert(arr == [10, 22, 19, 6]); > > Yeah, the problem is that this reallocates... doesn't

On Sunday, July 15, 2012 18:24:47 Andrei Alexandrescu wrote: > On 7/15/12 6:22 PM, Mehrdad wrote: > > On Sunday, 15 July 2012 at 22:03:33 UTC, Jonathan M Davis wrote: > >> auto arr = [10, 22, 19, 4, 6]; > >> arr = remove(arr, 3); > >> assert(arr == [10, 22, 19, 6]); > > > > Yeah, the problem is that this reallocates... > > doesn't Yeah. It just slices. remove moves the elements over by one, and returns a slice of the range which is shorter by the number of elements removed. And since assigning one array is just assigning the ptr and length properties, there's no copying of the elements there, and no reallocations are required anywhere. Now, if you want to append to the array afterwards, then you're going to get a reallocation (unless you know that there are no other arrays referring to anything after the returned slice, in which case, you can use assumeSafeAppend make it so that it doesn't reallocate). But the removal involves no reallocations at all. - Jonathan M Davis

On Sunday, 15 July 2012 at 22:39:47 UTC, Jonathan M Davis wrote: > On Sunday, July 15, 2012 18:24:47 Andrei Alexandrescu wrote: >> On 7/15/12 6:22 PM, Mehrdad wrote: >> > On Sunday, 15 July 2012 at 22:03:33 UTC, Jonathan M Davis wrote: >> >> auto arr = [10, 22, 19, 4, 6]; >> >> arr = remove(arr, 3); >> >> assert(arr == [10, 22, 19, 6]); >> > >> > Yeah, the problem is that this reallocates... >> >> doesn't > > Yeah. It just slices. Ooooooh, I misunderstood what was happening. Thanks to both you & Andrei!

On Monday, July 16, 2012 01:47:15 Mehrdad wrote: > On Sunday, 15 July 2012 at 22:39:47 UTC, Jonathan M Davis wrote: > > On Sunday, July 15, 2012 18:24:47 Andrei Alexandrescu wrote: > >> On 7/15/12 6:22 PM, Mehrdad wrote: > >> > On Sunday, 15 July 2012 at 22:03:33 UTC, Jonathan M Davis > >> > > >> > wrote: > >> >> auto arr = [10, 22, 19, 4, 6]; > >> >> arr = remove(arr, 3); > >> >> assert(arr == [10, 22, 19, 6]); > >> > > >> > Yeah, the problem is that this reallocates... > >> > >> doesn't > > > > Yeah. It just slices. > > Ooooooh, I misunderstood what was happening. Thanks to both you & Andrei! std.algorithm.remove does pretty much exactly what the STL's erase function does, only it operates on ranges rather than iterators. - Jonathan M Davis

On 2012-07-16 00:03, Jonathan M Davis wrote: > Iterators have exactly the same problem for exactly the same reasons as > std.algorithm.remove (e.g. take a look at C++'s erase function). The only way > to do this in place would be to create a separate removeInPlace function > specifically for arrays. But since all it takes is reassigning the result of > std.algorithm.remove to the array that you passed in and an > std.array.replaceInPlace would be doing exactly that, I don't think that > adding such a function buys us much: > > auto arr = [10, 22, 19, 4, 6]; > arr = remove(arr, 3); > assert(arr == [10, 22, 19, 6]); > > The main problem is understanding why remove (or erase in C++) works this way, > which seems to throw off a bunch of people in both D and C++, but it's > something that we're pretty much stuck with. You need the actual container > (not an iterator or range) if you want to actually remove the element. It would be really useful if std.algorithm (and similar functions) would indicate clearly if they operate in place or not. It took me a while to understand that "sort" operates in place. In Ruby this is achieved by a naming convention: a = [3, 4, 5] a = a.sort # returns a new array a.sort! # sorts in place -- /Jacob Carlborg

July 16, 2012

Re: Why is std.algorithm so complicated to use?

Posted by Jonathan M Davis
in reply to Jacob Carlborg

Permalink

Jonathan M Davis

Posted in reply to Jacob Carlborg

Permalink

On Monday, July 16, 2012 09:07:06 Jacob Carlborg wrote:
> On 2012-07-16 00:03, Jonathan M Davis wrote:
> > Iterators have exactly the same problem for exactly the same reasons as std.algorithm.remove (e.g. take a look at C++'s erase function). The only way to do this in place would be to create a separate removeInPlace function specifically for arrays. But since all it takes is reassigning the result of std.algorithm.remove to the array that you passed in and an std.array.replaceInPlace would be doing exactly that, I don't think that
> > 
> > adding such a function buys us much:
> >      auto arr = [10, 22, 19, 4, 6];
> >      arr = remove(arr, 3);
> >      assert(arr == [10, 22, 19, 6]);
> > 
> > The main problem is understanding why remove (or erase in C++) works this way, which seems to throw off a bunch of people in both D and C++, but it's something that we're pretty much stuck with. You need the actual container (not an iterator or range) if you want to actually remove the element.
> It would be really useful if std.algorithm (and similar functions) would indicate clearly if they operate in place or not. It took me a while to understand that "sort" operates in place. In Ruby this is achieved by a naming convention:
> 
> a = [3, 4, 5]
> a = a.sort # returns a new array
> a.sort! # sorts in place

We do have InPlace in function names when one version of a function operates in place and another doesn't, but we haven't generally done that on functions which only have one version, and I don't expect that to change now (at least not for existing functions).

In any case, _most_ range-based functions _don't_ operate in place, but it's certainly true that the ones that do should make that clear in their documentation.

remove is a bit of a special case, however, in that it does the changes in place but doesn't take a ref, so while the return value is shortened to not include the removed elements, the original is just as long as it was before (which is useful in cases where you want to then set the elements at the end), and you end up with something that's sort of in place and sort of not. Regardless, remove is confusing enough to most people (just like erase in C++) that it's documentation needs to be very clear, but I don't know how good it is or isn't. Given the difficulty of explaining it properly and Merhdad's confusion, it probably stands to be improved.

- Jonathan M Davis

Forums