Jump to page: 1 2
Thread overview
Removing multiple elements from array
Dec 21, 2013
aldanor
Dec 21, 2013
Ali Çehreli
Dec 21, 2013
aldanor
Dec 21, 2013
Ali Çehreli
Dec 21, 2013
H. S. Teoh
Dec 21, 2013
H. S. Teoh
Dec 21, 2013
aldanor
Dec 21, 2013
bearophile
Dec 21, 2013
aldanor
Dec 21, 2013
MrSmith
Dec 22, 2013
aldanor
Dec 22, 2013
Ali Çehreli
December 21, 2013
Is there an efficient method to remove elements with multiple (compile-time-unknown) indices from an array? I currently do something like

		if (!index.empty)
			foreach (i; index.sort.reverse)
				a = a.remove(i);

... which looks a bit awkward.
December 21, 2013
On 12/20/2013 04:47 PM, aldanor wrote:

> Is there an efficient method to remove elements with multiple
> (compile-time-unknown) indices from an array? I currently do something like
>
>          if (!index.empty)
>              foreach (i; index.sort.reverse)
>                  a = a.remove(i);


That's probably a bug, right? The indexes would probably become off as the array gets shorter.

> ... which looks a bit awkward.

I am sure others can come up with solutions but I can't think of a way of doing this Phobos right now. :)

Here is a range that does what you want (not very well tested ;) ):

import std.stdio;
import std.array;
import std.range;
import std.algorithm;

struct SkipIndexes(R, I)
{
    alias SortedIndexRange = typeof((I.init).sort.uniq);

    R range;
    SortedIndexRange indexes;
    size_t currentIndex;

    this(R range, I indexes, size_t currentIndex = 0)
    {
        this.range = range;
        this.indexes = indexes.sort.uniq;
        this.currentIndex = currentIndex;

        prime();
    }

    bool empty() const @property
    {
        return range.empty;
    }

    ElementType!R front() const @property
    {
        return range.front;
    }

    void prime()
    {
        while (!empty && !indexes.empty && (indexes.front == currentIndex)) {
            range.popFront();
            indexes.popFront();
            ++currentIndex;
        }
    }

    void popFront()
    {
        range.popFront();
        ++currentIndex;
        prime();
    }

    auto save() @property
    {
        return this;
    }

    void report() const
    {
        writeln(indexes);
        writeln(currentIndex);
    }
}

auto skipIndexes(R, I)(R range, I skipIndexes, size_t currentIndex = 0)
{
    return SkipIndexes!(R, I)(range, skipIndexes, currentIndex);
}

void main()
{
    auto arr = [ 0, 1, 2, 3, 4, 5 ];
    size_t[] badIndexes = [ 0, 1, 5 ];

    auto skipped = arr.skipIndexes(badIndexes);

    // skipped is a lazy range:
    assert(skipped.equal([ 2, 3, 4 ]));

    // arr is untouched at this point:
    assert(arr == [ 0, 1, 2, 3, 4, 5 ]);

    // To affect it, assign from .array of the lazy range:
    arr = skipped.array;
    assert(arr == [ 2, 3, 4 ]);
}

Ali

December 21, 2013
On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:

> That's probably a bug, right? The indexes would probably become off as the array gets shorter.

Note the .reverse there? Which makes sure indices don't get off. Still, that's probably an inefficient way of doing this...

December 21, 2013
aldanor:

> Is there an efficient method to remove elements with multiple (compile-time-unknown) indices from an array? I currently do something like
>
> 		if (!index.empty)
> 			foreach (i; index.sort.reverse)
> 				a = a.remove(i);
>
> ... which looks a bit awkward.

Do not forget to add the () after the sort, otherwise you call the deprecated, buggy and slow built-in sort.

reverse is another deprecated built-in, so use "retro".

The first "if" is not much useful, trying to sort an empty array will not wast much time.

foreach (i; index.sort().retro)
    a = a.remove(i);

In alternative you can also invert the sorting cmp (untested):

foreach (i; index.sort!q{a > b})
    a = a.remove(i);

If you have to remove just few items in-place, that's the way to do it. In future I hope we'll have a better remove function in Phobos (but it's not hard to write).

Another solution is to filter the array a. If you have the indexes, you can zip a with the indexes, and then filter, map to extract just the items and use copy() to work in place, but it's not nice-looking code.

Bye,
bearophile
December 21, 2013
On 12/20/2013 06:17 PM, aldanor wrote:
> On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:
>
>> That's probably a bug, right? The indexes would probably become off as
>> the array gets shorter.
>
> Note the .reverse there?

Yes, I missed that part. :)

> Which makes sure indices don't get off. Still,
> that's probably an inefficient way of doing this...

Yeah, every removal would move the right-hand elements one position to the left. But you would be removing only M times (the number of indexes). So the complexity is I think O(M logM) + O(NM) for sorting the indexing plus removing M elements by shifting at the order of N elements.

The range solution that I've come up with is I think O(M log M) + O(M) + O(N), for sorting the indexes, removing the duplicates with uniq and iterating the elements. Since O(M log M) dominates O(M), it should be O(M log M) + O(N).

Ali

P.S. Hm. I think I've reading an algorithms book lately. :p

December 21, 2013
On Fri, Dec 20, 2013 at 07:01:17PM -0800, Ali Çehreli wrote:
> On 12/20/2013 06:17 PM, aldanor wrote:
> >On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:
> >
> >>That's probably a bug, right? The indexes would probably become off as the array gets shorter.
> >
> >Note the .reverse there?
> 
> Yes, I missed that part. :)
> 
> >Which makes sure indices don't get off. Still,
> >that's probably an inefficient way of doing this...
> 
> Yeah, every removal would move the right-hand elements one position
> to the left. But you would be removing only M times (the number of
> indexes). So the complexity is I think O(M logM) + O(NM) for sorting
> the indexing plus removing M elements by shifting at the order of N
> elements.
> 
> The range solution that I've come up with is I think O(M log M) +
> O(M) + O(N), for sorting the indexes, removing the duplicates with
> uniq and iterating the elements. Since O(M log M) dominates O(M), it
> should be O(M log M) + O(N).
[...]

If you know the which elements you're going to remove, you can remove
them all at once in O(N) time:

	int[] arr = [ ... ];
	for (i=0, j=0; i < arr.length; i++)
	{
		if (deleteThisElement(arr[i]))
			continue;
		if (j < i)
			arr[j] = arr[i];
		j++;
	}
	arr.length = j;

Of course, if you know which *indices* you're going to remove, then you can do even better (by starting your loop at the first index to be deleted).


T

-- 
BREAKFAST.COM halted...Cereal Port Not Responding. -- YHL
December 21, 2013
On Fri, Dec 20, 2013 at 07:13:59PM -0800, H. S. Teoh wrote:
> On Fri, Dec 20, 2013 at 07:01:17PM -0800, Ali Çehreli wrote:
> > On 12/20/2013 06:17 PM, aldanor wrote:
> > >On Saturday, 21 December 2013 at 01:42:04 UTC, Ali Çehreli wrote:
> > >
> > >>That's probably a bug, right? The indexes would probably become off as the array gets shorter.
> > >
> > >Note the .reverse there?
> > 
> > Yes, I missed that part. :)
> > 
> > >Which makes sure indices don't get off. Still,
> > >that's probably an inefficient way of doing this...
> > 
> > Yeah, every removal would move the right-hand elements one position
> > to the left. But you would be removing only M times (the number of
> > indexes). So the complexity is I think O(M logM) + O(NM) for sorting
> > the indexing plus removing M elements by shifting at the order of N
> > elements.
> > 
> > The range solution that I've come up with is I think O(M log M) +
> > O(M) + O(N), for sorting the indexes, removing the duplicates with
> > uniq and iterating the elements. Since O(M log M) dominates O(M), it
> > should be O(M log M) + O(N).
> [...]
> 
> If you know the which elements you're going to remove, you can remove
> them all at once in O(N) time:
> 
> 	int[] arr = [ ... ];
> 	for (i=0, j=0; i < arr.length; i++)
> 	{
> 		if (deleteThisElement(arr[i]))
> 			continue;
> 		if (j < i)
> 			arr[j] = arr[i];
> 		j++;
> 	}
> 	arr.length = j;

There's also a Phobos solution: the above code is equivalent to:

	import std.algorithm : remove;
 	int[] arr = [ ... ];
	arr = arr.remove!deleteThisElement();


> Of course, if you know which *indices* you're going to remove, then you can do even better (by starting your loop at the first index to be deleted).
[...]

Unfortunately, while std.algorithm.remove does provide a way to do this if the number of indices to remove are known beforehand, it doesn't seem to be able to remove a dynamic list of indices. Probably an enhancement request should be filed:

	http://d.puremagic.com/issues

If the number of indices to remove are fixed, though, you can do this:

	import std.algorithm : remove;
 	int[] arr = [ ... ];
	int i1, i2, i3;  // indices to remove
	arr = arr.remove(i1, i2, i3);


T

-- 
It's bad luck to be superstitious. -- YHL
December 21, 2013
On Saturday, 21 December 2013 at 03:22:22 UTC, H. S. Teoh wrote:
> Unfortunately, while std.algorithm.remove does provide a way to do this
> if the number of indices to remove are known beforehand, it doesn't seem
> to be able to remove a dynamic list of indices. Probably an enhancement
> request should be filed:
>
> 	http://d.puremagic.com/issues
>
> If the number of indices to remove are fixed, though, you can do this:
>
> 	import std.algorithm : remove;
>  	int[] arr = [ ... ];
> 	int i1, i2, i3;  // indices to remove
> 	arr = arr.remove(i1, i2, i3);

Yes, this is exactly why I opened this thread initially, I noticed this functionality in .remove() and was native enough to assume there's a shortcut for batch removal with indices unknown in advance :)
December 21, 2013
On Saturday, 21 December 2013 at 02:52:00 UTC, bearophile wrote:
> Do not forget to add the () after the sort, otherwise you call the deprecated, buggy and slow built-in sort.
>
> reverse is another deprecated built-in, so use "retro".
>
> The first "if" is not much useful, trying to sort an empty array will not wast much time.
>
> foreach (i; index.sort().retro)
>     a = a.remove(i);
>
> In alternative you can also invert the sorting cmp (untested):
>
> foreach (i; index.sort!q{a > b})
>     a = a.remove(i);

Thanks, that looks like the cleanest solution so far! So (I guess) it will not cause index.length reallocations by just moving the elements in-place and then returning a slice?

P.S. I wonder where would one learn about the deprecated sort and reverse if not from this forum? I followed the official docs and there's nothing about these properties being deprecated... yet another spot where a compiler warning would be appropriate?
December 21, 2013
On Saturday, 21 December 2013 at 00:47:04 UTC, aldanor wrote:
> Is there an efficient method to remove elements with multiple (compile-time-unknown) indices from an array? I currently do something like
>
> 		if (!index.empty)
> 			foreach (i; index.sort.reverse)
> 				a = a.remove(i);
>
> ... which looks a bit awkward.

just use foreach_reverse
« First   ‹ Prev
1 2