January 16, 2019
De-duplicating a range that's not necessarily sorted seems to be a pretty common task, so here's a generic function for whoever else might want to do this:

	import std.range.primitives;

	auto deduplicate(R)(R range)
		if (isInputRange!R)
	{
		import std.algorithm : filter;
		alias E = ElementType!R;
		bool[E] seen;
		return range.filter!((e) {
			if (e in seen) return false;
			seen[e] = true;
			return true;
		});
	}


T

-- 
Why have vacation when you can work?? -- EC
January 16, 2019
On Wednesday, 16 January 2019 at 18:20:57 UTC, H. S. Teoh wrote:
>
> T

I'm aware of how to do it manually as I already stated I went with a similar approach.

There should just be something standard for it and uniq should have an overload or something that allows for another behavior that doesn't rely on sorting.

Even if it's "slower".
January 16, 2019
On Wed, Jan 16, 2019 at 06:25:48PM +0000, bauss via Digitalmars-d-learn wrote: [...]
> I'm aware of how to do it manually as I already stated I went with a similar approach.
> 
> There should just be something standard for it and uniq should have an overload or something that allows for another behavior that doesn't rely on sorting.

Feel free to submit a PR to Phobos for this. The templatized function I posted is probably already pretty close to Phobos standards, you just need to add a few unittests and a nice ddoc header.

I wouldn't call it `uniq`, though.  It doesn't quite jive with the "examine nearby elements only" philosophy of .uniq (which inherits from the Unix `uniq` that does the same thing), and it's not @nogc unlike .uniq, etc., and IME, trying to shoehorn something like this into an existing symbol with already preconceived semantics will make the PR review needlessly controversial.  I'd just submit it under a different name.


> Even if it's "slower".

Actually, filtering with an AA is not slower, since the AA amortizes the
overall cost to O(n), which is even better than the O(n log n) of the
pre-sorting approach.

Of course, this comes at the cost of not being @nogc, which might be a turnoff for some folks.  It's probably possible to write a @nogc version of this function, but it will be more complicated (much more complicated if you're working with a range of unknown length). This is one of the places where IMO the hassle of manually managing the memory of a lookup table far outweighs the "cost" (perceived or real) of just embracing the GC.


T

-- 
Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis
1 2
Next ›   Last »