Thread overview | |||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
May 14, 2014 Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
I've found bench-marking my program that std.algorithm.find is very slow on Array!T, due to the fact it iterates on a range instead of a plain array. I've written some search functions, which are many times faster, is it worth making a pull request? http://dpaste.dzfl.pl/63b54aa27f35# |
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Damian Day | On Wednesday, 14 May 2014 at 14:24:28 UTC, Damian Day wrote:
> I've written some search functions, which are many times faster, is it
> worth making a pull request?
Generally, we strive to make the algorithms in Phobos as fast as possible even in the general case. I didn't look at the issue at hand in any detail, but if the "-O -release -inline" performance when operating on Arrays is considerably worse than when directly operating on the data, we have a problem, as it likely to also impact other ranges.
In short, I don't think the best solution is to add this special case to Array!T, but rather to figure out what exactly is wrong with Array!T and/or find() and fix the problem at its source. Could you post a short benchmark snippet explicitly showing the problem?
Best,
David
|
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Wednesday, 14 May 2014 at 14:54:57 UTC, David Nadlinger wrote: > Could you post a short benchmark snippet explicitly showing the problem? Benchmark found here: http://dpaste.dzfl.pl/0058fc8341830 |
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Damian Day | On Wednesday, 14 May 2014 at 15:42:13 UTC, Damian Day wrote: > On Wednesday, 14 May 2014 at 14:54:57 UTC, David Nadlinger wrote: > >> Could you post a short benchmark snippet explicitly showing the problem? > > Benchmark found here: > http://dpaste.dzfl.pl/0058fc8341830 Unfortunately, I don't have the time to look at the details right now, but https://github.com/D-Programming-Language/phobos/pull/1713 might have improved the situation somewhat. David |
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Wednesday, 14 May 2014 at 14:54:57 UTC, David Nadlinger wrote: > On Wednesday, 14 May 2014 at 14:24:28 UTC, Damian Day wrote: >> I've written some search functions, which are many times faster, is it >> worth making a pull request? > > Generally, we strive to make the algorithms in Phobos as fast as possible even in the general case. I didn't look at the issue at hand in any detail, but if the "-O -release -inline" performance when operating on Arrays is considerably worse than when directly operating on the data, we have a problem, as it likely to also impact other ranges. > > In short, I don't think the best solution is to add this special case to Array!T, but rather to figure out what exactly is wrong with Array!T and/or find() and fix the problem at its source. Could you post a short benchmark snippet explicitly showing the problem? > > Best, > David "One" of the issue is the cost of repeatedly indexing Array, when internally it is nothing more than an array. Adding a special case "in" Array.Range allows bypassing the repeated indexing costs, but at the very least, should be implemented "in terms of" find itself, eg: Range searchForward(E)(E needle) if (isImplicitlyConvertible!(E, T)) { assert(_outer._data.refCountedStore.isInitialized); auto haystack = _outer._data._payload[_a .. _b]; haystack = haystack.find(needle); return Range(this._outer, _b - haystack.length, _b); } |
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Damian Day | On Wednesday, 14 May 2014 at 15:42:13 UTC, Damian Day wrote: > On Wednesday, 14 May 2014 at 14:54:57 UTC, David Nadlinger wrote: > >> Could you post a short benchmark snippet explicitly showing the problem? > > Benchmark found here: > http://dpaste.dzfl.pl/0058fc8341830 FYI, that implementation is wrong: Array.Range keeps `_a` and `_b`, which represents the "internal" bounds of the payload proper (and not "low" and "length"). You want something along the lines of: Range searchForward(E)(E needle) if (isImplicitlyConvertible!(E, T)) { assert(_data.refCountedStore.isInitialized); auto haystack = _data._payload[_a .. _b]; immutable len = _b - _a; for (size_t index = 0; index < len; ++index) { if (haystack[index] is needle) return Range(this, _b - len, _b); } return Range(this, _b, _b); } But even then, as I suggested above, while I think having searchForward could improve the situation, implementing it "in terms of" find would be better: This way, you'd still benefit from some of the highly optimized cases in find. |
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Damian Day | Damian Day:
> Benchmark found here:
> http://dpaste.dzfl.pl/0058fc8341830
Perhaps it's a problem of missed dmd inlining. So I suggest to run the benchmark with ldc2.
Bye,
bearophile
|
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Damian Day | On Wednesday, 14 May 2014 at 14:24:28 UTC, Damian Day wrote: > I've found bench-marking my program that std.algorithm.find is > very slow on Array!T, due to the fact it iterates on a range > instead of a plain array. > > I've written some search functions, which are many times faster, is it > worth making a pull request? > > http://dpaste.dzfl.pl/63b54aa27f35# BTW, this is a more "general" issue: Given a generic algorithm "std.foo", how can I write my own (better optimized) "object.foo", and make sure *that* is called instead? I initially filed the issue for "retro", while indeed mentioning that "find" was also open to the improvement: https://issues.dlang.org/show_bug.cgi?id=12583 This spawned the thread: http://forum.dlang.org/thread/op.xeuot6g2eav7ka@stevens-macbook-pro-2.local Unfortunately, nothing came of it. |
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to monarch_dodra | On Wednesday, 14 May 2014 at 17:57:30 UTC, monarch_dodra wrote:
>
> BTW, this is a more "general" issue: Given a generic algorithm "std.foo", how can I write my own (better optimized) "object.foo", and make sure *that* is called instead?
>
> I initially filed the issue for "retro", while indeed mentioning that "find" was also open to the improvement:
> https://issues.dlang.org/show_bug.cgi?id=12583
>
> This spawned the thread:
> http://forum.dlang.org/thread/op.xeuot6g2eav7ka@stevens-macbook-pro-2.local
>
> Unfortunately, nothing came of it.
It's an interesting problem to solve.
Would having a specialized container range, which has search, removal, etc primitives work better?
|
May 14, 2014 Re: Array!T and find are slow | ||||
---|---|---|---|---|
| ||||
Posted in reply to monarch_dodra | On Wednesday, 14 May 2014 at 17:36:35 UTC, monarch_dodra wrote:
> Adding a special case "in" Array.Range allows bypassing the repeated indexing costs, but at the very least, should be implemented "in terms of" find itself, eg:
Shouldn't the extra indirection just vanish into thin air once opIndex or what have you is inlined? I don't see a conceptual reason for overhead in this case.
Of course, the compiler probably wouldn't be able to infer any of the memchr/… optimizations find does for the array case, but Damian's alternative version doesn't do any of that either.
David
|
Copyright © 1999-2021 by the D Language Foundation