Array!T and find are slow

I've found bench-marking my program that std.algorithm.find is very slow on Array!T, due to the fact it iterates on a range instead of a plain array. I've written some search functions, which are many times faster, is it worth making a pull request? http://dpaste.dzfl.pl/63b54aa27f35#

On Wednesday, 14 May 2014 at 14:24:28 UTC, Damian Day wrote: > I've written some search functions, which are many times faster, is it > worth making a pull request? Generally, we strive to make the algorithms in Phobos as fast as possible even in the general case. I didn't look at the issue at hand in any detail, but if the "-O -release -inline" performance when operating on Arrays is considerably worse than when directly operating on the data, we have a problem, as it likely to also impact other ranges. In short, I don't think the best solution is to add this special case to Array!T, but rather to figure out what exactly is wrong with Array!T and/or find() and fix the problem at its source. Could you post a short benchmark snippet explicitly showing the problem? Best, David

On Wednesday, 14 May 2014 at 14:54:57 UTC, David Nadlinger wrote: > Could you post a short benchmark snippet explicitly showing the problem? Benchmark found here: http://dpaste.dzfl.pl/0058fc8341830

May 14, 2014

Re: Array!T and find are slow

Posted by monarch_dodra
in reply to David Nadlinger

Permalink

monarch_dodra

Posted in reply to David Nadlinger

Permalink

On Wednesday, 14 May 2014 at 14:54:57 UTC, David Nadlinger wrote:
> On Wednesday, 14 May 2014 at 14:24:28 UTC, Damian Day wrote:
>> I've written some search functions, which are many times faster, is it
>> worth making a pull request?
>
> Generally, we strive to make the algorithms in Phobos as fast as possible even in the general case. I didn't look at the issue at hand in any detail, but if the "-O -release -inline" performance when operating on Arrays is considerably worse than when directly operating on the data, we have a problem, as it likely to also impact other ranges.
>
> In short, I don't think the best solution is to add this special case to Array!T, but rather to figure out what exactly is wrong with Array!T and/or find() and fix the problem at its source. Could you post a short benchmark snippet explicitly showing the problem?
>
> Best,
> David

"One" of the issue is the cost of repeatedly indexing Array, when internally it is nothing more than an array. Adding a special case "in" Array.Range allows bypassing the repeated indexing costs, but at the very least, should be implemented "in terms of" find itself, eg:

Range searchForward(E)(E needle)
    if (isImplicitlyConvertible!(E, T))
{
    assert(_outer._data.refCountedStore.isInitialized);

    auto haystack = _outer._data._payload[_a .. _b];
    haystack = haystack.find(needle);
    return Range(this._outer, _b - haystack.length, _b);
}

On Wednesday, 14 May 2014 at 15:42:13 UTC, Damian Day wrote: > On Wednesday, 14 May 2014 at 14:54:57 UTC, David Nadlinger wrote: > >> Could you post a short benchmark snippet explicitly showing the problem? > > Benchmark found here: > http://dpaste.dzfl.pl/0058fc8341830 FYI, that implementation is wrong: Array.Range keeps `_a` and `_b`, which represents the "internal" bounds of the payload proper (and not "low" and "length"). You want something along the lines of: Range searchForward(E)(E needle) if (isImplicitlyConvertible!(E, T)) { assert(_data.refCountedStore.isInitialized); auto haystack = _data._payload[_a .. _b]; immutable len = _b - _a; for (size_t index = 0; index < len; ++index) { if (haystack[index] is needle) return Range(this, _b - len, _b); } return Range(this, _b, _b); } But even then, as I suggested above, while I think having searchForward could improve the situation, implementing it "in terms of" find would be better: This way, you'd still benefit from some of the highly optimized cases in find.

On Wednesday, 14 May 2014 at 14:24:28 UTC, Damian Day wrote: > I've found bench-marking my program that std.algorithm.find is > very slow on Array!T, due to the fact it iterates on a range > instead of a plain array. > > I've written some search functions, which are many times faster, is it > worth making a pull request? > > http://dpaste.dzfl.pl/63b54aa27f35# BTW, this is a more "general" issue: Given a generic algorithm "std.foo", how can I write my own (better optimized) "object.foo", and make sure *that* is called instead? I initially filed the issue for "retro", while indeed mentioning that "find" was also open to the improvement: https://issues.dlang.org/show_bug.cgi?id=12583 This spawned the thread: http://forum.dlang.org/thread/op.xeuot6g2eav7ka@stevens-macbook-pro-2.local Unfortunately, nothing came of it.

On Wednesday, 14 May 2014 at 17:57:30 UTC, monarch_dodra wrote: > > BTW, this is a more "general" issue: Given a generic algorithm "std.foo", how can I write my own (better optimized) "object.foo", and make sure *that* is called instead? > > I initially filed the issue for "retro", while indeed mentioning that "find" was also open to the improvement: > https://issues.dlang.org/show_bug.cgi?id=12583 > > This spawned the thread: > http://forum.dlang.org/thread/op.xeuot6g2eav7ka@stevens-macbook-pro-2.local > > Unfortunately, nothing came of it. It's an interesting problem to solve. Would having a specialized container range, which has search, removal, etc primitives work better?

On Wednesday, 14 May 2014 at 17:36:35 UTC, monarch_dodra wrote: > Adding a special case "in" Array.Range allows bypassing the repeated indexing costs, but at the very least, should be implemented "in terms of" find itself, eg: Shouldn't the extra indirection just vanish into thin air once opIndex or what have you is inlined? I don't see a conceptual reason for overhead in this case. Of course, the compiler probably wouldn't be able to infer any of the memchr/… optimizations find does for the array case, but Damian's alternative version doesn't do any of that either. David

Forums