Thread overview
[Issue 13877] Problem with map+join
Dec 19, 2014
Ketmar Dark
Dec 20, 2014
safety0ff.bugz
Jan 02, 2015
Peter Alexander
Jan 02, 2015
Peter Alexander
December 19, 2014
https://issues.dlang.org/show_bug.cgi?id=13877

--- Comment #1 from bearophile_hugs@eml.cc ---
The work-around was found by "anon":

http://forum.dlang.org/thread/ibfthfpvccfamdezbiin@forum.dlang.org#post-rincbqgpzcezcuiksvos:40forum.dlang.org

--
December 19, 2014
https://issues.dlang.org/show_bug.cgi?id=13877

Ketmar Dark <ketmar@ketmar.no-ip.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ketmar@ketmar.no-ip.org

--- Comment #2 from Ketmar Dark <ketmar@ketmar.no-ip.org> ---
btw, with such modification both variants becomes equivalent:

  auto stripSave(RT) (auto ref RT rng) {
    static struct R {
      RT r;
      @disable void save ();
      alias r this;
    }
    return R(rng);
  }


  const(int)[] foo2 (in int[] data, in int i, in int max) {
    count2++;
    if (i < max) {
      return data.map!(n => foo2(data, i + 1, max)).stripSave.join;
    } else {
      return data;
    }
  }


the code turns MapResult to non-forward range, and then join doesn't calculate everything twice.

the question is: should MapResult has '.save' here in the first place?

or maybe we need some API — as `stripSave` to explicitly remove features from ranges? it's very obvious to the programmer that recalculating range values is costly in this case, so he can just remove `.save` to stop Phobos using it.

--
December 20, 2014
https://issues.dlang.org/show_bug.cgi?id=13877

safety0ff.bugz <safety0ff.bugz@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |safety0ff.bugz@gmail.com
           Hardware|x86                         |All
                 OS|Windows                     |All

--- Comment #3 from safety0ff.bugz <safety0ff.bugz@gmail.com> ---
IMO the problem is that join assumes that traversal of the range of ranges is cheap.

All the optimization is doing is avoiding reallocation costs while building the array of results (by computing length then allocating appropriately sized array for the results.)

As shown by bearophile, this shouldn't be assumed in the general case.

I ran into an issue where std.parallelism assumed that opIndex was as cheap as popFront, but for my specific range it was easier to compute the next front knowing the previous one than it was to find the element given by the index.

If these type of optimizations are important, perhaps there should be some "performance hint tags" for range operations. e.g. popFront on an array range is "cheap" whereas popFront on a MapResult may vary arbitrarily.

--
January 02, 2015
https://issues.dlang.org/show_bug.cgi?id=13877

Peter Alexander <peter.alexander.au@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |peter.alexander.au@gmail.co
                   |                            |m
           Assignee|nobody@puremagic.com        |peter.alexander.au@gmail.co
                   |                            |m

--
January 02, 2015
https://issues.dlang.org/show_bug.cgi?id=13877

--- Comment #4 from Peter Alexander <peter.alexander.au@gmail.com> ---
https://github.com/D-Programming-Language/phobos/pull/2837

For now, I'm limiting the optimization to arrays only. Any other range is conservatively assumed to have arbitrarily expensive iteration, so the normal appender path is used for those.

--
January 08, 2015
https://issues.dlang.org/show_bug.cgi?id=13877

--- Comment #5 from github-bugzilla@puremagic.com ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/f81de7d9b9a9d4d8af4615306b2b759f845bd792 Fix Issue 13877 - join assumes forward range can be cheaply iterated twice

e.g. with `r.map!(someExpensiveFunction).join`, `join` would previously iterate twice: once to compute length, and again to build up the result. The extra iteration to compute length may be disproportionately expensive, so we now only precompute length for built-in arrays, since we know those are cheap to iterate.

The heuristic could be improved over time, although I can't think of anything better right now.

https://issues.dlang.org/show_bug.cgi?id=13877

https://github.com/D-Programming-Language/phobos/commit/ba100ff803f8d266f97e9c02503e19f961be1e7e Merge pull request #2837 from Poita/Issue13877

Fix Issue 13877 - join assumes forward range can be cheaply iterated twice

--
February 19, 2015
https://issues.dlang.org/show_bug.cgi?id=13877

--- Comment #6 from github-bugzilla@puremagic.com ---
Commits pushed to 2.067 at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/f81de7d9b9a9d4d8af4615306b2b759f845bd792 Fix Issue 13877 - join assumes forward range can be cheaply iterated twice

https://github.com/D-Programming-Language/phobos/commit/ba100ff803f8d266f97e9c02503e19f961be1e7e Merge pull request #2837 from Poita/Issue13877

--