Thread overview
joiner and map, strange behavior
Mar 12, 2013
Stephan Schiffels
Mar 12, 2013
bearophile
Mar 12, 2013
Stephan Schiffels
Mar 12, 2013
lomereiter
Mar 12, 2013
Timon Gehr
Mar 13, 2013
Stephan Schiffels
Mar 13, 2013
H. S. Teoh
Mar 13, 2013
bearophile
March 12, 2013
Hi,

I am struggling with understanding this behavior. In the code below, the function "getVec" is called 8 times, but it should be called only 4 times (once for each call inside of map).

Any explanations?

Stephan


import std.stdio;
import std.algorithm;
import std.array;

int[] getVec(size_t i) {
  writeln("getVec is called");
  auto vals = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
  ];
  return vals[i];
}


void main() {

  auto result = [0, 1, 2, 3].map!(getVec).joiner.array;
  writeln(result);

}
March 12, 2013
Stephan Schiffels:

> I am struggling with understanding this behavior. In the code below, the function "getVec" is called 8 times, but it should be called only 4 times (once for each call inside of map).
>
> Any explanations?

Maybe it's a matter of calling front() more times, as in filter in a recent thread. To be sure take a look inside phobos sources.

By the way, your code is quite inefficient, because it allocates a new matrix at each call to getVec.

Bye,
bearophile
March 12, 2013
On Tuesday, 12 March 2013 at 17:43:43 UTC, bearophile wrote:
> Stephan Schiffels:
>
>> I am struggling with understanding this behavior. In the code below, the function "getVec" is called 8 times, but it should be called only 4 times (once for each call inside of map).
>>
>> Any explanations?
>
> Maybe it's a matter of calling front() more times, as in filter in a recent thread. To be sure take a look inside phobos sources.
>
> By the way, your code is quite inefficient, because it allocates a new matrix at each call to getVec.
>
> Bye,
> bearophile

Thanks, I had a brief look at std.algorithm.joiner but couldn't find anything obvious, maybe I should look deeper into it.

And yes, the example is quite bad. This version of getVec would have been better to illustrate:

import std.range;
auto getVec(size_t i) {
  writeln("getVec is called");
  return iota(i * 4, (i + 1) * 4);
}

Stephan
March 12, 2013
Current implementation of joiner accesses each element two times - first, in a loop which skips empty ranges, plus after the loop assignment to a variable occurs:

https://github.com/D-Programming-Language/phobos/blob/master/std/algorithm.d#L2993

On Tuesday, 12 March 2013 at 17:51:57 UTC, Stephan Schiffels wrote:
> 
>> Thanks, I had a brief look at std.algorithm.joiner but couldn't
> find anything obvious, maybe I should look deeper into it.
>
March 12, 2013
On 03/12/2013 06:51 PM, Stephan Schiffels wrote:
> ...
>
> Thanks, I had a brief look at std.algorithm.joiner but couldn't find
> anything obvious, maybe I should look deeper into it.
> ...

I guess it is because of the following:

Eg (similar code occurs two times):
...
    if (_sep.empty)
    {
        // Advance to the next range in the
        // input
        //_items.popFront();
        for (;; _items.popFront())
        {
            if (_items.empty) return;
            if (!_items.front.empty) break; // front called
        }
        _current = _items.front; // front called again
        _items.popFront();
    }
...

The code should only call front once for the first non-empty range.

Presumed example fix (though the implementation seems rather clumsy and should probably be replaced completely):

...
    if (_sep.empty)
    {
        // Advance to the next range in the
        // input
        //_items.popFront();
        for (;; _items.popFront())
        {
            if (_items.empty) return;
            _current = _items.front;
            if (!_current.empty) break;
        }
        _items.popFront();
    }
...
March 13, 2013
Am 12.03.13 20:22, schrieb Timon Gehr:
> On 03/12/2013 06:51 PM, Stephan Schiffels wrote:
>> ...
>>
>> Thanks, I had a brief look at std.algorithm.joiner but couldn't find
>> anything obvious, maybe I should look deeper into it.
>> ...
>
> I guess it is because of the following:
>
> Eg (similar code occurs two times):
> ...
>      if (_sep.empty)
>      {
>          // Advance to the next range in the
>          // input
>          //_items.popFront();
>          for (;; _items.popFront())
>          {
>              if (_items.empty) return;
>              if (!_items.front.empty) break; // front called
>          }
>          _current = _items.front; // front called again
>          _items.popFront();
>      }
> ...
>
> The code should only call front once for the first non-empty range.
>
> Presumed example fix (though the implementation seems rather clumsy and
> should probably be replaced completely):
>
> ...
>      if (_sep.empty)
>      {
>          // Advance to the next range in the
>          // input
>          //_items.popFront();
>          for (;; _items.popFront())
>          {
>              if (_items.empty) return;
>              _current = _items.front;
>              if (!_current.empty) break;
>          }
>          _items.popFront();
>      }
> ...

Thanks, that's very clear now. I stumbled over it because the function I call loads a lot of data and a factor 2 makes all the difference in runtime. Maybe I should file a bug report or at least a request to improve this.

Stephan
March 13, 2013
On Wed, Mar 13, 2013 at 12:22:48AM +0000, Stephan Schiffels wrote:
> Am 12.03.13 20:22, schrieb Timon Gehr:
> >On 03/12/2013 06:51 PM, Stephan Schiffels wrote:
> >>...
> >>
> >>Thanks, I had a brief look at std.algorithm.joiner but couldn't find
> >>anything obvious, maybe I should look deeper into it.
> >>...
> >
> >I guess it is because of the following:
> >
> >Eg (similar code occurs two times):
> >...
> >     if (_sep.empty)
> >     {
> >         // Advance to the next range in the
> >         // input
> >         //_items.popFront();
> >         for (;; _items.popFront())
> >         {
> >             if (_items.empty) return;
> >             if (!_items.front.empty) break; // front called
> >         }
> >         _current = _items.front; // front called again
> >         _items.popFront();
> >     }
> >...
> >
> >The code should only call front once for the first non-empty range.
> >
> >Presumed example fix (though the implementation seems rather clumsy and should probably be replaced completely):
> >
> >...
> >     if (_sep.empty)
> >     {
> >         // Advance to the next range in the
> >         // input
> >         //_items.popFront();
> >         for (;; _items.popFront())
> >         {
> >             if (_items.empty) return;
> >             _current = _items.front;
> >             if (!_current.empty) break;
> >         }
> >         _items.popFront();
> >     }
> >...
> 
> Thanks, that's very clear now. I stumbled over it because the function I call loads a lot of data and a factor 2 makes all the difference in runtime. Maybe I should file a bug report or at least a request to improve this.
[...]

I was the one responsible for the recent rewrite of joiner; I'll think over how to fix it so that .front is only evaluated once.


T

-- 
Старый друг лучше новых двух.
March 13, 2013
H. S. Teoh:

> I was the one responsible for the recent rewrite of joiner; I'll think
> over how to fix it so that .front is only evaluated once.

This is a strictly related issue, about filter:
http://d.puremagic.com/issues/show_bug.cgi?id=9674

Bye,
bearophile