Thread overview: cache() method - should it cache on popFront() or on first call to front()?
Piotr Mitana (Feb 10, 2020), aliak (Feb 11, 2020), Dukc (Feb 11, 2020)
February 10, 2020
I have recently found std.algorithm.iteration.cache() function's behavior a little bit confusing. I'll demonstrate it with this example:

import std.algorithm.iteration : map;
import std.algorithm.searching : until;
import std.array : array;
import std.range : sequence;
import std.stdio : writefln, writeln;
import std.typecons : No;

void main()
{
    auto arr = sequence!"n"
        .map!(n => { writefln("Mapped: %d", n); return n; }())
        .cache
        .until(3, No.openRight)
        .array;

    arr.writeln;
}

What would you expect such a code to write? Actually what it writes is:

Mapped: 0
Mapped: 1
Mapped: 2
Mapped: 3
Mapped: 4
[0, 1, 2, 3]

What was unexpected to me is that it still passed 4 to the map function, despite the fact that it should be left out by until. Later I read that cache calculates the range's element eagerly when popFront() is called.

However, when until() comes immediately after, the next element is calculated needlessly. This could be avoided by caching not when popFront() is called, but when front() is called for the first time on the new element.

I solved the problem with the following simple template:

import std.range.primitives : ElementType, isInputRange;
import std.typecons : Nullable, nullable;

// Like cache(), but evaluates source.front lazily, on the first call to
// front() for each element, instead of eagerly in popFront().
auto cacheOnAccess(Range)(Range source) if(isInputRange!Range)
{
	struct CacheOnAccess
	{
		Range source;
		Nullable!(ElementType!Range) cachedFront;

		this(Range source)
		{
			this.source = source;
		}

		@property ElementType!Range front()
		{
			if(cachedFront.isNull)
				cachedFront = source.front.nullable;

			return cachedFront.get;
		}

		@property bool empty()
		{
			return source.empty;
		}

		void popFront()
		{
			source.popFront();
			cachedFront.nullify();
		}
	}

	return CacheOnAccess(source);
}
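
Plugged into the original pipeline (imports repeated so the snippet stands alone, using the cacheOnAccess helper above in place of cache), the element past the sentinel should never be evaluated, so the "Mapped: 4" line no longer appears:

import std.algorithm.iteration : map;
import std.algorithm.searching : until;
import std.array : array;
import std.range : sequence;
import std.stdio : writefln, writeln;
import std.typecons : No;

void main()
{
    auto arr = sequence!"n"
        .map!(n => { writefln("Mapped: %d", n); return n; }())
        .cacheOnAccess              // the helper above, instead of .cache
        .until(3, No.openRight)
        .array;

    arr.writeln;  // [0, 1, 2, 3] - and "Mapped: 4" is not printed
}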

Maybe a change in Phobos could be considered: either adding a parameter to the cache() template so that the moment of caching can be chosen, or adding another function that caches when the element is accessed for the first time.
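
As an illustration of the first option, here is a minimal sketch of what such a parameter could look like, written as a wrapper that dispatches between the existing cache and the cacheOnAccess helper above. The names cacheWithFlag and eagerCaching are invented for this sketch; Phobos's cache currently takes no such flag:

import std.algorithm.iteration : cache;
import std.range.primitives : isInputRange;
import std.typecons : Flag, Yes, No;

// Hypothetical: Yes.eagerCaching keeps the current behaviour (evaluate in
// popFront), No.eagerCaching defers evaluation to the first front() access.
auto cacheWithFlag(Flag!"eagerCaching" eager = Yes.eagerCaching, Range)(Range range)
if (isInputRange!Range)
{
    static if (eager)
        return range.cache;         // current Phobos cache
    else
        return range.cacheOnAccess; // lazy variant defined above
}

The pipeline above would then read .cacheWithFlag!(No.eagerCaching) to get the lazy behaviour.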
February 11, 2020
On Monday, 10 February 2020 at 10:43:33 UTC, Piotr Mitana wrote:
> I have recently found std.algorithm.iteration.cache() function's behavior a little bit confusing. I'll demonstrate it with this example:
>
> [...]

Hi! Nice catch! I'm not sure if the behaviour is intentional (it sounds off). So maybe add this to the issue tracker? You can add it here: https://issues.dlang.org/
February 11, 2020
On 2/10/20 5:43 AM, Piotr Mitana wrote:
> I have recently found std.algorithm.iteration.cache() function's behavior a little bit confusing. I'll demonstrate it with this example:
>
> [...]

If you do this, it means that front is modifying the range. In general, a range should be modified only by popFront; empty and front should not modify it.

However, the entire point of cache is to not evaluate the range's primitives more than once, so I think it's a reasonable implementation detail to delay the evaluation until front is actually called. I'd put it as a compile-time parameter to the cache function.

The reason I would make this a parameter is that the range may be implemented to alter the front element across multiple copies of the range, which would affect the behavior in strange ways compared to eager evaluation.
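
To make that concrete, here is one small sketch of how the difference can show up, using the cacheOnAccess wrapper above: the lazy variant runs the side-effecting function once per copy of the range, whereas the eager cache evaluates it once and copies the stored value:

import std.algorithm.iteration : map;
import std.stdio : writeln;

void main()
{
    auto r = [1, 2, 3]
        .map!(n => { writeln("evaluated ", n); return n; }())
        .cacheOnAccess;   // lazy wrapper from the original post

    auto copy = r;        // copies the wrapper, including its still-empty cache
    auto a = r.front;     // prints "evaluated 1"
    auto b = copy.front;  // prints "evaluated 1" again - each copy re-evaluates
}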

-Steve
February 11, 2020
On Monday, 10 February 2020 at 10:43:33 UTC, Piotr Mitana wrote:
> I have recently found std.algorithm.iteration.cache() function's behavior a little bit confusing. I'll demonstrate it with this example:
>
> [...]

Sounds good. std.range.tee [1], which also always executes its function only once per element, already does this. By default it executes the function on popFront() (but on the popped value, not the new one, which frankly surprised me), but it has a pipeOnPop flag which, if set to No, makes it execute on the first invocation of front instead. If we follow that example, it means adding a template parameter.

1: https://dlang.org/phobos-prerelease/std_range.html#tee
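
For reference, a small sketch of the difference in timing (two-element input, logging from the tee function and from the loop body): with the default Yes.pipeOnPop the function fires inside popFront(), after the loop body has seen the element; with No.pipeOnPop it fires on the first access to front, before the body:

import std.range : tee;
import std.stdio : writeln;
import std.typecons : No;

void main()
{
    // Default (Yes.pipeOnPop): "tee saw n" prints after "body n".
    foreach (n; [1, 2].tee!(x => writeln("  tee saw ", x)))
        writeln("body ", n);

    // No.pipeOnPop: "tee saw n" prints before "body n".
    foreach (n; [1, 2].tee!(x => writeln("  tee saw ", x), No.pipeOnPop))
        writeln("body ", n);
}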