In general, who should do more work: popFront or front? (page 2)

June 15, 2021

Re: In general, who should do more work: popFront or front?

Posted by Paul Backus
in reply to surlymoor

Permalink

Paul Backus

Posted in reply to surlymoor

Permalink

On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:

All my custom range types perform all their meaningful work in their respective popFront methods, in addition to its expected source data iteration duties. The reason I do this is because I swear I read in a github discussion that front is expected to be O(1), and the only way I can think to achieve this is to stash the front element of a range in a private field; popFront would thus also set this field to a new value upon every call, and front would forward to it. (Or front would be the cache itself.)
At the moment, I feel that as long as the stashed front element isn't too "big" (For some definition of big, I guess.), that built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.)

It's a time-space tradeoff. As you say, caching requires additional space to store the cached element. On the other hand, not caching means that you spend unnecessary time computing the next element in cases where the range is only partially consumed. For example:

import std.range: generate, take;
import std.algorithm: each;
import std.stdio: writeln;

generate!someExpensiveFunction.take(3).each!writeln;

Naively, you'd expect that someExpensiveFunction would be called 3 times--but it is actually called 4 times, because generate does its work in its constructor and popFront instead of in front.

On Tue, Jun 15, 2021 at 02:20:11PM +0000, Paul Backus via Digitalmars-d-learn wrote: [...] > It's a time-space tradeoff. As you say, caching requires additional space to store the cached element. On the other hand, *not* caching means that you spend unnecessary time computing the next element in cases where the range is only partially consumed. For example: > > ```d > import std.range: generate, take; > import std.algorithm: each; > import std.stdio: writeln; > > generate!someExpensiveFunction.take(3).each!writeln; > ``` > > Naively, you'd expect that `someExpensiveFunction` would be called 3 times--but it is actually called 4 times, because `generate` does its work in its constructor and `popFront` instead of in `front`. One way to address this is to make the computation lazy: the element is only computed once on demand, and once computed it's cached. But of course, this is probably overkill on many ranges. So the long answer is, it depends. :-) T -- It is not the employer who pays the wages. Employers only handle the money. It is the customer who pays the wages. -- Henry Ford

On 6/15/21 12:24 AM, surlymoor wrote: > All my custom range types perform all their meaningful work in their respective popFront methods, in addition to its expected source data iteration duties. The reason I do this is because I swear I read in a github discussion that front is expected to be O(1), and the only way I can think to achieve this is to stash the front element of a range in a private field; popFront would thus also set this field to a new value upon every call, and front would forward to it. (Or front would be the cache itself.) > At the moment, I feel that as long as the stashed front element isn't too "big" (For some definition of big, I guess.), that built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.) IMO, `front` should not be figuring out which element is next. That is the job of `popFront`. But that doesn't mean `front` cannot do work. map is a prime example. If you use `map` though, you know what you are signing up for. `front` is expected to be called multiple times to get the same data. One thing to think about is that there is a `cache` wrapper range which can store it for you. This way, you give your users the option of caching or not. Each situation is different, and depends on the underlying mechanisms needed to make the range operate. -Steve

Forums