Why does `filterBidirectional` exist, and why isn't it called `filter`?

Why does `filterBidirectional` exist, and why isn't it called `filter`?
Mar 09, 2023 FeepingCreature
Mar 09, 2023 Steven Schveighoffer
Mar 09, 2023 FeepingCreature
Mar 09, 2023 Steven Schveighoffer
Mar 09, 2023 Paul Backus
Mar 09, 2023 Monkyyy
Mar 09, 2023 jmh530
Mar 09, 2023 Steven Schveighoffer
Mar 09, 2023 Paul Backus
Mar 09, 2023 Steven Schveighoffer
Mar 10, 2023 Ogi
Mar 10, 2023 Paul Backus
Mar 11, 2023 Timon Gehr
Mar 12, 2023 Tejas
Mar 12, 2023 Timon Gehr
Mar 12, 2023 JG
Apr 14, 2023 Atila Neves
Apr 18, 2023 Lata chaudhary

March 09, 2023

Posted by FeepingCreature

Permalink

FeepingCreature

Permalink

Yes I know the stated reason, to save time initializing the range in filter.

You know what I did because we didn't know filterBidirectional existed?

I literally walked the entire range returned by filter to find the last matching element.

filter should expose back. By all means, let back lazily initialize, but I don't understand how filterBidirectional was ever considered acceptable. Since when do we sacrifice ease of use for performance? I mean, since 2011 apparently. - This is bad ergonomics, bad discoverability and bad API design. filterBidirectional delenda est.

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by Steven Schveighoffer
in reply to FeepingCreature

Permalink

Steven Schveighoffer

Posted in reply to FeepingCreature

Permalink

On 3/9/23 3:06 AM, FeepingCreature wrote:

Yes I know the stated reason, to save time initializing the range in filter.

You know what I did because we didn't know filterBidirectional existed?

I literally walked the entire range returned by filter to find the last matching element.

Why not rng.retro.filter?

This has been a constant debate -- should ranges be allowed to lazily initialize on the first call to front (or back)?

I'd say no. back/front shouldn't be modifying operations. I dislike the idea of storing a "was initialized" flag and having to check that on every call.

That being said, there's no formal requirement for it. It's just a convention I think Phobos should stick to for its included ranges.

-Steve

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by FeepingCreature
in reply to Steven Schveighoffer

Permalink

FeepingCreature

Posted in reply to Steven Schveighoffer

Permalink

On Thursday, 9 March 2023 at 13:16:07 UTC, Steven Schveighoffer wrote:

On 3/9/23 3:06 AM, FeepingCreature wrote:

Yes I know the stated reason, to save time initializing the range in filter.

You know what I did because we didn't know filterBidirectional existed?

I literally walked the entire range returned by filter to find the last matching element.

Why not rng.retro.filter?

retro changes the semantics of the range, which often represents business data. Thinking about the underlying model is complicated enough without throwing in "But consider: what if the elements were in reverse order".

> >

This has been a constant debate -- should ranges be allowed to lazily initialize on the first call to front (or back)?

I'd say no. back/front shouldn't be modifying operations. I dislike the idea of storing a "was initialized" flag and having to check that on every call.

That being said, there's no formal requirement for it. It's just a convention I think Phobos should stick to for its included ranges.

-Steve

Sure: so precompute both back and front, every filter call. No caching. This isn't done right now because the language presumes, arbitrarily, that you won't care about back. That's what I meant by sacrificing convenience for performance. Just do what filterBidirectional does, every time, and have the current filter be the odd one out. filterForward or whatever.

(I think the best way to go, in truth, is to preinitialize front and cache back. But that solution, much as I think it is optimal for all involved interests, is very hard to love.)

That said, I think the range API has a fundamental problem here. filter pre-initializes front on every access, because it reasonably presumes that we will then want to query it. However, it cannot rely on this. If we were guaranteed that every access to popFront would alternate with at least one access to front, and conversely back with popBack, we could just reduce popBack to nextRange.popBack and do the rest of the popping in back! Then in the fast case (one access to each, in turn) we would not waste any check. And the actual filter call would always be constant-time.

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by Paul Backus
in reply to Steven Schveighoffer

Permalink

Paul Backus

Posted in reply to Steven Schveighoffer

Permalink

On Thursday, 9 March 2023 at 13:16:07 UTC, Steven Schveighoffer wrote:

This has been a constant debate -- should ranges be allowed to lazily initialize on the first call to front (or back)?

I'd say no. back/front shouldn't be modifying operations. I dislike the idea of storing a "was initialized" flag and having to check that on every call.

That being said, there's no formal requirement for it. It's just a convention I think Phobos should stick to for its included ranges.

I think probably you have to go on a case-by-case basis. File.byLine, for example, does not and should not precompute front, because doing so is potentially very expensive. OTOH, iota does and should precompute front, because doing so is essentially free and makes the implementation much simpler.

In general, I think we should err on the side of laziness, because it makes range composition easier. For example, when the user writes something like chain(a(), b(), c()) or choose(cond, a(), b()), they probably do not want to spend unnecessary cycles precomputing b().front and c().front.

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by Monkyyy
in reply to Paul Backus

Permalink

Monkyyy

Posted in reply to Paul Backus

Permalink

On Thursday, 9 March 2023 at 17:07:40 UTC, Paul Backus wrote:

On Thursday, 9 March 2023 at 13:16:07 UTC, Steven Schveighoffer wrote:

[...]

In general if someone would want to precompute a range shouldnt front just be a varible?

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by jmh530
in reply to Paul Backus

Permalink

jmh530

Posted in reply to Paul Backus

Permalink

On Thursday, 9 March 2023 at 17:07:40 UTC, Paul Backus wrote:

[snip]

Or, provide the option to allow the user to make the choice at compile-time.

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by Steven Schveighoffer
in reply to FeepingCreature

Permalink

Steven Schveighoffer

Posted in reply to FeepingCreature

Permalink

On 3/9/23 11:44 AM, FeepingCreature wrote:

On Thursday, 9 March 2023 at 13:16:07 UTC, Steven Schveighoffer wrote:

On 3/9/23 3:06 AM, FeepingCreature wrote:

Yes I know the stated reason, to save time initializing the range in filter.

You know what I did because we didn't know filterBidirectional existed?

I literally walked the entire range returned by filter to find the last matching element.

Why not rng.retro.filter?

Well, walking the entire range to find the last one doesn't imply that ordering matters at all. You are just looking for the last one.

> > >

This has been a constant debate -- should ranges be allowed to lazily initialize on the first call to front (or back)?

I'd say no. back/front shouldn't be modifying operations. I dislike the idea of storing a "was initialized" flag and having to check that on every call.

That being said, there's no formal requirement for it. It's just a convention I think Phobos should stick to for its included ranges.

I would be fine with that. But in truth, this is simply an acceptance of the status quo, just with different names.

(I think the best way to go, in truth, is to preinitialize front and cache back. But that solution, much as I think it is optimal for all involved interests, is very hard to love.)

I prefer to have control over all the capabilities. If there is going to be significant cost for something I won't actually use, I'd rather avoid it.

Allowing composability might be the way to go.

front and empty are semantically non-mutating properties. Calling front and/or empty as many times as you want should not alter any values until you call popFront.

I prefer them to be actually non-mutating as well.

Delaying construction seems like the better option, and having such a wrapper solves all those problems for those who want lazy, without imposing on users who are intending to just use a range as it's intended.

-Steve

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by Steven Schveighoffer
in reply to Paul Backus

Permalink

Steven Schveighoffer

Posted in reply to Paul Backus

Permalink

On 3/9/23 12:07 PM, Paul Backus wrote:

On Thursday, 9 March 2023 at 13:16:07 UTC, Steven Schveighoffer wrote:

This has been a constant debate -- should ranges be allowed to lazily initialize on the first call to front (or back)?

I'd say no. back/front shouldn't be modifying operations. I dislike the idea of storing a "was initialized" flag and having to check that on every call.

That being said, there's no formal requirement for it. It's just a convention I think Phobos should stick to for its included ranges.

I disagree. If you construct a byLine you intend to iterate by lines. Fetching the first line is reasonable and expected (by me anyway).

The big issue with byLine's iteration mechanism is whether popFront should consume the next line or not. But that is a semantic expectation that people can differ on.

Again, as I said elsewhere, using wrappers to compose the desired behavior should be what we use, and all ranges should just always have the same expected semantics.

-Steve

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by Paul Backus
in reply to Steven Schveighoffer

Permalink

Paul Backus

Posted in reply to Steven Schveighoffer

Permalink

On Thursday, 9 March 2023 at 19:48:21 UTC, Steven Schveighoffer wrote:

On 3/9/23 12:07 PM, Paul Backus wrote:

I disagree. If you construct a byLine you intend to iterate by lines.

I gave examples in my post of situations where the user may construct a range before deciding whether or not to iterate it. If you find those examples unconvincing, I'd be happy to hear why.

March 09, 2023

Re: Why does `filterBidirectional` exist, and why isn't it called `filter`?

Posted by Steven Schveighoffer
in reply to Paul Backus

Permalink

Steven Schveighoffer

Posted in reply to Paul Backus

Permalink

On 3/9/23 4:16 PM, Paul Backus wrote:

On Thursday, 9 March 2023 at 19:48:21 UTC, Steven Schveighoffer wrote:

On 3/9/23 12:07 PM, Paul Backus wrote:

I disagree. If you construct a byLine you intend to iterate by lines.

I gave examples in my post of situations where the user may construct a range before deciding whether or not to iterate it. If you find those examples unconvincing, I'd be happy to hear why.

choose(cond, a(), b()) could evaluate the parameters lazily, and then it doesn't matter if a() or b() precomputes. In fact, the implementation ignores one of them anyway.

For chain, it's more complex, because you don't know when the initialization will need to happen. But I'm also fine with that consequence in exchange for simplification of expectations.

Let's say it's chain(File("foo.txt").byLine, File("bar.txt").byLine). It's going to be expensive to open the file, but not really any more expensive to read the first data into the buffer (the disk cache will have it ready for you). So you don't save anything significant. However, you lose on the congnitive load for all uses of byLine, and indeed all ranges now have a semantic parameter that you have to either "just know" or read the docs/implementation.

We already have very complex situations that happen because of the interactions between range initialization and iteration. Most of them happen when you expect lazy or eager, and the other is chosen. I think we should just pick one, and say, it's always that way, and if you want something different, you need wrappers.

-Steve

Top | Forum index | About this forum

Forums