March 23, 2014 protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is: while (!r.empty) { auto e = r.front; ... do something with e ... r.popFront(); } no argument there. But there are two issues: 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first? If this is true, extra logic will need to be added to r.front in many cases. 2. Can r.front be called n times in a row? I.e. is calling front() destructive? If true, this means that r.front will have to cache a copy in many cases. |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.
If `front` was destructive there would be little point in having it separate from `popFront`. I think it must be non-destructive to make sense.
|
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote: > auto e = r.front; Remember that front is a getter property, which means it should work like a variable. Typically, reading a variable is not destructive and needs no preparation. There's exceptions to the rule, but they almost always work this way, so ranges should too. > If true, this means that r.front will have to cache a copy in many cases. Indeed, in many of my quick input ranges, I just make front a regular variable and popFront updates it to the next item. |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
> It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:
>
> while (!r.empty) {
> auto e = r.front;
> ... do something with e ...
> r.popFront();
> }
>
> no argument there. But there are two issues:
>
> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
>
> If this is true, extra logic will need to be added to r.front in many cases.
>
> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.
It would be nice to have ranges full stop described. And how to use/make them in D. I have yet to learn them because of no official documentation on them.
On this topic we also need to work on common patterns and creating documentation on e.g. the site or wiki for it. Saw that we do have a bit in the wiki under tutorials. Maybe if I get some time I'll work on that.
|
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, March 22, 2014 17:50:34 Walter Bright wrote: > It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is: > > while (!r.empty) { > auto e = r.front; > ... do something with e ... > r.popFront(); > } > > no argument there. But there are two issues: > > 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first? > > If this is true, extra logic will need to be added to r.front in many cases. You definitely don't have to call empty before calling front if you know that it's not empty. Both front and empty should normally be pure (or at least act that way) and essentially act like variables. In most cases, it works best for the work of the range to go in popFront. The exception is when you're dealing with a random-access range, since then any element could be accessed, making it so that you can't be doing the work in popFront. I think that we have a general agreement on this based on previous discussions, though it's certainly not unanimous. > 2. Can r.front be called n times in a row? I.e. is calling front() > destructive? > > If true, this means that r.front will have to cache a copy in many cases. If calling front were destructive, that would break a lot of code. It's probably true that most range-based code should avoid calling front multiple times (in case front has to do more work than just return the value as well as to avoid copying the result if that happens on every call), though if front is auto ref, it could be more efficient to call it multiple times. So, it's not entirely clear-cut. But again, front and empty should normally function as if they were variables. They should be property functions and should be pure (or at least act like they're pure). I'm sure that a _lot_ of code will break if that isn't followed. There are corner cases which can get a bit mucky though - e.g. auto a = map!(to!string)(range); In this case, front is pure, but it returns a new value each time (albeit a value that's equal each time until popFront is called). And there's no real way to fix that if the resulting range is random access (though if it weren't, the work could go in popFront, which _would_ make it so that front always returned the same result). And there have been arguments over whether the result of front should be valid after popFront has been called (i.e. whether it's transient or not). A lot of code assumes that it will be, but we have some nasty exceptions (e.g. std.stdio.ByLine) - typically because front's a buffer which gets reused. IIRC, in those cases, Andrei favored saying that input ranges that weren't forward ranges could have a transient front but that forward ranges couldn't (which I tend to agree with, though I'd prefer that _no_ ranges have transient fronts, since it can really cause problems - e.g. std.array.array not working). I don't think that a consensus was reached on that though, since a few folks really liked using transient fronts with more complicated ranges. In general though, I think that most of us would agree that front and empty should be treated as properties - i.e. as if they were variables - and that they should have try to stick to those semantics as closely as possible. Ranges that stray from that seriously risk not working with a lot of range- based code. - Jonathan M Davis |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Am Sat, 22 Mar 2014 17:50:34 -0700 schrieb Walter Bright <newshound2@digitalmars.com>: > It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is: > > while (!r.empty) { > auto e = r.front; > ... do something with e ... > r.popFront(); > } > > no argument there. But there are two issues: > > 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first? > > If this is true, extra logic will need to be added to r.front in many cases. Looking at ranges as a user with all the subjectivity it entails, I'd expect .empty to be a read-only property, a @safe-const-pure-nothrow candidate. Actions that are not logically const are commonly encapsulated in methods (as opposed to read-only properties like .empty) and often also carry a verb in their name like "create", "update" or "append". > 2. Can r.front be called n times in a row? I.e. is calling front() destructive? > > If true, this means that r.front will have to cache a copy in many cases. On the other side of it were (IIRC) there are use cases for return-by-ref on .front: o .front gives access to a large struct. Return-by-ref can avoid a copy on the call site, but may result in several .front calls. o .front is a view into some struct which holds system resources and providing a copy requires internals as in the File struct. I both cases you deal with a range over value types that are costly to copy and where "auto e = r.front;" is to be avoided. I'm 100% unsure how to deal with this. -- Marco |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | I understand it like this. * empty - Are there no more values? * front - Get me the current value. * popFront - Advance to the next value. In terms of how I implement an InputRange in general, I typically end up with this. * empty - Advance and cache "current value," return true if we ran out of values. * front - enforce(!empty), which in turn caches the current value, which we then return. * popFront - enforce(!empty), clear the cached value so we can later advance. So .front gives you the same thing again and again until you call popFront, you could indeed call .front before .empty, but you may get an exception. This obviously isn't how I implement all InputRanges, as there are better ways to write other ranges, such as ranges which operate on sets of integers or wrap arrays. This is just my last resort general case InputRange implementation. .front assignment obviously replaces the cached value. |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote: > > 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first? IMO: yes. Logic of empty() sould be const and not have side effects. > > If this is true, extra logic will need to be added to r.front in many cases. > > 2. Can r.front be called n times in a row? I.e. is calling front() destructive? > > If true, this means that r.front will have to cache a copy in many cases. Yes, all true. Not sure if there is something like that in Phobos but if for example you had RandomValueRange, ever call to front() should return the same random number until popFront() is called at which point internal front variable is being recalculated and cached. Since only popFront() is non-const, it is there where all the magic/mutation should happen. In my code I also have CachingRange which calls front() on underlying range the first time and then stores result internally. I use it for ranges which generate front() on the fly and I knot it is expensive. I could cache that calculation directly in that underlying range but CachingRange is reusable. |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to w0rp | I should add that I have implemented some ranges where .front and .popFront are both nothrow, as !empty doesn't "advance and cache" for these ranges and the check is moved into an in{} contract. For these ranges, they tend to behave like arrays with bounds checking, only now the bounds checking is turned off by virtue of -release. |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to w0rp | On Sunday, 23 March 2014 at 09:26:53 UTC, w0rp wrote: > I understand it like this. > > * empty - Are there no more values? > * front - Get me the current value. > * popFront - Advance to the next value. That is correct. > > In terms of how I implement an InputRange in general, I typically end up with this. > > * empty - Advance and cache "current value," return true if we ran out of values. > * front - enforce(!empty), which in turn caches the current value, which we then return. > * popFront - enforce(!empty), clear the cached value so we can later advance. > > So .front gives you the same thing again and again until you call popFront, you could indeed call .front before .empty, but you may get an exception. This obviously isn't how I implement all InputRanges, as there are better ways to write other ranges, such as ranges which operate on sets of integers or wrap arrays. This is just my last resort general case InputRange implementation. > > .front assignment obviously replaces the cached value. That is not consistent with the first part of your reply and is incorrect imho. Calling empty() twice should not destroy anything, even according to your understanding. Neither should front(). User should be able to call them as many times as he wishes getting same value every time. Only popFront() mutate the range. |
Copyright © 1999-2021 by the D Language Foundation