March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Szymon Gatner | On Sunday, 23 March 2014 at 09:34:28 UTC, Szymon Gatner wrote:
> On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
>>
>> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
>
> IMO: yes. Logic of empty() sould be const and not have side effects.
>
>>
>> If this is true, extra logic will need to be added to r.front in many cases.
>>
>> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>>
>> If true, this means that r.front will have to cache a copy in many cases.
>
> Yes, all true. Not sure if there is something like that in Phobos but if for example you had RandomValueRange, ever call to front() should return the same random number until popFront() is called at which point internal front variable is being recalculated and cached.
>
> Since only popFront() is non-const, it is there where all the magic/mutation should happen.
>
> In my code I also have CachingRange which calls front() on underlying range the first time and then stores result internally. I use it for ranges which generate front() on the fly and I knot it is expensive. I could cache that calculation directly in that underlying range but CachingRange is reusable.
I tend to think about pair of C++ iterators, which are generalization of pointers. So for C pointers 'first' and 'last':
*first -> front()
first != last -> empty()
++first -> popFont()
And I stick to semantics.
|
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
On 23/03/14 08:53, Jonathan M Davis wrote: > But again, front and empty should normally function as if they were variables. > They should be property functions and should be pure (or at least act like > they're pure). I'm sure that a _lot_ of code will break if that isn't > followed. There are some not-very-nice exceptions to that in std.random, where in many of the range types you have stuff along the following lines: auto front() @property { if (notInitialized) { init(); } return _value; } see e.g. MersenneTwister, RandomSample, and others. It's all a nasty consequence of that RNGs-as-value-types problem we've discussed previously. However, the class-based std.random2 that is being discussed on the announce list fixes that: http://forum.dlang.org/thread/cyytvhixkqlbwkmiugex@forum.dlang.org > In general though, I think that most of us would agree that front and empty > should be treated as properties - i.e. as if they were variables - and that > they should have try to stick to those semantics as closely as possible. > Ranges that stray from that seriously risk not working with a lot of range- > based code. Yes, absolutely. |
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On 03/23/2014 01:50 AM, Walter Bright wrote:
>
> 2. Can r.front be called n times in a row? I.e. is calling front()
> destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.
An analogous problem exists and is more severe for RandomAccessRanges.
import std.algorithm, std.range;
class Obj{ this(int){} }
void main(){
auto x=[1,2,3].map!(a=>new Obj(a));
auto a=x.front;
auto b=x.front;
auto c=x[0];
auto d=x[0];
assert(a is b); // fail
assert(b is c); // fail
assert(c is d); // fail
}
|
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
> It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:
>
> while (!r.empty) {
> auto e = r.front;
> ... do something with e ...
> r.popFront();
> }
>
> no argument there. But there are two issues:
>
> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
>
> If this is true, extra logic will need to be added to r.front in many cases.
>
> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.
Is InputRange supposed to be a one-pass range and am I supposed to able to use InputRange as a lightweight wrapper for a stream? If the answer is yes, then I think the fundamental issue is that the empty-front-popFront interface is not optimal for something like InputRange. But given that's the interface we have, I think that InputRange's front must be allowed to be destructive because the stream it could potentially be wrapping is destructive.
|
March 23, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On Sunday, 23 March 2014 at 07:54:12 UTC, Jonathan M Davis wrote:
>
> If calling front were destructive, that would break a lot of code.
I thought that "breaking existing code" meant either "causing existing code do something it wasn't supposed to do" or "causing existing code not compile", but you're using it in the meaning of "making existing code not conform to the language specification". Are you sure that's the correct use of the expression?
|
March 24, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Tommi | On Sun, 23 Mar 2014 09:00:17 -0400, Tommi <tommitissari@hotmail.com> wrote:
> On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
>> It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:
>>
>> while (!r.empty) {
>> auto e = r.front;
>> ... do something with e ...
>> r.popFront();
>> }
>>
>> no argument there. But there are two issues:
>>
>> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
>>
>> If this is true, extra logic will need to be added to r.front in many cases.
>>
>> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>>
>> If true, this means that r.front will have to cache a copy in many cases.
>
> Is InputRange supposed to be a one-pass range and am I supposed to able to use InputRange as a lightweight wrapper for a stream? If the answer is yes, then I think the fundamental issue is that the empty-front-popFront interface is not optimal for something like InputRange. But given that's the interface we have, I think that InputRange's front must be allowed to be destructive because the stream it could potentially be wrapping is destructive.
A range interface only works for a buffered stream, which naturally allows caching.
-Steve
|
March 24, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Szymon Gatner | On Sun, 23 Mar 2014 05:34:26 -0400, Szymon Gatner <noemail@gmail.com> wrote:
> On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
>>
>> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
>
> IMO: yes. Logic of empty() sould be const and not have side effects.
Here is the crux of the issue. I think aside from Walter's question, we have situations where it is expensive to construct each value of a range. For a range that is never used, that is an expense that you don't have to pay.
There was a post a little while ago about how File.byLine reads the first line on construction. Consider the case where stdin.byLine needs to be constructed, but not *iterated* before writing some information to stdout. This gives us a predicament that the first line has to be available before you can output any instructions. As we all know, stdin will block on first read, unless you piped in a file or something. This makes an interactive program simply incorrect.
From the library point of view, caching the first line on construction is the most sensible thing -- This allows empty, front, and popFront to all be consistent across the entire iteration. Making empty or front do something different on the first access requires awkward machinery (a boolean indicating it should do so, plus a check every call thereafter). But from the user's point of view, sometimes you want lazy access to the first element, for logical reasons.
I think in light of such possibilities, empty and front should not have to be const or pure. Logically, it can be viewed that way, but not technically.
-Steve
|
March 24, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On Sunday, 23 March 2014 at 11:52:53 UTC, Timon Gehr wrote: > On 03/23/2014 01:50 AM, Walter Bright wrote: >> >> 2. Can r.front be called n times in a row? I.e. is calling front() >> destructive? >> >> If true, this means that r.front will have to cache a copy in many cases. > > An analogous problem exists and is more severe for RandomAccessRanges. > > import std.algorithm, std.range; > class Obj{ this(int){} } > void main(){ > auto x=[1,2,3].map!(a=>new Obj(a)); > auto a=x.front; > auto b=x.front; > auto c=x[0]; > auto d=x[0]; > assert(a is b); // fail > assert(b is c); // fail > assert(c is d); // fail > } I have an open proposal for a "cache" range adaptor that somewhat eliviates the problem: https://github.com/D-Programming-Language/phobos/pull/1364 But it restricts the range to bidirectional (for the reasons above). Actually, it's mimited to Bidir, but if you *do* use it as bidirectional, then 1 element will be evaluated twice. So I'm wondering if it might not be better to simply restrict it to forward... |
March 24, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Szymon Gatner | On Sunday, 23 March 2014 at 09:38:43 UTC, Szymon Gatner wrote:
> On Sunday, 23 March 2014 at 09:26:53 UTC, w0rp wrote:
>> I understand it like this.
>>
>> * empty - Are there no more values?
>> * front - Get me the current value.
>> * popFront - Advance to the next value.
>
> That is correct.
>
>>
>> In terms of how I implement an InputRange in general, I typically end up with this.
>>
>> * empty - Advance and cache "current value," return true if we ran out of values.
>> * front - enforce(!empty), which in turn caches the current value, which we then return.
>> * popFront - enforce(!empty), clear the cached value so we can later advance.
>>
>> So .front gives you the same thing again and again until you call popFront, you could indeed call .front before .empty, but you may get an exception. This obviously isn't how I implement all InputRanges, as there are better ways to write other ranges, such as ranges which operate on sets of integers or wrap arrays. This is just my last resort general case InputRange implementation.
>>
>> .front assignment obviously replaces the cached value.
>
> That is not consistent with the first part of your reply and is incorrect imho. Calling empty() twice should not destroy anything, even according to your understanding. Neither should front(). User should be able to call them as many times as he wishes getting same value every time. Only popFront() mutate the range.
You read wrong. Say you have this sequence of three numbers and 'x' represents the cached value being empty, for a range 'r.'
---
1 2 3
x
r.empty
1 2 3
^
r.empty
1 2 3
^
r.front == 1
r.popFront()
1 2 3
x
r.empty
1 2 3
^
---
front and popFront each enforce(!empty) first, so this works, etc.
---
1 2 3
x
r.popFront()
# enforce(!empty);
1 2 3
^
# Clear the cached value.
1 2 3
x
---
So it works for any sequence of values, and you can call .empty and .front as much as you want without changing the result. It just so happens that it's actually .empty which does the work of actually fetching the next value. There are better ways to implement this for certain ranges, but this works for anything.
|
March 24, 2014 Re: protocol for using InputRanges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
> It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:
>
> while (!r.empty) {
> auto e = r.front;
> ... do something with e ...
> r.popFront();
> }
>
> no argument there. But there are two issues:
>
> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
>
> If this is true, extra logic will need to be added to r.front in many cases.
>
> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.
I think there is one design mistake with current InputRange rules that makes usage so inconsistent. We have `empty` but don't have any distinct `not yet started` state. If calling `popFront` at least once was required before accessing `front`, I can't imagine the case where having any non-trivial code in `front` would have been necessary.
|
Copyright © 1999-2021 by the D Language Foundation