state of ranges

What's the current state of the range specification? For instance, do we have to call empty before calling front? Can front provide different answers each time (e.g. map)? Were these kinds of issues resolved or not? Is Phobos respecting some consensual protocol or are there still gotchas?

On Tuesday, 12 December 2017 at 18:32:11 UTC, Luís Marques wrote: > What's the current state of the range specification? -> https://dlang.org/phobos/std_range_primitives.html#isInputRange > For instance, do we have to call empty before calling front? Spec: "r.front can be legally evaluated if and only if evaluating r.empty has, or would have, equaled false." > Can front provide different answers each time (e.g. map)? Spec: "r.front evaluated multiple times, without calling r.popFront, or otherwise mutating the range object or the underlying data, yields the same result for every evaluation." > Were these kinds of issues resolved or not? > Is Phobos respecting some consensual protocol or are there still gotchas? Yes, otherwise please open a bug.

On Tuesday, 12 December 2017 at 18:40:51 UTC, Seb wrote: > Spec: "r.front can be legally evaluated if and only if evaluating r.empty has, or would have, equaled false." > > Spec: "r.front evaluated multiple times, without calling r.popFront, or otherwise mutating the range object or the underlying data, yields the same result for every evaluation." Ok, so the consensus was to make ranges easy to use. Was there any progress on mechanisms to avoid the possible performance penalties, and to make the implementation side more regular?

On Tuesday, 12 December 2017 at 18:58:11 UTC, Luís Marques wrote: > Ok, so the consensus was to make ranges easy to use. Was there any progress on mechanisms to avoid the possible performance penalties, and to make the implementation side more regular? Have you noticed performance problems or implementation side irregularities? It would be handy for us to make benchmarks (if nobody has done it yet?), but my feeling is that it doesn't have a measurable impact on performance in the context of a single iterator compared to opApply. Probably has a greater impact when you're stacking things together, since ranges can use design by introspection to do less work.

On Tuesday, 12 December 2017 at 23:25:19 UTC, Neia Neutuladh wrote: > Have you noticed performance problems or implementation side irregularities? Well, I was referring to things like in front() having to use code such as `if(!inited) ...; return value`, which was discussed here in the forum in the past. The performance side hasn't been too bad for me personally (so far...), but I started this thread exactly because I wanted to know if I could move some code to the empty() method of the range, as that would be more convenient and performant. So you could say I mostly noticed the "regularity" part.

On Tuesday, 12 December 2017 at 23:43:19 UTC, Luís Marques wrote: > Well, I was referring to things like in front() having to use code such as `if(!inited) ...; return value I think you only have to do that if you have some custom pointer arithmetic and you want to make sure it remains memory safe. However, in the general case you don't need to do that. front() can assume that something can be found, so it may as well fetch the value without checking and rely on built-in array bounds checking and null behaviour for memory safety. empty() is the one which should check those things manually.

On 12/13/2017 10:13 AM, Dukc wrote: > front() can assume that something can be found, so it may as well fetch the value without checking and rely on built-in array bounds checking and null behaviour for memory safety. empty() is the one which should check those things manually. No. As Seb has quoted, `front` can't assume that `empty` has been called before. For a well-behaved range, `front` must work the same whether you've called `empty` or not (given that the range isn't actually empty).

On 12/12/17 6:43 PM, Luís Marques wrote: > On Tuesday, 12 December 2017 at 23:25:19 UTC, Neia Neutuladh wrote: >> Have you noticed performance problems or implementation side irregularities? > > Well, I was referring to things like in front() having to use code such as `if(!inited) ...; return value`, which was discussed here in the forum in the past. The proper place to generate the first element, IMO, is in the constructor. > > The performance side hasn't been too bad for me personally (so far...), but I started this thread exactly because I wanted to know if I could move some code to the empty() method of the range, as that would be more convenient and performant. So you could say I mostly noticed the "regularity" part. Note that there is no compile-time test to figure out when you perform your operations, so you can cheat if you want. I don't think there's a requirement for empty not to do any work, it just has to return the same value each time. -Steve

On Wednesday, December 13, 2017 11:33:35 Steven Schveighoffer via Digitalmars-d wrote: > I don't think there's a requirement for empty not to do any work, it just has to return the same value each time. IIRC, when this was discussed previously, it was decided that you really couldn't do work in empty. The problem is that under some circumstances at least, it's perfectly legitimate to skip calls to empty (e.g. if you already know that there's plenty of elements left in the range, because you've called save previously and have iterated through at least that many elements in another copy of the range or because the range has length). it was decided that you really couldn't do work in empty. It might be legitimate if your range was not a forward range, but it would only work under cirmustances where it would be impossible to know whether there are elements left in the range or not without calling empty - which is not the case when dealing with forward ranges. - Jonathan M Davis

December 13, 2017

Re: state of ranges

Posted by H. S. Teoh

Permalink

H. S. Teoh

Permalink

On Wed, Dec 13, 2017 at 12:33:02PM -0700, Jonathan M Davis via Digitalmars-d wrote:
> On Wednesday, December 13, 2017 11:33:35 Steven Schveighoffer via Digitalmars-d wrote:
> > I don't think there's a requirement for empty not to do any work, it just has to return the same value each time.
> 
> IIRC, when this was discussed previously, it was decided that you really couldn't do work in empty. The problem is that under some circumstances at least, it's perfectly legitimate to skip calls to empty (e.g. if you already know that there's plenty of elements left in the range, because you've called save previously and have iterated through at least that many elements in another copy of the range or because the range has length). it was decided that you really couldn't do work in empty. It might be legitimate if your range was not a forward range, but it would only work under cirmustances where it would be impossible to know whether there are elements left in the range or not without calling empty - which is not the case when dealing with forward ranges.
[...]

Basically, it comes down to (1) doing the least amount of work necessary to get your job done; and (2) programming defensively, i.e., assume the worst about user code, or, don't assume anything more than what the API dictates.

(1) From the range user's POV, the range API essentially says that if .empty is true, then you can call .front and .popFront.  Doing the least amount of work means you don't have to call .empty if you already know beforehand it would have returned false. Similarly, it's legal to call .popFront without calling .front (or .empty) in between, if you already know beforehand .empty won't become true in the meantime.

(2) From the range author's POV, assuming the worst means not assuming that user code will follow a particular sequence of calls to the range API, other than what is required by the API itself. That is, if your .empty would have returned false, then assume that somebody will call .front or .popFront without calling .empty. Don't assume that someone will always call .empty first.

(OTOH, the range API does require that .empty be false before .front and .popFront are called, so you shouldn't need to check .empty yourself in the implementation of .front and .popFront, i.e., avoid doing more work than necessary.)

Whether you do work in .empty or .front is not really relevant, as long
as ANY sequence of valid range API calls will always produce the same
result.  And by any sequence of valid calls, of course, I include
sequences that don't include .empty if somehow the user code already
knows beforehand when .empty will become true.  I.e., the sequence
{ r.popFront; r.popFront; r.popFront; ... ; .front } ought to produce
the correct result, as long as .empty never becomes true in the meantime
(and .empty does not need to be called explicitly).

If you can do work in .empty (or .front) while still fulfilling this requirement, then great.  If not, perhaps you should find a different implementation strategy.

Everything else is just pudding.

T

-- 
Guns don't kill people. Bullets do.

Forums