protocol for using InputRanges - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » protocol for using InputRanges

Thread overview

protocol for using InputRanges
Mar 23, 2014 Walter Bright
Mar 23, 2014 Jakob Ovrum
Mar 23, 2014 Adam D. Ruppe
Mar 23, 2014 Rikki Cattermole
Mar 27, 2014 Chris
Mar 28, 2014 Walter Bright
Mar 28, 2014 Chris
Mar 28, 2014 w0rp
Mar 28, 2014 Chris
Mar 28, 2014 Regan Heath
Mar 28, 2014 Chris
Mar 28, 2014 Regan Heath
Mar 31, 2014 Steven Schveighoffer
Mar 23, 2014 Jonathan M Davis
Mar 23, 2014 Tommi
Mar 25, 2014 Walter Bright
Mar 23, 2014 Marco Leise
Mar 23, 2014 w0rp
Mar 23, 2014 w0rp
Mar 23, 2014 Szymon Gatner
Mar 24, 2014 w0rp
Mar 25, 2014 Walter Bright
Mar 23, 2014 Szymon Gatner
Mar 23, 2014 Szymon Gatner
Mar 24, 2014 Steven Schveighoffer
Mar 23, 2014 Joseph Rushton Wakeling
Mar 23, 2014 Timon Gehr
Mar 24, 2014 monarch_dodra
Mar 23, 2014 Tommi
Mar 24, 2014 Steven Schveighoffer
Mar 24, 2014 Dicebot
Mar 24, 2014 Joseph Rushton Wakeling
Mar 25, 2014 Dicebot
Mar 25, 2014 Andrei Alexandrescu
Mar 25, 2014 Dicebot
Mar 25, 2014 Chris
Mar 25, 2014 Walter Bright
Mar 25, 2014 monarch_dodra
Mar 25, 2014 Walter Bright
Mar 25, 2014 Andrei Alexandrescu
Mar 25, 2014 Andrei Alexandrescu
Mar 25, 2014 Andrei Alexandrescu
Mar 25, 2014 Timon Gehr
Mar 25, 2014 monarch_dodra
Mar 25, 2014 Walter Bright
Mar 26, 2014 Regan Heath
Mar 26, 2014 Steven Schveighoffer
Mar 26, 2014 Steven Schveighoffer
Mar 26, 2014 Regan Heath
Mar 26, 2014 Steven Schveighoffer
Mar 26, 2014 monarch_dodra
Mar 26, 2014 Regan Heath
Mar 26, 2014 monarch_dodra
Mar 26, 2014 Regan Heath
Mar 27, 2014 Daniel Murphy
Mar 27, 2014 Regan Heath
Mar 26, 2014 Regan Heath
Mar 26, 2014 Andrei Alexandrescu
Mar 26, 2014 Ola Fosheim Grøstad
Mar 27, 2014 Paulo Pinto
Mar 27, 2014 Ola Fosheim Grøstad
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Walter Bright
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Walter Bright
Mar 27, 2014 Marc Schütz
Mar 27, 2014 Regan Heath
Mar 27, 2014 Andrei Alexandrescu
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Andrei Alexandrescu
Mar 27, 2014 monarch_dodra
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Andrei Alexandrescu
Mar 27, 2014 Walter Bright
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Walter Bright
Mar 27, 2014 Steven Schveighoffer
Mar 28, 2014 Walter Bright
Mar 28, 2014 Steven Schveighoffer
Mar 29, 2014 Andrei Alexandrescu
Mar 29, 2014 Steven Schveighoffer
Mar 29, 2014 monarch_dodra
Mar 29, 2014 monarch_dodra
Mar 30, 2014 monarch_dodra
Mar 29, 2014 Walter Bright
Mar 30, 2014 Marco Leise
Mar 31, 2014 Steven Schveighoffer
Mar 27, 2014 Andrei Alexandrescu
Mar 28, 2014 Walter Bright
Mar 28, 2014 Johannes Pfau
Mar 28, 2014 Walter Bright
Mar 28, 2014 Dmitry Olshansky
Mar 28, 2014 Walter Bright
Mar 28, 2014 Dmitry Olshansky
Mar 28, 2014 Walter Bright
Mar 28, 2014 Dmitry Olshansky
Mar 28, 2014 Walter Bright
Mar 29, 2014 w0rp
Mar 29, 2014 Dmitry Olshansky
Mar 29, 2014 Andrei Alexandrescu
Mar 29, 2014 Dmitry Olshansky
Mar 28, 2014 Johannes Pfau
Mar 28, 2014 w0rp
Mar 28, 2014 Johannes Pfau
Mar 28, 2014 QAston
Mar 28, 2014 Dmitry Olshansky
Mar 27, 2014 QAston
Mar 28, 2014 Walter Bright
Mar 28, 2014 Paolo Invernizzi
Mar 28, 2014 Regan Heath
Mar 28, 2014 Paolo Invernizzi
Mar 28, 2014 John Stahara
Mar 28, 2014 Regan Heath
Mar 28, 2014 H. S. Teoh
Mar 28, 2014 Paolo Invernizzi
Mar 27, 2014 Andrei Alexandrescu
Mar 27, 2014 monarch_dodra
Mar 28, 2014 RivenTheMage
Mar 27, 2014 Walter Bright
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Ola Fosheim Grøstad
Mar 27, 2014 Walter Bright
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Andrei Alexandrescu
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Andrei Alexandrescu
Mar 27, 2014 monarch_dodra
Mar 27, 2014 Andrei Alexandrescu
Mar 27, 2014 monarch_dodra
Mar 27, 2014 Walter Bright
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Walter Bright
Mar 27, 2014 Steven Schveighoffer
Mar 27, 2014 Walter Bright
Mar 29, 2014 Steven Schveighoffer
Mar 28, 2014 Marc Schütz
Mar 29, 2014 Steven Schveighoffer
Mar 29, 2014 Marc Schütz
Mar 31, 2014 Steven Schveighoffer
Mar 31, 2014 Marc Schütz
Mar 31, 2014 Steven Schveighoffer
Mar 27, 2014 Timon Gehr
Mar 27, 2014 Timon Gehr
Mar 28, 2014 Walter Bright
Mar 28, 2014 H. S. Teoh
Mar 28, 2014 monarch_dodra
Mar 27, 2014 Rainer Schuetze
Mar 27, 2014 Rainer Schuetze
Mar 27, 2014 Walter Bright
Mar 27, 2014 Rainer Schuetze
Mar 27, 2014 Walter Bright
Mar 28, 2014 Tobias Müller
Mar 28, 2014 Szymon Gatner
Mar 29, 2014 w0rp
Mar 30, 2014 Oscar Martín
Mar 27, 2014 Walter Bright
Mar 27, 2014 Rainer Schuetze
Mar 27, 2014 Walter Bright
Mar 27, 2014 Regan Heath
Mar 27, 2014 Jonathan M Davis
Mar 27, 2014 Walter Bright
Mar 27, 2014 Luís Marques
Mar 30, 2014 Daniel Murphy
Mar 29, 2014 Peter Alexander

March 23, 2014

protocol for using InputRanges

Posted by Walter Bright

Walter Bright

It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:

    while (!r.empty) {
        auto e = r.front;
        ... do something with e ...
        r.popFront();
    }

no argument there. But there are two issues:

1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?

If this is true, extra logic will need to be added to r.front in many cases.

2. Can r.front be called n times in a row? I.e. is calling front() destructive?

If true, this means that r.front will have to cache a copy in many cases.

March 23, 2014

Re: protocol for using InputRanges

Posted by Jakob Ovrum
in reply to Walter Bright

Jakob Ovrum

Posted in reply to Walter Bright

On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.

If `front` was destructive there would be little point in having it separate from `popFront`. I think it must be non-destructive to make sense.

March 23, 2014

Re: protocol for using InputRanges

Posted by Adam D. Ruppe
in reply to Walter Bright

Adam D. Ruppe

Posted in reply to Walter Bright

On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
>         auto e = r.front;

Remember that front is a getter property, which means it should work like a variable. Typically, reading a variable is not destructive and needs no preparation. There's exceptions to the rule, but they almost always work this way, so ranges should too.

> If true, this means that r.front will have to cache a copy in many cases.

Indeed, in many of my quick input ranges, I just make front a regular variable and popFront updates it to the next item.

March 23, 2014

Re: protocol for using InputRanges

Posted by Rikki Cattermole
in reply to Walter Bright

Rikki Cattermole

Posted in reply to Walter Bright

On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
> It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:
>
>     while (!r.empty) {
>         auto e = r.front;
>         ... do something with e ...
>         r.popFront();
>     }
>
> no argument there. But there are two issues:
>
> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
>
> If this is true, extra logic will need to be added to r.front in many cases.
>
> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.

It would be nice to have ranges full stop described. And how to use/make them in D. I have yet to learn them because of no official documentation on them.
On this topic we also need to work on common patterns and creating documentation on e.g. the site or wiki for it. Saw that we do have a bit in the wiki under tutorials. Maybe if I get some time I'll work on that.

March 23, 2014

Re: protocol for using InputRanges

Posted by Jonathan M Davis
in reply to Walter Bright

Jonathan M Davis

Posted in reply to Walter Bright

On Saturday, March 22, 2014 17:50:34 Walter Bright wrote:
> It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:
> 
>      while (!r.empty) {
>          auto e = r.front;
>          ... do something with e ...
>          r.popFront();
>      }
> 
> no argument there. But there are two issues:
> 
> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
> 
> If this is true, extra logic will need to be added to r.front in many cases.

You definitely don't have to call empty before calling front if you know that it's not empty. Both front and empty should normally be pure (or at least act that way) and essentially act like variables. In most cases, it works best for the work of the range to go in popFront. The exception is when you're dealing with a random-access range, since then any element could be accessed, making it so that you can't be doing the work in popFront. I think that we have a general agreement on this based on previous discussions, though it's certainly not unanimous.

> 2. Can r.front be called n times in a row? I.e. is calling front()
> destructive?
> 
> If true, this means that r.front will have to cache a copy in many cases.

If calling front were destructive, that would break a lot of code. It's probably true that most range-based code should avoid calling front multiple times (in case front has to do more work than just return the value as well as to avoid copying the result if that happens on every call), though if front is auto ref, it could be more efficient to call it multiple times. So, it's not entirely clear-cut.

But again, front and empty should normally function as if they were variables. They should be property functions and should be pure (or at least act like they're pure). I'm sure that a _lot_ of code will break if that isn't followed.

There are corner cases which can get a bit mucky though - e.g.

auto a = map!(to!string)(range);

In this case, front is pure, but it returns a new value each time (albeit a value that's equal each time until popFront is called). And there's no real way to fix that if the resulting range is random access (though if it weren't, the work could go in popFront, which _would_ make it so that front always returned the same result).

And there have been arguments over whether the result of front should be valid after popFront has been called (i.e. whether it's transient or not). A lot of code assumes that it will be, but we have some nasty exceptions (e.g. std.stdio.ByLine) - typically because front's a buffer which gets reused. IIRC, in those cases, Andrei favored saying that input ranges that weren't forward ranges could have a transient front but that forward ranges couldn't (which I tend to agree with, though I'd prefer that _no_ ranges have transient fronts, since it can really cause problems - e.g. std.array.array not working). I don't think that a consensus was reached on that though, since a few folks really liked using transient fronts with more complicated ranges.

In general though, I think that most of us would agree that front and empty should be treated as properties - i.e. as if they were variables - and that they should have try to stick to those semantics as closely as possible. Ranges that stray from that seriously risk not working with a lot of range- based code.

- Jonathan M Davis

March 23, 2014

Re: protocol for using InputRanges

Posted by Marco Leise
in reply to Walter Bright

Marco Leise

Posted in reply to Walter Bright

Am Sat, 22 Mar 2014 17:50:34 -0700
schrieb Walter Bright <newshound2@digitalmars.com>:

> It's become clear to me that we've underspecified what an InputRange is. The normal way to use it is:
> 
>      while (!r.empty) {
>          auto e = r.front;
>          ... do something with e ...
>          r.popFront();
>      }
> 
> no argument there. But there are two issues:
> 
> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?
> 
> If this is true, extra logic will need to be added to r.front in many cases.

Looking at ranges as a user with all the subjectivity it entails, I'd expect .empty to be a read-only property, a @safe-const-pure-nothrow candidate.

Actions that are not logically const are commonly encapsulated in methods (as opposed to read-only properties like .empty) and often also carry a verb in their name like "create", "update" or "append".

> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
> 
> If true, this means that r.front will have to cache a copy in many cases.

On the other side of it were (IIRC) there are use cases for
return-by-ref on .front:

o .front gives access to a large struct. Return-by-ref can
  avoid a copy on the call site, but may result in
  several .front calls.
o .front is a view into some struct which holds system
  resources and providing a copy requires internals as in
  the File struct.

I both cases you deal with a range over value types that are costly to copy and where "auto e = r.front;" is to be avoided.

I'm 100% unsure how to deal with this.

-- 
Marco

March 23, 2014

Re: protocol for using InputRanges

Posted by w0rp
in reply to Walter Bright

w0rp

Posted in reply to Walter Bright

I understand it like this.

* empty - Are there no more values?
* front - Get me the current value.
* popFront - Advance to the next value.

In terms of how I implement an InputRange in general, I typically end up with this.

* empty - Advance and cache "current value," return true if we ran out of values.
* front - enforce(!empty), which in turn caches the current value, which we then return.
* popFront - enforce(!empty), clear the cached value so we can later advance.

So .front gives you the same thing again and again until you call popFront, you could indeed call .front before .empty, but you may get an exception. This obviously isn't how I implement all InputRanges, as there are better ways to write other ranges, such as ranges which operate on sets of integers or wrap arrays. This is just my last resort general case InputRange implementation.

.front assignment obviously replaces the cached value.

March 23, 2014

Re: protocol for using InputRanges

Posted by Szymon Gatner
in reply to Walter Bright

Szymon Gatner

Posted in reply to Walter Bright

On Sunday, 23 March 2014 at 00:50:34 UTC, Walter Bright wrote:
>
> 1. If you know the range is not empty, is it allowed to call r.front without calling r.empty first?

IMO: yes. Logic of empty() sould be const and not have side effects.

>
> If this is true, extra logic will need to be added to r.front in many cases.
>
> 2. Can r.front be called n times in a row? I.e. is calling front() destructive?
>
> If true, this means that r.front will have to cache a copy in many cases.

Yes, all true. Not sure if there is something like that in Phobos but if for example you had RandomValueRange, ever call to front() should return the same random number until popFront() is called at which point internal front variable is being recalculated and cached.

Since only popFront() is non-const, it is there where all the magic/mutation should happen.

In my code I also have CachingRange which calls front() on underlying range the first time and then stores result internally. I use it for ranges which generate front() on the fly and I knot it is expensive. I could cache that calculation directly in that underlying range but CachingRange is reusable.

March 23, 2014

Re: protocol for using InputRanges

Posted by w0rp
in reply to w0rp

w0rp

Posted in reply to w0rp

I should add that I have implemented some ranges where .front and .popFront are both nothrow, as !empty doesn't "advance and cache" for these ranges and the check is moved into an in{} contract. For these ranges, they tend to behave like arrays with bounds checking, only now the bounds checking is turned off by virtue of -release.

March 23, 2014

Re: protocol for using InputRanges

Posted by Szymon Gatner
in reply to w0rp

Szymon Gatner

Posted in reply to w0rp

On Sunday, 23 March 2014 at 09:26:53 UTC, w0rp wrote:
> I understand it like this.
>
> * empty - Are there no more values?
> * front - Get me the current value.
> * popFront - Advance to the next value.

That is correct.

>
> In terms of how I implement an InputRange in general, I typically end up with this.
>
> * empty - Advance and cache "current value," return true if we ran out of values.
> * front - enforce(!empty), which in turn caches the current value, which we then return.
> * popFront - enforce(!empty), clear the cached value so we can later advance.
>
> So .front gives you the same thing again and again until you call popFront, you could indeed call .front before .empty, but you may get an exception. This obviously isn't how I implement all InputRanges, as there are better ways to write other ranges, such as ranges which operate on sets of integers or wrap arrays. This is just my last resort general case InputRange implementation.
>
> .front assignment obviously replaces the cached value.

That is not consistent with the first part of your reply and is incorrect imho. Calling empty() twice should not destroy anything, even according to your understanding. Neither should front(). User should be able to call them as many times as he wishes getting same value every time. Only popFront() mutate the range.

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation