range behaviour - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » range behaviour

Thread overview

range behaviour
May 13, 2014 Benjamin Thaut
May 13, 2014 Jonathan M Davis
May 13, 2014 Steven Schveighoffer
May 13, 2014 H. S. Teoh
May 13, 2014 Steven Schveighoffer
May 13, 2014 luka8088
May 13, 2014 Jonathan M Davis
May 13, 2014 H. S. Teoh
May 14, 2014 Ola Fosheim Grøstad
May 14, 2014 Marc Schütz
May 14, 2014 Ola Fosheim Grøstad
May 14, 2014 Marc Schütz
May 14, 2014 Ola Fosheim Grøstad
May 14, 2014 Dicebot
May 14, 2014 Ola Fosheim Grøstad
May 15, 2014 Dicebot
May 15, 2014 Ola Fosheim Grøstad
May 15, 2014 Dicebot
May 16, 2014 Ola Fosheim Grøstad
May 13, 2014 Jonathan M Davis
May 13, 2014 Dicebot

May 13, 2014

range behaviour

Posted by Benjamin Thaut

Benjamin Thaut

I know that there was a recent discussion about how the methods of ranges should behave.

E.g.

- Does empty always have to be called before calling front or popFront?
- Is it allowed to call front multiple times between two calls to popFront?

Was there a result of that discussion? Is it documented somewhere?

Kind Regards
Benjamin Thaut

May 13, 2014

Re: range behaviour

Posted by Jonathan M Davis
in reply to Benjamin Thaut

Jonathan M Davis

Posted in reply to Benjamin Thaut

On Tue, 13 May 2014 18:38:44 +0200
Benjamin Thaut via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> I know that there was a recent discussion about how the methods of ranges should behave.
>
> E.g.
>
> - Does empty always have to be called before calling front or popFront?

Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way).

> - Is it allowed to call front multiple times between two calls to popFront?

Definitely. _Lots_ of range-based code would break otherwise - though there
are casese where that can cause problems depending on what you rely on (e.g.
map!(a => to!string(a)) will return equal strings, but they aren't the _same_
string).

> Was there a result of that discussion? Is it documented somewhere?

AFAIK, there's just the semi-recent newsgroup discussion on the matter, though maybe someone put something up on the wiki.

- Jonathan M Davis

May 13, 2014

Re: range behaviour

Posted by Steven Schveighoffer
in reply to Jonathan M Davis

Steven Schveighoffer

Posted in reply to Jonathan M Davis

On Tue, 13 May 2014 12:58:09 -0400, Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Tue, 13 May 2014 18:38:44 +0200
> Benjamin Thaut via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
>> I know that there was a recent discussion about how the methods of
>> ranges should behave.
>>
>> E.g.
>>
>> - Does empty always have to be called before calling front or
>> popFront?
>
> Certainly, ranges are pretty much always used this way, but there was some
> debate as to whether empty could have work done in it (and thus _had_ to be
> called). However, I believe that the consensus was that yes, empty had to be
> called (certainly, both Walter and Andrei felt that way).

I don't agree there was a consensus. I think empty should not have to be called if it's already logically known that the range is not empty. The current documentation states that, and I don't think there was an agreement that we should change it, despite the arguments from Walter.

In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not.

Even if it was required, it would be an unenforceable policy, just like range.save.

-Steve

May 13, 2014

Re: range behaviour

Posted by H. S. Teoh

H. S. Teoh

On Tue, May 13, 2014 at 09:58:09AM -0700, Jonathan M Davis via Digitalmars-d wrote:
> On Tue, 13 May 2014 18:38:44 +0200
> Benjamin Thaut via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> 
> > I know that there was a recent discussion about how the methods of ranges should behave.
> >
> > E.g.
> >
> > - Does empty always have to be called before calling front or popFront?
> 
> Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way).
> 
> > - Is it allowed to call front multiple times between two calls to popFront?
> 
> Definitely. _Lots_ of range-based code would break otherwise - though
> there are casese where that can cause problems depending on what you
> rely on (e.g.  map!(a => to!string(a)) will return equal strings, but
> they aren't the _same_ string).
> 
> > Was there a result of that discussion? Is it documented somewhere?
> 
> AFAIK, there's just the semi-recent newsgroup discussion on the matter, though maybe someone put something up on the wiki.
[...]

I second all of the above.

In generic code (all range-based code is generic, since otherwise there's no point in using the range abstraction, you should just use the concrete type directly), you cannot assume anything about what .empty, .front, and .popFront() might do apart from what is specified in the range API.

Therefore, since .front and .popFront have unspecified behaviour if .empty is false, the only correct way to write range-based code is to call .empty first to determine whether it's OK to call .front and .popFront.

Furthermore, it doesn't make sense to restrict .front to be called only once, since otherwise we should simplify the API by merging .front and .popFront and have the same function both return the current element and advance to the next element. The fact that they were designed to be separate calls implies that .front can be called multiple times.

Of course, for efficiency purposes range-based code (esp. Phobos code) should try their best to only call .front once. But it should be perfectly permissible to call .front multiple times.

Lastly, since the range API is an *abstraction*, it should not dictate any concrete implementation details such as whether .empty can do non-trivial initialization work. Properly-written range-based code should be able to handle all possible implementations of the range API, including those that do non-trivial work in .empty.

Of course, that doesn't mean it's a *good* idea for .empty to do non-trivial work. The code is probably cleaner if that work is done somewhere else (like in the ctor or the function that returns the range). But that's an issue for the implementer of the range, not the consumers. The consumers of the range should not depend on one behaviour or the other, so that they will still work correctly when given a range that, for whatever reason, needs to do non-trivial work in .empty.

T

-- 
Prosperity breeds contempt, and poverty breeds consent. -- Suck.com

May 13, 2014

Re: range behaviour

Posted by H. S. Teoh
in reply to Steven Schveighoffer

H. S. Teoh

Posted in reply to Steven Schveighoffer

On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]
> I don't agree there was a consensus. I think empty should not have to be called if it's already logically known that the range is not empty.

I only partially agree with this, up to the point where .empty has been checked previously, such as:

	void f1() {
		void f2(R)(R r) if (isInputRange!R) {
			doSomething(r.front);
		}

		auto r = makeMeARange();
		if (!r.empty)
			f2(r);
		...
	}

Even in this case, I'd put an in-contract on f2 that verifies that the range is indeed non-empty:

	...
		void f2(R)(R r)
			if (isInputRange!R)
		in { assert(!r.empty); }
		body {
			doSomething(r.front);
		}

[...]
> In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not.
[...]

In my mind, *all* range-based code is generic. If you need to depend on something about your range beyond what the range API guarantees, then you should be using the concrete type rather than a template argument, which means that your code is no longer range-based code -- it's operating on a concrete type. (Of course, if the concrete type implements the range API, then there's nothing wrong with using range API methods, but one shouldn't be under the illusion that one is writing range-based code. It is type-dependent code that just happens to have range-based methods. If the code will break if the range is substituted by a different range that doesn't have the same guarantees (outside of what the range API specifies), then it's not range-based code.)

T

-- 
Many open minds should be closed for repairs. -- K5 user

May 13, 2014

Re: range behaviour

Posted by Jonathan M Davis
in reply to Steven Schveighoffer

Jonathan M Davis

Posted in reply to Steven Schveighoffer

On Tue, 13 May 2014 13:29:32 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d@puremagic.com>
wrote:

> On Tue, 13 May 2014 12:58:09 -0400, Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> > On Tue, 13 May 2014 18:38:44 +0200
> > Benjamin Thaut via Digitalmars-d <digitalmars-d@puremagic.com>
> > wrote:
> >
> >> I know that there was a recent discussion about how the methods of ranges should behave.
> >>
> >> E.g.
> >>
> >> - Does empty always have to be called before calling front or popFront?
> >
> > Certainly, ranges are pretty much always used this way, but there
> > was some
> > debate as to whether empty could have work done in it (and thus
> > _had_ to be
> > called). However, I believe that the consensus was that yes, empty
> > had to be
> > called (certainly, both Walter and Andrei felt that way).
>
> I don't agree there was a consensus. I think empty should not have to be called if it's already logically known that the range is not empty. The current documentation states that, and I don't think there was an agreement that we should change it, despite the arguments from Walter.
>
> In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not.
>
> Even if it was required, it would be an unenforceable policy, just like range.save.

Yeah, and they're both cases where the common case will work just fine if you do it wrong but which will break in less common cases, meaning that the odds are much higher that the bug won't be caught. :(

- Jonathan M Davis

May 13, 2014

Re: range behaviour

Posted by Jonathan M Davis

Jonathan M Davis

On Tue, 13 May 2014 10:30:47 -0700
"H. S. Teoh via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
> Of course, for efficiency purposes range-based code (esp. Phobos code) should try their best to only call .front once. But it should be perfectly permissible to call .front multiple times.

Oh, but that's part of the fun. If the range's front returns by ref or auto ref - or if it actually made front a public member variable - then it would be _less_ efficient to assign the result of front to a variable in order to avoid calling front multiple times. Much as ranges have a defined API, there's enough flexibility in the API and in how it's implemented for any given range that generalities about what is more or less efficient or which is more or less likely to avoid unintended behaviors isn't a sure thing - which is part of why that particular discussion was instigated in the first place and why discussions about stuff like whether front can be transitive or not keep popping up from time to time.

In general, it's probably better to avoid calling front multiple times though - the issue with map being a prime example of where the problem can be worse than simply an issue of efficiency - and in most cases, ranges don't have a front which returns by ref or auto ref. But unfortunately, because that's just _most_ cases, it does make it so that we can't make a blanket statement about what is more or less efficient.

- Jonathan M Davis

May 13, 2014

Re: range behaviour

Posted by Steven Schveighoffer
in reply to H. S. Teoh

Steven Schveighoffer

Posted in reply to H. S. Teoh

On Tue, 13 May 2014 13:40:55 -0400, H. S. Teoh via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> [...]

>> In any case, I think generic code for an unknown range type in an
>> unknown condition should have to call empty, since it cannot logically
>> prove that it's not.
> [...]
>
> In my mind, *all* range-based code is generic. If you need to depend on
> something about your range beyond what the range API guarantees, then
> you should be using the concrete type rather than a template argument,
> which means that your code is no longer range-based code -- it's
> operating on a concrete type.

You can be generic, but still know that empty is not necessary.

This is a potential use case (somewhat similar to yours):

void foo(R)(R r)
{
   if(!r.empty)
   {
      auto r2 = map!(x => x.bar)(r); // or some similar translation range
      // have to check r2.empty here? The "empty must be called" rules say yes.
   }
}

I will note, there are cases in phobos that don't check empty because it's provably not necessary.

The issue arose because of the filter range, which someone was trying to make fully lazy. The whole thrust of the argument is that we want to force people to call empty not because we want them to write generically safe code but because we want to *instrument* empty to do something other than check for emptiness. In other words, if we are guaranteed that empty will be called before anything else, we can add extra bits to empty for e.g. lazy initialization. Otherwise, we have to add the bits to all the primitives.

It also results in an awkward call when it's not strictly necessary:

r2.empty;

-Steve

May 13, 2014

Re: range behaviour

Posted by Dicebot
in reply to Benjamin Thaut

Dicebot

Posted in reply to Benjamin Thaut

On Tuesday, 13 May 2014 at 16:38:43 UTC, Benjamin Thaut wrote:
> I know that there was a recent discussion about how the methods of ranges should behave.
>
> E.g.
>
> - Does empty always have to be called before calling front or popFront?
> - Is it allowed to call front multiple times between two calls to popFront?
>
> Was there a result of that discussion? Is it documented somewhere?
>
> Kind Regards
> Benjamin Thaut

I don't remember any final decision made from that discussion.

May 13, 2014

Re: range behaviour

Posted by luka8088
in reply to H. S. Teoh

luka8088

Posted in reply to H. S. Teoh

On 13.5.2014. 19:40, H. S. Teoh via Digitalmars-d wrote:
> On Tue, May 13, 2014 at 01:29:32PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
> [...]
> Even in this case, I'd put an in-contract on f2 that verifies that the
> range is indeed non-empty:
> 
> 	...
> 		void f2(R)(R r)
> 			if (isInputRange!R)
> 		in { assert(!r.empty); }
> 		body {
> 			doSomething(r.front);
> 		}
> 
> [...]

This is a potential issue because if it turns out that empty _must_ be called than the author could put the front population logic inside empty.

Consider:

struct R {
  bool empty () { front = 1; return false; }
  int front = 0;
  void popFront () { front = 0; }
}

This is a valid code if empty _must_ be called, but it will behave differently if passed to f2 in case asserts are compiled out. In case asserts are compiled out empty is never called and front in never populated.

Because of this I think that it is necessary to document range behavior.

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation