View mode: basic / threaded / horizontal-split · Log in · Help
July 03, 2012
Re: Proposal: takeFront and takeBack
On Tuesday, July 03, 2012 19:29:44 Tobias Pankrath wrote:
> consumeFront?

That seems like a good suggestion.

- Jonathan M Davis
July 03, 2012
Re: Proposal: takeFront and takeBack
On Tuesday, 3 July 2012 at 17:40:24 UTC, Jonathan M Davis wrote:
> On Tuesday, July 03, 2012 17:31:21 Christophe Travert wrote:
>> takeFront implementation is dangerous for ranges which 
>> invalidates their
>> front value when popFront is called, for instance, 
>> File.byLine. Thus
>> takeFront will have to be used with care: any range implement 
>> takeFront
>> (because of the template and USFC), but it may not be valid. 
>> That makes
>> the range interface more complicated: There is a takeFront 
>> property, but
>> you have to check it is safe to use... how do you check that 
>> by the way?
>
> Hmm. I hadn't thought of that. That could be a good reason not 
> to do this. I'm
> not quite sure how to get around that. Hmmm... It would 
> arguably be a bit
> ugly, but a range which couldn't safely be used with takeFront 
> could have an
> enum on it which indicated that, and takeFront would fail to 
> compile if used
> with such a range. Of course, that would mean that you'd have 
> to special case
> such ranges in any range-based function which used takeFront so 
> that there was
> a branch which didn't use takeFront for ranges which couldn't 
> use it.

Just an idea (feels dirty, but still): such range could redefine 
takeFront, and maintain state. takeFront would defer a call to 
popFront until the next operation. All of front, popFront, 
takeFront and empty would check state (whether popFront has been 
deferred), and execute popFront first. Side effect would be that 
a call to empty would invalidate result of takeFront, but that 
could be changed also: empty instead of calling popFront would 
check whether there is anything else after the front element.

This is much more complicated, but would only be necessary for 
such containers.
July 03, 2012
Re: Proposal: takeFront and takeBack
On Tuesday, July 03, 2012 10:40:07 Jonathan M Davis wrote:
> On Tuesday, July 03, 2012 17:31:21 Christophe Travert wrote:
> > takeFront implementation is dangerous for ranges which invalidates their
> > front value when popFront is called, for instance, File.byLine. Thus
> > takeFront will have to be used with care: any range implement takeFront
> > (because of the template and USFC), but it may not be valid. That makes
> > the range interface more complicated: There is a takeFront property, but
> > you have to check it is safe to use... how do you check that by the way?
> 
> Hmm. I hadn't thought of that. That could be a good reason not to do this.
> I'm not quite sure how to get around that. Hmmm... It would arguably be a
> bit ugly, but a range which couldn't safely be used with takeFront could
> have an enum on it which indicated that, and takeFront would fail to
> compile if used with such a range. Of course, that would mean that you'd
> have to special case such ranges in any range-based function which used
> takeFront so that there was a branch which didn't use takeFront for ranges
> which couldn't use it.
> 
> I don't know. It's certainly something to think about. We may need a
> different solution.

An alternative would be to make it so that takeFront (or consumeFront, as 
someone suggested, since that's a better name) would be defined only for ranges 
which it helps. So, similar to hasSlicing, you get something like hasConsume, 
and then a range-based function which wants to use consumeFront or consumeBack 
checks hasConsume!R and uses it on that particular static if branch if it does 
and uses another branch with front and popFront if it doesn't. 
std.range.consumeFront then works only with strings (and maybe arrays).

However, it seemed to me that a major upside of consumeFront as I proposed it 
was that you could always just use it, and it would do the most efficient thing, 
so you _didn't_ have to special case for it, whereas with hasConsume, you 
would. But if you can't use consumeFront safely on all ranges, then that 
doesn't really work. It's still worth something but not as much.

You can already special case strings (and in many cases need to). What 
consumeFront does is make it so that functions operating on wrappers around 
strings (i.e. the results of functions like filter or map) can operate on them 
more efficiently as well. It also makes it so that user-defined ranges can get 
that benefit from algorithms that know nothing about those ranges and therefore 
don't special case them (e.g. std.algorithm functions can then be more efficient 
for user-defined ranges which can define a more efficient consumeFront). So, even 
if we had hasConsume and functions had to special case in order to use 
consumeFront or consumeBack, it would still be beneficial. However, it would be 
nice to have a solution which _didn't_ require special casing.

- Jonathan M Davis
July 03, 2012
Re: Proposal: takeFront and takeBack
On Tuesday, 3 July 2012 at 17:40:24 UTC, Jonathan M Davis wrote:
> On Tuesday, July 03, 2012 17:31:21 Christophe Travert wrote:
>> takeFront implementation is dangerous for ranges which 
>> invalidates their
>> front value when popFront is called, for instance, 
>> File.byLine. Thus
>> takeFront will have to be used with care: any range implement 
>> takeFront
>> (because of the template and USFC), but it may not be valid. 
>> That makes
>> the range interface more complicated: There is a takeFront 
>> property, but
>> you have to check it is safe to use... how do you check that 
>> by the way?
>
> Hmm. I hadn't thought of that. That could be a good reason not 
> to do this. I'm
> not quite sure how to get around that. Hmmm... It would 
> arguably be a bit
> ugly, but a range which couldn't safely be used with takeFront 
> could have an
> enum on it which indicated that, and takeFront would fail to 
> compile if used
> with such a range. Of course, that would mean that you'd have 
> to special case
> such ranges in any range-based function which used takeFront so 
> that there was
> a branch which didn't use takeFront for ranges which couldn't 
> use it.
>
> I don't know. It's certainly something to think about. We may 
> need a different
> solution.

An alternative is not to define a free function. So if 
consumeFront is defined, it may be used by the algorithm, 
otherwise simply call to front and popFront in the algorithm. If 
branching is necessary, then the is no value in a free function 
(it wouldn't optimize for speed any way).
July 03, 2012
Re: Proposal: takeFront and takeBack
On Tuesday, 3 July 2012 at 18:00:55 UTC, Jonathan M Davis wrote:
> On Tuesday, July 03, 2012 10:40:07 Jonathan M Davis wrote:
>> On Tuesday, July 03, 2012 17:31:21 Christophe Travert wrote:
>> > takeFront implementation is dangerous for ranges which 
>> > invalidates their
>> > front value when popFront is called, for instance, 
>> > File.byLine. Thus
>> > takeFront will have to be used with care: any range 
>> > implement takeFront
>> > (because of the template and USFC), but it may not be valid. 
>> > That makes
>> > the range interface more complicated: There is a takeFront 
>> > property, but
>> > you have to check it is safe to use... how do you check that 
>> > by the way?
>> 
>> Hmm. I hadn't thought of that. That could be a good reason not 
>> to do this.
>> I'm not quite sure how to get around that. Hmmm... It would 
>> arguably be a
>> bit ugly, but a range which couldn't safely be used with 
>> takeFront could
>> have an enum on it which indicated that, and takeFront would 
>> fail to
>> compile if used with such a range. Of course, that would mean 
>> that you'd
>> have to special case such ranges in any range-based function 
>> which used
>> takeFront so that there was a branch which didn't use 
>> takeFront for ranges
>> which couldn't use it.
>> 
>> I don't know. It's certainly something to think about. We may 
>> need a
>> different solution.
>
> An alternative would be to make it so that takeFront (or 
> consumeFront, as
> someone suggested, since that's a better name) would be defined 
> only for ranges
> which it helps. So, similar to hasSlicing, you get something 
> like hasConsume,
> and then a range-based function which wants to use consumeFront 
> or consumeBack
> checks hasConsume!R and uses it on that particular static if 
> branch if it does
> and uses another branch with front and popFront if it doesn't.
> std.range.consumeFront then works only with strings (and maybe 
> arrays).
>
> However, it seemed to me that a major upside of consumeFront as 
> I proposed it
> was that you could always just use it, and it would do the most 
> efficient thing,
> so you _didn't_ have to special case for it, whereas with 
> hasConsume, you
> would. But if you can't use consumeFront safely on all ranges, 
> then that
> doesn't really work. It's still worth something but not as much.
>
> You can already special case strings (and in many cases need 
> to). What
> consumeFront does is make it so that functions operating on 
> wrappers around
> strings (i.e. the results of functions like filter or map) can 
> operate on them
> more efficiently as well. It also makes it so that user-defined 
> ranges can get
> that benefit from algorithms that know nothing about those 
> ranges and therefore
> don't special case them (e.g. std.algorithm functions can then 
> be more efficient
> for user-defined ranges which can define a more efficient 
> consumeFront). So, even
> if we had hasConsume and functions had to special case in order 
> to use
> consumeFront or consumeBack, it would still be beneficial. 
> However, it would be
> nice to have a solution which _didn't_ require special casing.
>
> - Jonathan M Davis

You bet me by 19 seconds and a better explanation :)

So far we have two options: cleaner code because there would be 
no need to hack containers which invalidate their front value on 
a call to popFront, or a hack which I proposed above, but 
localized in such containers. By the way, some of such containers 
could potentially defer invalidating front till the next call to 
front.
July 03, 2012
Re: Proposal: takeFront and takeBack
Range have been designed with the idea that front is valid until next 
call to popFront. If popFront was to be called right after front, then 
it would just be a popFront that returns a value, and maybe a justPop or 
something if you don't want to copy the value.

It's delicate to come now and decide that if a previously returned front 
value is no longer valid after a call to popFront, it should be 
documented by an enum. Who is going to review all libraries (written and 
to come) to make sure the enum is placed when it should be? The reverse 
should be done instead: if you want you range to be optimized by calling 
takeFront, define something (for example... takeFront). Then use 
algorithms that are specialized for takeFront.
July 03, 2012
Re: Proposal: takeFront and takeBack
On Tuesday, 3 July 2012 at 18:12:54 UTC, 
travert@phare.normalesup.org (Christophe Travert) wrote:
>
> Range have been designed with the idea that front is valid 
> until next
> call to popFront. If popFront was to be called right after 
> front, then
> it would just be a popFront that returns a value, and maybe a 
> justPop or
> something if you don't want to copy the value.
>
> It's delicate to come now and decide that if a previously 
> returned front
> value is no longer valid after a call to popFront, it should be
> documented by an enum. Who is going to review all libraries 
> (written and
> to come) to make sure the enum is placed when it should be? The 
> reverse
> should be done instead: if you want you range to be optimized 
> by calling
> takeFront, define something (for example... takeFront). Then use
> algorithms that are specialized for takeFront.

takeFront -> consumeFront

hasConsume has been proposed by Jonathan already.

But not having to modify all existing ranges which invalidate 
value returned by front might be a good argument against my hack 
also.

However I'm not sure there are really many of such ranges. Those 
must return something with reference semantics in order to be 
able to invalidate it. Most common case would be when front value 
is always stored in the same memory (a buffer).

The benefit of my hacking idea is that clients would need no 
special casing.
July 03, 2012
Re: Proposal: takeFront and takeBack
Still, ranges which invalidate front on popFront could defer that 
(store a flag that popFront or consumeFront has been called and 
invalidate value previously returned from front in the next call 
to front). That would be intrusive with respect to such ranges, 
and a breaking change. But it would simplify client code 
significantly.

(This doesn't mean that I'm insisting, or strongly prefer some of 
my two ideas, just wanted to clearly restate the second one.)
July 03, 2012
Re: Proposal: takeFront and takeBack
On 03-Jul-12 20:37, Jonathan M Davis wrote:
> This seems like it probably merits a bit of discussion, so I'm bringing it up
> here rather than simply opening a pull request.
>
> At present, for some ranges (variably-lengthed ranges such as strings in
> particular), calling front incurs a cost which popFront at least partially
> duplicates. So, the range primitives are inherently inefficient in that they
> force you to incur that extra cost as you iterate over the range. Ideally,
> there would be a way to get front and pop it off at the same time so that you
> incur the cost only once (either that or have front cache its result in a way
> that lets it avoid the extra cost in popFront when popFront is called - though
> that wouldn't work with strings, since for them, the range primitives are free
> functions). So, I'm proposing takeFront and takeBack:
>

I was about to propose fetchFront/fetchBack with similar semantics. 
Thanks for pushing this proposal it looks good.

My initial intent however was to add Variable Length Range as a concept. 
... Now I think binary predicate hasFetch a-la hasSlicing is better.

> https://github.com/jmdavis/phobos/commit/5bfa8126fa14a539fee67807821ec0a10503f27b
>
> For most ranges, takeFront does this:
>
> auto takeFront(R)(ref R range)
>      if(isInputRange!R && !isNarrowString!R)
> {
>      assert(!range.empty);
>      auto retval = range.front;
>      range.popFront();
>      return retval;
> }
>

Aye. Yet there is indeed a problem with pseudo-ranges that reuse 'slot' 
for front on each popFront. Well they always been brittle.

>
> So, for strings, it'll be more efficient to use takeFront than calling front and
> popFront separately. The idea then is that any user-defined range which can
> implement takeFront more efficiently than the default will define it. Then range-
> based functions use takeFront - e.g. range.takeFront() - and if the user-
> defined range implements a more efficient version, that one is used and they gain
> the extra efficiency, or if they don't, then the free function is used with
> UFCS, and they incur the same cost that they'd incur calling front and
> popFront separately. So, it's invisible to range-based functions whether a
> range actually implements takeFront itself. takeBack does the same thing as
> takeFront but it does it with back and popBack for bidirectional ranges.
>
> I _think_ that this is a fairly clean solution to the problem, but someone
> else might be able to point out why this is a bad idea, or they might have a
> better idea. And this will have a definite impact on how ranges are normally
> used if we add this, so I'm bringing it up here for discussion. Opinions?
> Thoughts? Insights?
>
> Oh, and if we go with this, ideally, the compiler would be updated to use
> takeFront for foreach instead of front and popFront if a range implements it
> (but still do it the current way if it doesn't). So, if typeof(range)
> implements takeFront,

Right. In fact as we've seen above with e.g. stdin.byLine not every 
range can do fetchFront correctly so it's a strict subset. That was the 
reason for current trio of basic operations BTW.


-- 
Dmitry Olshansky
July 03, 2012
Re: Proposal: takeFront and takeBack
monarch_dodra:

> As far as I can recall, I've always been taught that pop does 
> NOT (should not) return a value. Rationale being it makes you 
> pay for a read/copy you may not have asked for.

I think it's also a matter of exception safety.


> That's the way C++ does it, and is what I've come to expect 
> from any language.

I expect a language to have a pop() in its collections, that 
returns the first item and removes it from the collection, as in 
Python. In D sometimes I put the front and popfront on the same 
line of code, because I think of them almost as a single 
operation :-)

Bye,
bearophile
1 2 3 4
Top | Discussion index | About this forum | D home