protocol for using InputRanges (page 5) - D Programming Language Discussion Forum



March 26, 2014

Re: protocol for using InputRanges

Posted by Steven Schveighoffer
in reply to Regan Heath


On Wed, 26 Mar 2014 11:09:04 -0400, Regan Heath <regan@netmail.co.nz> wrote:

> On Wed, 26 Mar 2014 12:30:53 -0000, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>
>> Gah, I didn't cut out the right rules. I meant the two rules that empty must be called before others. Those are not necessary.
>
> I see.  I was thinking we ought to make empty mandatory to give more guaranteed structure for range implementors, so lazy initialisation can be done in one place only, etc etc.

Yes, but when you know that empty is going to return false, there isn't any logical reason to call it. It is an awkward requirement.

I had the same thinking as you, why pay for an extra check for all 3 calls? But there was already evidence that people were avoiding empty.

-Steve

March 26, 2014

Re: protocol for using InputRanges

Posted by monarch_dodra
in reply to Steven Schveighoffer


On Wednesday, 26 March 2014 at 15:37:38 UTC, Steven Schveighoffer wrote:
> Yes, but when you know that empty is going to return false, there isn't any logical reason to call it. It is an awkward requirement.
>
> -Steve

Not only that, but it's also a performance criterion: if you are iterating on two ranges at once (think "copy"), then you *know* "range2" is longer than "range1", even if you don't know its length.

Why pay for "range2.empty", when you know it'll always be false? There is a noticeable performance difference if you *don't* check.
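
A minimal sketch of the situation described above, assuming a hypothetical helper named copyUnchecked (this is not Phobos' copy) whose documented precondition is that the target has room for the whole source. Only the source's empty is ever consulted:

```d
import std.range.primitives : empty, front, popFront, isInputRange;

/// Hypothetical helper, for illustration only (not std.algorithm's copy).
/// Precondition: `target` has room for every element of `source`.
void copyUnchecked(R1, R2)(R1 source, R2 target)
    if (isInputRange!R1 && isInputRange!R2)
{
    // Only `source.empty` is tested. By the precondition `target.empty`
    // would always be false here, so the loop never asks for it.
    while (!source.empty)
    {
        target.front = source.front;
        source.popFront();
        target.popFront();
    }
}

unittest
{
    int[] src = [1, 2, 3];
    auto dst = new int[](5);   // longer than src, as the precondition demands
    copyUnchecked(src, dst);
    assert(dst == [1, 2, 3, 0, 0]);
}
```

If the caller breaks the precondition, nothing in this loop notices on its own; any failure would come from whatever the target range does when used while empty.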

March 26, 2014

Re: protocol for using InputRanges

Posted by Regan Heath
in reply to Steven Schveighoffer


On Wed, 26 Mar 2014 15:37:38 -0000, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> On Wed, 26 Mar 2014 11:09:04 -0400, Regan Heath <regan@netmail.co.nz> wrote:
>
>> On Wed, 26 Mar 2014 12:30:53 -0000, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>>
>>> Gah, I didn't cut out the right rules. I meant the two rules that empty must be called before others. Those are not necessary.
>>
>> I see.  I was thinking we ought to make empty mandatory to give more guaranteed structure for range implementors, so lazy initialisation can be done in one place only, etc etc.
>
> Yes, but when you know that empty is going to return false, there isn't any logical reason to call it. It is an awkward requirement.

Sure, it's not required for some algorithms in some situations.

> I had the same thinking as you, why pay for an extra check for all 3 calls? But there was already evidence that people were avoiding empty.

Sure, as above, makes perfect sense.

It seemed from this thread that there was some confusion about how ranges should be written and used, and I thought it might help if the requirements were more fixed, that's all.

If r.empty was mandatory then every range implementer would have a place to lazily initialise, r.front would be simpler, r.popFront too.  Basically it would lower the bar for "good" range implementations.
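
A rough sketch of what that could look like, assuming the stricter protocol in which callers always call empty before front or popFront. The type LazyCounter and its fields are hypothetical names used only for illustration:

```d
/// Hypothetical range counting from lo up to (not including) hi.
/// Assumes callers always call `empty` before `front` or `popFront`.
struct LazyCounter
{
    private int lo, hi;
    private int current;   // meaningful only once `primed` is true
    private bool primed;

    this(int lo, int hi) { this.lo = lo; this.hi = hi; }

    @property bool empty()
    {
        // The only place that performs the lazy initialisation.
        if (!primed) { current = lo; primed = true; }
        return current >= hi;
    }

    // No initialisation check: the assumed protocol says empty ran first.
    @property int front() { return current; }

    void popFront() { ++current; }
}

unittest
{
    int[] seen;
    foreach (x; LazyCounter(3, 6))   // foreach calls empty before front
        seen ~= x;
    assert(seen == [3, 4, 5]);
}
```

Under this assumption front and popFront stay trivial, but a caller who skips the first empty would read an uninitialised position without any diagnostic.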

We might just need better documentation though.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

March 26, 2014

Re: protocol for using InputRanges

Posted by Regan Heath
in reply to monarch_dodra


On Wed, 26 Mar 2014 16:38:57 -0000, monarch_dodra <monarchdodra@gmail.com> wrote:

> On Wednesday, 26 March 2014 at 15:37:38 UTC, Steven Schveighoffer wrote:
>> Yes, but when you know that empty is going to return false, there isn't any logical reason to call it. It is an awkward requirement.
>>
>> -Steve
>
> Not only that, but it's also a performance criterion: if you are iterating on two ranges at once (think "copy"), then you *know* "range2" is longer than "range1", even if you don't know its length.

What guarantees range2 is longer than range1?  The isArray case checks explicitly, but the generic one doesn't.  Is it a property of being an output range that it will expand as required, or..

> Why pay for "range2.empty", when you know it'll always be false? There is a noticeable performance difference if you *don't* check.

But aren't you instead paying for 2 checks in front and 2 in popFront, so 4 checks vs 1?  Or is the argument that these 4 checks cannot be removed even if we mandate r.empty is called before r.front/popFront?

R


March 26, 2014

Re: protocol for using InputRanges

Posted by monarch_dodra
in reply to Regan Heath


On Wednesday, 26 March 2014 at 16:55:48 UTC, Regan Heath wrote:
> On Wed, 26 Mar 2014 16:38:57 -0000, monarch_dodra
>> Not only that, but it's also a performance criterion: if you are iterating on two ranges at once (think "copy"), then you *know* "range2" is longer than "range1", even if you don't know its length.
>
> What guarantees range2 is longer than range1?  The isArray case checks explicitly, but the generic one doesn't.  Is it a property of being an output range that it will expand as required, or..

The interface: The target *shall* have enough room to accommodate origin. Failure to meet that criterion is an Error.

Output ranges may or may not auto expand as required. Arguably, it's a design flaw I don't want to get started on.

>> Why pay for "range2.empty", when you know it'll always be false? There is a noticeable performance difference if you *don't* check.
>
> But aren't you instead paying for 2 checks in front and 2 in popFront, so 4 checks vs 1?  Or is the argument that these 4 checks cannot be removed even if we mandate r.empty is called before r.front/popFront?

I don't know what checks you are talking about. Most ranges don't check anything on front or popFront. They merely assume they are in a state where they can do their job. Failure to meet this condition prior to the call (note I didn't say "failure to check for") means Error.

It's really not much different from when doing an strcpy. "++p" and "*p" don't check anything. The fact that ranges *can* check doesn't always mean they should.
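
As a hedged illustration of "can check, but need not": a hypothetical slice wrapper named Items that states its preconditions with assert. Asserts like these are compiled out when building with -release, so the checks cost nothing there and the caller alone is responsible for the state:

```d
/// Hypothetical wrapper over a slice, for illustration only.
struct Items
{
    private const(int)[] data;

    @property bool empty() const { return data.length == 0; }

    @property int front() const
    {
        // States the precondition; removed when compiled with -release.
        assert(!empty, "front called on an empty range");
        return data[0];
    }

    void popFront()
    {
        assert(!empty, "popFront called on an empty range");
        data = data[1 .. $];
    }
}

unittest
{
    int total;
    foreach (x; Items([1, 2, 3]))
        total += x;
    assert(total == 6);
}
```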

March 26, 2014

Re: protocol for using InputRanges

Posted by Andrei Alexandrescu
in reply to Steven Schveighoffer


On 3/26/14, 8:37 AM, Steven Schveighoffer wrote:
> On Wed, 26 Mar 2014 11:09:04 -0400, Regan Heath <regan@netmail.co.nz>
> wrote:
>
>> On Wed, 26 Mar 2014 12:30:53 -0000, Steven Schveighoffer
>> <schveiguy@yahoo.com> wrote:
>>
>>> Gah, I didn't cut out the right rules. I meant the two rules that
>>> empty must be called before others. Those are not necessary.
>>
>> I see.  I was thinking we ought to make empty mandatory to give more
>> guaranteed structure for range implementors, so lazy initialisation
>> can be done in one place only, etc etc.
>
> Yes, but when you know that empty is going to return false, there isn't
> any logical reason to call it. It is an awkward requirement.
>
> I had the same thinking as you, why pay for an extra check for all 3
> calls? But there was already evidence that people were avoiding empty.

I think requiring users to call empty before front on input ranges is a concession we should make.

Andrei

March 26, 2014

Re: protocol for using InputRanges

Posted by Ola Fosheim Grøstad
in reply to Andrei Alexandrescu


On Wednesday, 26 March 2014 at 17:36:08 UTC, Andrei Alexandrescu wrote:
> I think requiring users to call empty before front on input ranges is a concession we should make.

Then the name should change to "ready". It makes sense to require the user to check that the range is "ready", but not to check that it is "not empty". This will also make more sense for async implementations that will block if "not ready".

IMO the whole interface needs rethinking if you want to gracefully support async data streams, where you need to distinguish between "ready" and "empty", and between "front" and "firstavailable". Quick-sort, merge-sort, filter and map all work well with async data streams.

March 26, 2014

Re: protocol for using InputRanges

Posted by Regan Heath
in reply to monarch_dodra


On Wed, 26 Mar 2014 17:32:30 -0000, monarch_dodra <monarchdodra@gmail.com> wrote:

> On Wednesday, 26 March 2014 at 16:55:48 UTC, Regan Heath wrote:
>> On Wed, 26 Mar 2014 16:38:57 -0000, monarch_dodra
>>> Not only that, but it's also a performance criterion: if you are iterating on two ranges at once (think "copy"), then you *know* "range2" is longer than "range1", even if you don't know its length.
>>
>> What guarantees range2 is longer than range1?  The isArray case checks explicitly, but the generic one doesn't.  Is it a property of being an output range that it will expand as required, or..
>
> The interface: The target *shall* have enough room to accommodate origin. Failure to meet that criterion is an Error.

Ok.  So long as *something* is throwing that Error I am down with this.

> Output ranges may or may not auto expand as required. Arguably, it's a design flaw I don't want to get started on.

:)

>>> Why pay for "range2.empty", when you know it'll always be false? There is a noticeable performance difference if you *don't* check.
>>
>> But aren't you instead paying for 2 checks in front and 2 in popFront, so 4 checks vs 1?  Or is the argument that these 4 checks cannot be removed even if we mandate r.empty is called before r.front/popFront?
>
> I don't know what checks you are talking about. Most ranges don't check anything on front or popFront. They merely assume they are in a state where they can do their job. Failure to meet this condition prior to the call (note I didn't say "failure to check for") means Error.

OK, but let's take a naive range of, say, int with a 1-element cache in the member variable "int cache;".  The simplest possible front would just be "return cache;".  But if cache hasn't been populated yet, it's not going to throw an Error; it's just going to be wrong.

So, presumably front has to check another boolean to verify it's been populated and throw an Error if not.  That's one of the checks I meant.  A typical loop over a range will call front one or more times, so you pay for that check 1 or more times per loop.

popFront in this example doesn't need to check anything, it just populates cache regardless, as does empty.

But, I imagine there are ranges which need some initial setup, and they have to do it somewhere, and they need to check they have done it in empty, front and popFront for every call.  It's those checks we'd like to avoid if we can.
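
To make the cost concrete, here is a sketch of such a range under the current, looser rules, where no call order is assumed. CachedSquares and its prime helper are hypothetical names; the point is only that every primitive has to pay for the "have I set up yet?" test:

```d
/// Hypothetical generator of squares with a one-element cache.
/// No call order is assumed, so every primitive primes the cache first.
struct CachedSquares
{
    private int index;     // position of the element held in `cache`
    private int count;     // total number of elements
    private int cache;     // valid only once `primed` is true
    private bool primed;

    this(int count) { this.count = count; }

    // The check each primitive pays for when call order is unspecified.
    private void prime()
    {
        if (!primed) { cache = index * index; primed = true; }
    }

    @property bool empty() { prime(); return index >= count; }

    @property int front() { prime(); return cache; }

    void popFront()
    {
        prime();
        ++index;
        cache = index * index;
    }
}

unittest
{
    auto r = CachedSquares(3);
    assert(r.front == 0);   // works without a prior call to empty
    r.popFront();
    assert(r.front == 1);
}
```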

So.. if we mandate that empty MUST be called, then they could just be done in one place, empty.

However, in this situation nothing would be enforcing that requirement (in release anyway) and things could just go wrong.  So, perhaps the checks always need to be there and we gain nothing by mandating empty is called first.

idunno.

> It's really not much different from when doing an strcpy. "++p" and "*p" don't check anything. The fact that ranges *can* check doesn't always mean they should.

Sure.  For performance reasons they might not, but isn't this just a tiny bit safer if we mandate empty must be called and do one check there?

R


March 27, 2014

Re: protocol for using InputRanges

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu


On Wed, 26 Mar 2014 13:36:08 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 3/26/14, 8:37 AM, Steven Schveighoffer wrote:
>> On Wed, 26 Mar 2014 11:09:04 -0400, Regan Heath <regan@netmail.co.nz>
>> wrote:
>>
>>> On Wed, 26 Mar 2014 12:30:53 -0000, Steven Schveighoffer
>>> <schveiguy@yahoo.com> wrote:
>>>
>>>> Gah, I didn't cut out the right rules. I meant the two rules that
>>>> empty must be called before others. Those are not necessary.
>>>
>>> I see.  I was thinking we ought to make empty mandatory to give more
>>> guaranteed structure for range implementors, so lazy initialisation
>>> can be done in one place only, etc etc.
>>
>> Yes, but when you know that empty is going to return false, there isn't
>> any logical reason to call it. It is an awkward requirement.
>>
>> I had the same thinking as you, why pay for an extra check for all 3
>> calls? But there was already evidence that people were avoiding empty.
>
> I think requiring users to call empty before front on input ranges is a concession we should make.

if(!r.empty)
{
   auto r2 = map!(x => x * 2)(r);
   do
   {
      auto x = r2.front;
      ...
   } while(!r2.empty);
}

Should we be required to re-verify that r2 is not empty before using it? It clearly is not, and would be an artificial requirement (one that map likely would not enforce!).

This sounds so much like a convention that simply won't be followed, and the result will be perfectly valid code.
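
For reference, a compilable variant of the loop above might look like the following. The function name doubled is hypothetical, and the popFront call is an assumption about what the elided loop body contains:

```d
import std.algorithm.iteration : map;
import std.range.primitives : empty;

/// Hypothetical helper: returns each element of `r` multiplied by two.
int[] doubled(int[] r)
{
    int[] result;
    if (!r.empty)
    {
        auto r2 = map!(x => x * 2)(r);
        // r2.empty is not consulted before the first front: r was just
        // tested, and map yields exactly one element per input element.
        do
        {
            result ~= r2.front;
            r2.popFront();   // assumed to be part of the elided loop body
        } while (!r2.empty);
    }
    return result;
}

unittest
{
    assert(doubled([1, 2, 3]) == [2, 4, 6]);
}
```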

-Steve

March 27, 2014

Re: protocol for using InputRanges

Posted by Daniel Murphy
in reply to Regan Heath


"Regan Heath"  wrote in message news:op.xdb9a9v354xghj@puck.auriga.bhead.co.uk...

> What guarantees range2 is longer than range1?  The isArray case checks explicitly, but the generic one doesn't.  Is it a property of being an output range that it will expand as required, or..

Some ranges will give you their length...
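
A small sketch of how an algorithm could use that when it is available, using Phobos' hasLength trait. The helper name mayFit is hypothetical, and the fallback of trusting the caller when a length is unknown is an assumption made for this example:

```d
import std.range.primitives : hasLength;

/// Hypothetical precondition check for a copy-like algorithm.
/// Returns false only when both lengths are known and target is too short.
bool mayFit(R1, R2)(R1 source, R2 target)
{
    static if (hasLength!R1 && hasLength!R2)
        return source.length <= target.length;
    else
        return true;   // lengths unknown: nothing to verify, trust the caller
}

unittest
{
    assert(mayFit([1, 2], [0, 0, 0]));
    assert(!mayFit([1, 2, 3], [0]));
}
```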


Copyright © 1999-2021 by the D Language Foundation