Proposal for SentinelInputRange (page 14) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Proposal for SentinelInputRange (page 14)

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Andrei Alexandrescu
in reply to Walter Bright

Andrei Alexandrescu

Posted in reply to Walter Bright

On 2/28/13 7:34 PM, Walter Bright wrote:
> On 2/28/2013 4:07 PM, Jonathan M Davis wrote:
>> I just don't see how you're going to get a
>> performance gain from much of anything other than strings.
>
> I gave you other examples already. We're just going around in circles.

We're talking about an abstraction that doesn't exist, exemplifying with code that doesn't use it. No wonder there's going to be miscommunication. The solution is to spin some code.

Andrei

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Walter Bright
in reply to H. S. Teoh

Walter Bright

Posted in reply to H. S. Teoh

On 2/28/2013 4:47 PM, H. S. Teoh wrote:
> On Thu, Feb 28, 2013 at 04:36:03PM -0800, Walter Bright wrote:
>> On 2/28/2013 3:14 PM, jerro wrote:
>>> ldc2 -O3 -release compiles it to:
>>
>> As I posted to Timon, clang (i.e. ldc2) and gcc do do this
>> optimization, dmc and vc do not.
>
> Couldn't dmc be made to do this optimization?

Yes, it could.

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Walter Bright
in reply to Andrei Alexandrescu

Walter Bright

Posted in reply to Andrei Alexandrescu

On 2/28/2013 5:29 PM, Andrei Alexandrescu wrote:
> On 2/28/13 7:34 PM, Walter Bright wrote:
>> On 2/28/2013 4:07 PM, Jonathan M Davis wrote:
>>> I just don't see how you're going to get a
>>> performance gain from much of anything other than strings.
>>
>> I gave you other examples already. We're just going around in circles.
>
> We're talking about an abstraction that doesn't exist, exemplifying with code
> that doesn't use it. No wonder there's going to be miscommunication. The
> solution is to spin some code.

std.regexp uses sentinels in the bytecode.

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Walter Bright
in reply to deadalnix

Walter Bright

Posted in reply to deadalnix

On 2/28/2013 5:00 PM, deadalnix wrote:
> On Friday, 1 March 2013 at 00:34:25 UTC, Walter Bright wrote:
>>> I just don't see how you're going to get a
>>> performance gain from much of anything other than strings.
>>
>> I gave you other examples already. We're just going around in circles.
>
> You didn't posted a single example that wasn't optimized by LLVM.

Search for [1] in lexer.c, i.e. lookahead cases, although these are less important.

I admit I was surprised that LLVM did that, though the other compilers did not. There are some lingering issues with ordering. The sentinel is going to be the rare case, and you'd often want to sort the cases so that the most likely ones are tested first (if it doesn't all go into a jump table indexed by the value).

> I do understand that some compiler may not do the optimization, but it is show
> already that this is clearly possible and in fact done. That is not an
> theoretical improvement.
>
> I don't see the point in creating a new type of range simply because some
> compiler don't optimize properly.

It's a good point.

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by deadalnix
in reply to Walter Bright

deadalnix

Posted in reply to Walter Bright

On Friday, 1 March 2013 at 01:46:34 UTC, Walter Bright wrote:
> On 2/28/2013 5:29 PM, Andrei Alexandrescu wrote:
>> On 2/28/13 7:34 PM, Walter Bright wrote:
>>> On 2/28/2013 4:07 PM, Jonathan M Davis wrote:
>>>> I just don't see how you're going to get a
>>>> performance gain from much of anything other than strings.
>>>
>>> I gave you other examples already. We're just going around in circles.
>>
>> We're talking about an abstraction that doesn't exist, exemplifying with code
>> that doesn't use it. No wonder there's going to be miscommunication. The
>> solution is to spin some code.
>
> std.regexp uses sentinels in the bytecode.

I think nobody says that this technic is useless. Simply that a good compiler can take advantage of it without introducing an higher level concept of sentinel input range.

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Dmitry Olshansky
in reply to Walter Bright

Dmitry Olshansky

Posted in reply to Walter Bright

01-Mar-2013 02:57, Walter Bright пишет:
> On 2/28/2013 10:39 AM, Dmitry Olshansky wrote:
>> That would mean that when you see 0 or 0x1A you do a check to see if
>> that's the
>> end of input then decide it's really the end of input.
>
> A 0 or a 0x1A is the end of valid D source. Period.

So I thought.

Well, I can't see the difference between 2 sentinel values: both 0 and 0x1A has to both work. For me that means both values have to be accepted as sentinels I'd call that a tuple of sentinels.

> But that doesn't mean the sentinel has to be a tuple.

Should I say you are getting cryptic?

-- 
Dmitry Olshansky

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Chris Nicholson-Sauls
in reply to deadalnix

Chris Nicholson-Sauls

Posted in reply to deadalnix

A use case that comes immediately to mind: a sentinal range that, yes wraps, an infinite (but predictable!) range, effectively allowing you to take a head-slice of the infinite range.

auto foo = infiniteRangeOfEvenNumbers();
auto upto1000 = GenericSentinalInputRange!1000( foo );

Yes, in this contrived example, one could use take(), but what about an infinite range that can be predicted to always hit certain points, but is not restricted to those points?


As for other cases of sentinal ranges wrapping others, in particular arrays, there seems to be no mention of what I expect would be the most common use case: wherein the code which instantiates the sentinal range can guarantee that the sentinal already exists in the source data/range, because it *put* it there -- the lexer remains a fine example, as it tacks the terminating \0 on.  There is no need for extra checks, because there is no need to *replace* the source data with the sentinel. It is naturally reduced to it through normal processing.


Other apparent use cases: list processing (Andrei has mentioned this already, for singly-linked lists); protocol processing with a terminate-transaction message, or similar; state sequences (which can include lexers and parsers, as it happens).  This is just off the top of my head.

Can we do it now, without a special type?  Yes.  Is there any gain from introducing the special type, versus what can be done now?  Yes, sometimes.  Is there any harm in introducing the special type?  No.  Is there significant cost in adding it?  No.  Should this just be handled by compiler optimization?  Yes and no -- yes in that a great compiler should be able to optimize *many* (but not neccessarily all!) cases; but no in that including implementation details in a language spec is usually bad mojo.  Of course, any compiler that could optimize the common cases, could at least as readily optimize the use of a sentinal range -- in fact, I'd expect it to open optimization paths for some of the corner cases that might otherwise not have been optimized.

So... I could live without a standard sentinal range concept (have so far, using sentinal injection with input ranges, which as Walter pointed out is really an incorrect use (abuse?) of input ranges), but I also know I'd be using it if it existed (and thereby cleaning up some code versus how I do it now).

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by deadalnix
in reply to Chris Nicholson-Sauls

deadalnix

Posted in reply to Chris Nicholson-Sauls

On Friday, 1 March 2013 at 06:19:19 UTC, Chris Nicholson-Sauls wrote:
> A use case that comes immediately to mind: a sentinal range that, yes wraps, an infinite (but predictable!) range, effectively allowing you to take a head-slice of the infinite range.
>
> auto foo = infiniteRangeOfEvenNumbers();
> auto upto1000 = GenericSentinalInputRange!1000( foo );
>

struct GenericSentinelRange(R, Sentinel) {
    R r;

    @property auto front() {
        return r.front;
    }

    void popFront() {
        r.popFront();
    }

    @property empty() {
        return r.front == Sentinel;
    }
}

We don't need a new type of range at all. You confuse legitimate uses case for using a sentinel to terminate a range and uses case where an actual sentinel range is needed.

> So... I could live without a standard sentinal range concept (have so far, using sentinal injection with input ranges, which as Walter pointed out is really an incorrect use (abuse?) of input ranges), but I also know I'd be using it if it existed (and thereby cleaning up some code versus how I do it now).

You can live without, and guess what : if you use LDC (and probably GDC) you'll get the performance boost Walter is talking about.

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Chris Nicholson-Sauls
in reply to deadalnix

Chris Nicholson-Sauls

Posted in reply to deadalnix

On Friday, 1 March 2013 at 06:33:42 UTC, deadalnix wrote:
>
> struct GenericSentinelRange(R, Sentinel) {
>     R r;
>
>     @property auto front() {
>         return r.front;
>     }
>
>     void popFront() {
>         r.popFront();
>     }
>
>     @property empty() {
>         return r.front == Sentinel;
>     }
> }

That's exactly what I wrote as well, when I played with the idea before posting.  Except I just called the sentinel constant S and added 'enum sentinel = S;' -- not significant extra work.

> We don't need a new type of range at all. You confuse legitimate uses case for using a sentinel to terminate a range and uses case where an actual sentinel range is needed.

I'm not confused. The use case for a sentinel range /type/ is always going to be the same, and singular: automating legitimate use cases for sentinels.  So, in effect, an argument for one is as good as for the other, when I'm only responding to the calls for use case examples. My mistake was in not making it clear that I was responding just to those, sorry.

> You can live without, and guess what : if you use LDC (and probably GDC) you'll get the performance boost Walter is talking about.

Which presupposes that I *can* use LDC (or GDC). For many people, it may not be an option.  And last time I tried (which has been months now, admittedly) I couldn't get LDC to work for me at all, which is a shame since I'm actually very much interested in it.  Either way, reliance on implementation details is not a good thing.

March 01, 2013

Re: Proposal for SentinelInputRange

Posted by Marco Leise
in reply to Walter Bright

Marco Leise

Posted in reply to Walter Bright

Am Thu, 28 Feb 2013 14:27:28 -0800
schrieb Walter Bright <newshound2@digitalmars.com>:

> Two corporate users of D have also told me that their motivating case to adopt D was - compile speed!

That's what I thought and why I keep telling that Carthago has to..., I mean that Phobos and the GC is not the ultimate performance platform. You just don't get fast + safe + convenient in one package, it's a compromise.

> Throwing compile speed under the bus is what other compiler vendors do, and we're not going to do that.

I certainly hope so :D. It's just amazing that you barely notice a delay when compiling a medium sized program. DMD practically competes with interpreted languages here in terms of turn-around time.

-- 
Marco

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation