std.v2020.algorithm etc[ WAS: Is run.d going to be expand for runtime and the phobos library?] (page 20)

On Sunday, 28 June 2020 at 15:07:21 UTC, H. S. Teoh wrote: > On Sun, Jun 28, 2020 at 02:00:42PM +0000, Stanislav Blinov via Digitalmars-d wrote: [...] >> What makes an input range distinct is that, unlike other kinds, it can only be consumed once - there's no need for any other assumptions. The current range API does not enforce this in any way, hence the copying conundrum and Jonathan's "don't use original after you make a copy" or other similar statements that amount to "you should know what you're doing". > > This is not a sufficient assumption, because there remains the question of what state a range should be in after you iterate over it partially, say, using a foreach loop: > > MyRange r = ...; > foreach (e; r) { > if (someCondition) break; > } > foreach (e; r) { > // Do we get the remaining elements of r here, or do we > // get all the elements from the beginning again? > } > Even if we grant that r is a forward range, the semantics of the above code is still underspecified. Of course it isn't, because there's a problem with `foreach`. It iterates over a copy, except if there's an opApply. I'm not sure what your argument is. I repeat: "The current range API does not enforce [input range behavior] in any way, hence the copying conundrum..." `foreach` is very much a part of range API. > And it's not just a matter of specifying that the semantics should be; there's also the question of whether we should specify anything at all, keeping in mind that the more specific the definition, the less room there is for efficient implementation of ranges (i.e., some optimizations may be excluded if there are too many semantic requirements on ranges). Yes, we should. For example, non-copyability of input ranges, which enables *more* efficient implementations (whereas expecting them to remain copyable dictates the opposite). > ...there is a great deal of advantage in using the subset of semantics in which input ranges *are* subsets of forward ranges, because this allows us to unify a lot of iteration logic. That's not based on range semantics, it's based solely on interface. An input range and a forward range can have the same interface for iteration, i.e. a `fetchNext` or its variant. The only difference is that you can't fork the former. > Otherwise, like you say below, every algorithm will need to be duplicated, once for input ranges, once for forward ranges, which will instantly double the size of Phobos algorithms. That's a gross exaggerration. Algorithms already do the semantic duplication, for arrays, for hasLvalueElements, etc., etc., ultimately, when everything else fails, arriving at the "while (!r.empty) r.popFront()".

On Sunday, 28 June 2020 at 15:33:48 UTC, Stanislav Blinov wrote: > On Sunday, 28 June 2020 at 14:12:39 UTC, Paul Backus wrote: > >> It sounds like what you are really trying to say here is that input ranges and forward ranges should not use the same interface, and that input ranges should instead implement only `Option!T next()`. > > Not exactly. A given forward range may also implement such a `next`. [...] We should differentiate by some other artifact. I maintain that artifact should be copyability. Or we can keep the distinction based on the presence of save(), and live forever with the ambiguous semantics and the "don't use original after copy unless you know what you're doing" convention (yuck). Are there even other options besides accepting UB? If input ranges are only required to implement next(), then the distinction will be that forward ranges implement head()/tail() and input ranges don't. (In this scenario, head() and tail() will of course be required to return the same result if called multiple times.) For *some* forward ranges, it will be possible to implement next() as well, in which case those ranges will be capable of functioning as both input ranges and forward ranges. I don't see anything wrong with that. >> My example was a response to the specific claim that "an immutable input range cannot exist." Specifically, it was a counterexample demonstrating that the claim is false. As I'm sure you can see, the claim is still false even if we agree to use next() instead of tail() for input ranges, since next() is obviously allowed to be impure. > > I can't see that. I don't see how you're falsifying that claim when in order to implement the range you need mutable state. Ok, I'll show you. Here's the same range implemented using next() instead of front() and tail(): // Just enough for the example to compile struct Optional(T) { bool empty; T value; } Optional!T some(T)(T value) { return Optional!T(false, value); } Optional!T no(T)() { return Optional!T(true, T.init); } auto fdByChar(int fd) { import core.sys.posix.unistd; static struct Result { private int fd; Optional!char next() const { char c; ssize_t nread = read(fd, cast(void*) &c, 1); if (nread == 0) return no!char; else if (nread == 1) return some(c); else throw new Exception("Error reading file"); } } return Result(fd); } And here's the usage: void each(alias fun, Range)(Range range) { for (auto e = range.next; !e.empty; e = range.next) fun(e.value); } void main() { import std.stdio; immutable stdinByChar = fdByChar(0); stdinByChar.each!writeln; } Once again, `stdinByChar` is an input range, and it is immutable. QED.

June 29, 2020

Re: std.v2020.algorithm etc[ WAS: Is run.d going to be expand for runtime and the phobos library?]

Posted by Stanislav Blinov
in reply to Paul Backus

Permalink

Stanislav Blinov

Posted in reply to Paul Backus

Permalink

On Sunday, 28 June 2020 at 16:37:23 UTC, Paul Backus wrote:
> On Sunday, 28 June 2020 at 15:33:48 UTC, Stanislav Blinov wrote:
>> On Sunday, 28 June 2020 at 14:12:39 UTC, Paul Backus wrote:
>>
>>> It sounds like what you are really trying to say here is that input ranges and forward ranges should not use the same interface, and that input ranges should instead implement only `Option!T next()`.
>>
>> Not exactly. A given forward range may also implement such a `next`. [...] We should differentiate by some other artifact. I maintain that artifact should be copyability. Or we can keep the distinction based on the presence of save(), and live forever with the ambiguous semantics and the "don't use original after copy unless you know what you're doing" convention (yuck). Are there even other options besides accepting UB?
>
> If input ranges are only required to implement next(), then the distinction will be that forward ranges implement head()/tail() and input ranges don't. (In this scenario, head() and tail() will of course be required to return the same result if called multiple times.)

Just as forward ranges may implement `next()`, some input ranges may want to be buffered, i.e.:

struct Wrapped
{
    private typeof(src.next()) buf; // Optional!ElementType
    private Input src; // Input is a `next()` input range

    @property bool empty() const { return !buf; }
    T moveFront() { return buf.replace(T.init); }
    void popFront() { buf = src.next; }

    // no front, only moveFront
}

If we put a hard requirement on input ranges to *only* exist with the next() interface, this will shut down possible optimizations that buffering enables. We can't implement a buffering range in terms of the `next()` interface, a staggered fetch + advance is required. (next() may overwrite the internals of the buffer before client has a chance of reading the value).

>>> My example was a response to the specific claim that "an immutable input range cannot exist." Specifically, it was a counterexample demonstrating that the claim is false. As I'm sure you can see, the claim is still false even if we agree to use next() instead of tail() for input ranges, since next() is obviously allowed to be impure.
>>
>> I can't see that. I don't see how you're falsifying that claim
>> when in order to implement the range you need mutable state.
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Ok, I'll show you. Here's the same range implemented using next() instead of front() and tail():
>
>             Optional!char next() const
>             {
>                 char c;
>                 ssize_t nread = read(fd, cast(void*) &c, 1);
                                  ^^^^
>
>                 if (nread == 0)
>                     return no!char;
>                 else if (nread == 1)
>                     return some(c);
>                 else
>                     throw new Exception("Error reading file");
>             }
>         }
> Once again, `stdinByChar` is an input range, and it is immutable. QED.

:) Fine. An input range can't exist without mutable state. Is that statement more agreeable for you?

On Monday, 29 June 2020 at 15:40:28 UTC, Stanislav Blinov wrote: > We can't implement a buffering range in terms of the `next()` interface, a staggered fetch + advance is required. (next() may overwrite the internals of the buffer before client has a chance of reading the value). Are you referring to ranges like `byLine` that invalidate the previous front when popFront is called? If so, I don't see what difference it makes whether they're implemented with next() or some other interface. If that's not what you're referring to, can you give an example? > :) Fine. An input range can't exist without mutable state. Is that statement more agreeable for you? Yes, that's much more agreeable. :) Personally, I am concerned here with how to make ranges work with D's `const` and `immutable` qualifiers. So the fact that there may be mutable state out there somewhere in libc/the kernel/the filesystem doesn't bother me.

On Monday, 29 June 2020 at 17:49:29 UTC, Paul Backus wrote: > Are you referring to ranges like `byLine` that invalidate the previous front when popFront is called? Yes. > If so, I don't see what difference it makes whether they're implemented with next() or some other interface. Efficiency. A given algorithm may need to stagger the range in order to preserve the head. To keep a `next()` interface and avoid buffer corruption, additional branch is needed for staggering (see [1]). This can be avoided by switching to front/popFront buffering where caller can explicitly announce they're done reading the element (not in the gist (yet)). [1] https://gist.github.com/radcapricorn/d76d29c6df6fa822d7889e799937f39d#file-inputrange-d-L180 > Personally, I am concerned here with how to make ranges work with D's `const` and `immutable` qualifiers. So the fact that there may be mutable state out there somewhere in libc/the kernel/the filesystem doesn't bother me. I understand that. Forward ranges are good candidates for const/immutable. Input ranges though - I'd prefer to tread lightly there.

Forums