June 24, 2020
On Wednesday, 24 June 2020 at 14:17:17 UTC, Joseph Rushton Wakeling wrote:
> [snip]
>
> I'm struck that no one AFAICS has suggested the following alternative: instead of `tail`, to allow popFront() (and if implemented, popBack()) to return an `Option!(ElementType, None)`.
>
> [snip]

This is similar to what I had said previously [1] about cursors (though Steven disagreed with my initial take). However, it sounds like an Option is more elegant.

[1] https://forum.dlang.org/post/nudkmmtoeqtznuorneev@forum.dlang.org
June 24, 2020
On Wednesday, 24 June 2020 at 13:14:22 UTC, Paul Backus wrote:
> On Wednesday, 24 June 2020 at 11:49:11 UTC, Stanislav Blinov wrote:
>> On Monday, 22 June 2020 at 16:03:54 UTC, Andrei Alexandrescu wrote:
>>> It is quite clear to me that we can't propose a design with noncopyable input ranges without effectively making them pariahs that everybody will take pains to use and do their best to avoid.
>>
>> Then we should propose a design that is not painful to use:
>>
>> // error: `input` cannot be copied
>> // auto data = input.filter!somehow.map!something.array;
>> // Ok:
>> auto data = input.move.filter!somehow.map!something.array;
>>
>> If we need partial consumption, i.e. preservation of the remainder, terminal primitives can give it back to us (after all, wrapping range is holding onto it):
>>
>> auto data = input.move.filter!somehow.take(someAmount).map!something.array(input);
>
> IMO if the user has to manually call `move`, you have already failed at usability.
>
> It may seem "easy" or "obvious" to you and me, but for beginners, this is going to be a huge stumbling block.

Libraries shan't be for "beginners" nor "advanced users". And that use case is already a non-beginner. Beginners would write this:

auto data = createSomeInput(args).filter!somehow.map!something.array;

Create input, consume input, with no explicit moves in sight. Piecemeal consumption of an input range is something an "advanced" user would be doing :)

I am also of the opinion that a huge reminder "YOU SHALL NOT COPY UNLESS YOU MUST" needs to be present on every page of any introductory (and not only) literature, and not just for D. The problem with lax copying extends far outside the realm of ranges.

struct CopyCount
{
    int count;
    this(ref typeof(this) other) { count = other.count + 1; }
}

CopyCount cc;
writeln(cc);

In my book, that should print CopyCount(0) (i.e. you print that which you can parse and get the original). That's not what Phobos writeln would print though. It won't even print CopyCount(1).
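To make the copy counting concrete without involving writeln, here is a minimal sketch that reuses the CopyCount struct from above. The three helper functions are hypothetical names introduced only for illustration. It assumes a compiler that does not move on last use, which is the behaviour the thread is discussing.

```d
struct CopyCount
{
    int count;
    // Copy constructor: each copy is one step further from the original.
    this(ref typeof(this) other) { count = other.count + 1; }
}

// Hypothetical helpers, not part of Phobos or of the original post.
int seenByRef(ref CopyCount c) { return c.count; }        // no copy is made
int seenByValue(CopyCount c) { return c.count; }          // an lvalue argument is copied once
int seenTwoLevels(CopyCount c) { return seenByValue(c); } // a copy of a copy

unittest
{
    CopyCount cc;
    assert(seenByRef(cc) == 0);
    assert(seenByValue(cc) == 1);
    assert(seenTwoLevels(cc) == 2);
    assert(cc.count == 0); // the original itself is never modified
}
```

Every by-value hop adds one to the count, so a function that forwards its argument through several by-value layers reports a number larger than 1.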

People need to be taught to not squander their values, and how not to.

> You write some code that looks like it should obviously work, you get a mysterious error message that "std.algorithm.whatever!(some, args).Result is not copyable because it is annotated with @disable"

Yup, because in that hypothetical universe you're using a type that is documented to be an input range, a category of ranges which is documented to be non-copyable, which means you just wrote what otherwise would've been a bug, but the compiler stopped you.

> and the solution is that you have to add ".move" to your code? Can you imagine having to explain that to someone in the Learn forum? I don't think I could do it with a straight face--it's too absurd.

Nothing to it - you point such a user to the documentation of ranges which states that input ranges are non-copyable, explains why they're non-copyable, and has usage examples for rvalues and lvalues.
You're aware of a recent question in .learn there, where the user attempted to iterate a non-copyable range (the thread which turned into another prolonged discussion: https://forum.dlang.org/post/kpncjzadrwpvxupsdmle@forum.dlang.org). A range which, by the way, was an input range. The explanations quickly went into "common pitfalls" territory. Jonathan M Davis' response was of particular interest: "In general, you should never use a range after it's been copied unless you know exactly what type of range you're dealing with and what its copying behavior is. If you want an independent copy, you need to use save."
I don't think explaining that to a "beginner" is more practical than explaining that input ranges are consumed once and may be yielding values that are only valid between iterations.

> The only way non-copyable input ranges can work is if the compiler is able to implicitly move them in cases like the above. In other words, we would need something like Walter's "Copying, Moving, and Forwarding" DIP [1].
>
> [1] https://github.com/WalterBright/DIPs/blob/13NNN-WGB.md/DIPs/13NNN-WGB.md

As that proposal stands at the moment, it won't help in practice whenever you're not at "last use" (i.e. the second case - wanting to keep the lvalue for the remainder). The compiler will want to copy then. It'll certainly help for the first case though, as well as with the general implementation of ranges.
June 24, 2020
On Wednesday, 24 June 2020 at 15:26:40 UTC, Stanislav Blinov wrote:
>
> Libraries shan't be for "beginners" nor "advanced users". And that use case is already a non-beginner. Beginners would write this:
>
> auto data = createSomeInput(args).filter!somehow.map!something.array;
>
> Create input, consume input, with no explicit moves in sight. Piecemeal consumption of an input range is something an "advanced" user would be doing :)

Ok. Now they refactor their code:

auto input = createSomeInput(args);
auto data = input.filter!somehow.map!something.array;

Kaboom! Compile error. This is why Walter's proposal is relevant here.
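A minimal sketch of that refactoring problem, assuming a hand-written range with copying disabled. Countdown, createSomeInput and drain are hypothetical stand-ins; the Phobos algorithms are deliberately not used here, since whether they accept non-copyable ranges is exactly what is under discussion.

```d
import std.algorithm.mutation : move;

// Hypothetical single-pass range with copying disabled.
struct Countdown
{
    int n;
    @disable this(this);
    bool empty() const { return n <= 0; }
    int front() const { return n; }
    void popFront() { --n; }
}

Countdown createSomeInput(int n) { return Countdown(n); }

// Hypothetical consuming algorithm: takes the range by value and drains it.
// A manual loop is used because foreach would itself copy the range.
int drain(R)(R r)
{
    int sum = 0;
    for (; !r.empty; r.popFront())
        sum += r.front;
    return sum;
}

unittest
{
    // The rvalue chain works with no explicit move.
    assert(createSomeInput(3).drain == 6);

    // After the refactoring, the range is an lvalue and passing it would copy.
    auto input = createSomeInput(3);
    static assert(!__traits(compiles, input.drain));

    // The explicit move is what makes it compile today.
    assert(input.move.drain == 6);
}
```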

> You're aware of a recent question in .learn there, where the user attempted to iterate a non-copyable range (the thread which turned into another prolonged discussion: https://forum.dlang.org/post/kpncjzadrwpvxupsdmle@forum.dlang.org). A range which, by the way, was an input range. The explanations quickly went into "common pitfalls" territory. Jonathan M Davis' response was of particular interest: "In general, you should never use a range after it's been copied unless you know exactly what type of range you're dealing with and what its copying behavior is. If you want an independent copy, you need to use save."

To be clear: I agree with you *in principle* that pure input ranges should be non-copyable. It's just that currently, non-copyable types are rather cumbersome to work with in D, and IMO it would be a step backwards in terms of usability to force them on users of std.algorithm and std.range.

The solution is to make working with non-copyable types more ergonomic, and then make pure input ranges non-copyable.

This will also improve the signal-to-noise ratio of compile errors: with automatic move-on-last-use, when the compiler complains about copying a range, you can be sure you have actually made a mistake somewhere in your logic, rather than just forgetting to type ".move".
June 24, 2020
On 6/24/20 10:17 AM, Joseph Rushton Wakeling wrote:
> 
> I'm struck that no one AFAICS has suggested the following alternative: instead of `tail`, to allow popFront() (and if implemented, popBack()) to return an `Option!(ElementType, None)`.
> 
> What is returned would be either the popped element, or None if no more elements remain (with the nice by-product of no longer getting assert failures if pop{Front,Back} are called when the range is empty).

I think the discussion about tail() had to do with immutable ranges. Having popXxx() return something would still have it mutate the range.
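As a rough sketch of the difference, assuming std.typecons.Nullable as a stand-in for the suggested Option type (Numbers is a hypothetical range, and this popFront signature is not the current range API):

```d
import std.typecons : Nullable, nullable;

struct Numbers
{
    int[] data;

    // Hypothetical primitive: returns the popped element, or null ("None")
    // when nothing is left, instead of asserting on an empty range.
    Nullable!int popFront()
    {
        if (data.length == 0)
            return Nullable!int.init;
        auto head = data[0];
        data = data[1 .. $]; // the range itself is still mutated
        return nullable(head);
    }
}

unittest
{
    auto r = Numbers([1, 2]);
    assert(r.popFront().get == 1);
    assert(r.popFront().get == 2);
    assert(r.popFront().isNull);

    // Because it mutates, it cannot be called on an immutable range.
    static assert(!__traits(compiles, (immutable Numbers ir) { ir.popFront(); }));
}
```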
June 24, 2020
On 6/24/20 11:55 AM, Paul Backus wrote:
> To be clear: I agree with you *in principle* that pure input ranges should be non-copyable.

By what principle should two input ranges absolutely never use the same data feed?
June 24, 2020
I'm barely reading this thread, but could @live be useful with input ranges too?
June 24, 2020
On 6/24/20 12:25 PM, Adam D. Ruppe wrote:
> I'm barely reading this thread, but could @live be useful with input ranges too?

I'm not sure, but with a no-quarter-given approach to copying input ranges, simple tasks like "feed input from this file, or from stdin, and output to this other file or to stdout" mutate from trivial into little research projects (btw by the same (misguided imho) arguments "pure" output ranges ought to also be noncopyable).
June 24, 2020
On Wednesday, 24 June 2020 at 16:17:58 UTC, Andrei Alexandrescu wrote:
> On 6/24/20 11:55 AM, Paul Backus wrote:
>> To be clear: I agree with you *in principle* that pure input ranges should be non-copyable.
>
> By what principle should two input ranges absolutely never use the same data feed?

Point taken. An input range should either be non-copyable, or have reference semantics.
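For the reference-semantics half of that statement, a minimal sketch might look like the following. SharedCounter is a hypothetical type; the point is only that every copy is a handle to the same feed, so advancing one copy is visible through the others.

```d
// Hypothetical input range whose copies all share one underlying feed.
struct SharedCounter
{
    private int* next;  // shared state: every copy points at the same int
    private int limit;

    this(int limit)
    {
        this.next = new int(0);
        this.limit = limit;
    }

    bool empty() const { return *next >= limit; }
    int front() const { return *next; }
    void popFront() { ++*next; }
}

unittest
{
    auto a = SharedCounter(3);
    auto b = a;           // copies the handle, not the feed
    a.popFront();
    assert(b.front == 1); // b observes a's progress
    b.popFront();
    assert(a.front == 2); // and vice versa
}
```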
June 24, 2020
On Wed, Jun 24, 2020 at 12:29:13PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 6/24/20 12:25 PM, Adam D. Ruppe wrote:
> > I'm barely reading this thread, but could @live be useful with input ranges too?
> 
> I'm not sure, but with a no-quarter-given approach to copying input ranges, simple tasks like "feed input from this file, or from stdin, and output to this other file or to stdout" mutate from trivial into little research projects (btw by the same (misguided imho) arguments "pure" output ranges ought to also be noncopyable).

I second this.  Whatever we decide to do about the input range / forward range distinction, it should absolutely not cripple input ranges or make them so different that you have to write two versions of the same algorithm just to handle both cases.

For all of its flaws, the current range API does a beautiful job of unifying the two (to the extent possible, modulo well-known bugs / design limitations) so that, for the most part, you can ignore the difference.  As long as .save isn't required, you could freely feed either an input range or a forward range to any range algorithm, and freely compose such algorithms, and it all Just Works(tm).

I propose that whatever new design we settle on for input ranges, it should preserve this user-facing simplicity.  What we want is a design where simple things are simple, and hard things are possible, not a design where simple things are hard and hard things are nigh impossible. To the extent possible, we should try to preserve symmetry between input and forward ranges. (By "symmetry" I mean that as long as the equivalent of .save isn't required, an input range and a forward range ought to be transparently interchangeable. A kind of Liskov Substitution-like principle applied to ranges.)
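The interchangeability described above can be sketched with the current API roughly as follows. OneShot and total are hypothetical names; the traits come from std.range.primitives.

```d
import std.range.primitives : isInputRange, isForwardRange;

// Hypothetical input-only range: it has no save, so it is not a forward range.
struct OneShot
{
    int n;
    bool empty() const { return n <= 0; }
    int front() const { return n; }
    void popFront() { --n; }
}

static assert(isInputRange!OneShot && !isForwardRange!OneShot);
static assert(isForwardRange!(int[]));

// One algorithm, written once: it never calls save, so it accepts both kinds.
int total(R)(R r)
if (isInputRange!R)
{
    int sum = 0;
    foreach (e; r)
        sum += e;
    return sum;
}

unittest
{
    assert(total(OneShot(3)) == 6); // input range: 3 + 2 + 1
    assert(total([3, 2, 1]) == 6);  // forward range (an array)
}
```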


T

-- 
Latin's a dead language, as dead as can be; it killed off all the Romans, and now it's killing me! -- Schoolboy
June 24, 2020
On Tuesday, 23 June 2020 at 21:19:39 UTC, Paul Backus wrote:
> Do you (or anyone else reading this--feel free to chime in) have an example in mind of such an "elaborate range"? If so, I'd be happy to do the legwork of running additional experiments. No need to settle for speculation here when we can have actual data.

I don't think they'll satisfy what you are looking for, but I do have a few examples of ranges doing non-trivial things in public repos. You are welcome to take a look:

* bufferedByLine - https://github.com/eBay/tsv-utils/blob/master/common/src/tsv_utils/common/utils.d#L831
* inputSourceRange - https://github.com/eBay/tsv-utils/blob/master/common/src/tsv_utils/common/utils.d#L1443
* parseFieldList - https://github.com/eBay/tsv-utils/blob/master/common/src/tsv_utils/common/fieldlist.d#L291
* findFieldGroups - https://github.com/eBay/tsv-utils/blob/master/common/src/tsv_utils/common/fieldlist.d#L1045
* parseNumericFieldList - https://github.com/eBay/tsv-utils/blob/master/common/src/tsv_utils/common/fieldlist.d#L2029

None of them are used in performance-sensitive regions of code, so they haven't been benchmarked or optimized.