June 23, 2020
On 6/23/20 8:19 AM, Steven Schveighoffer wrote:
> On 6/23/20 5:45 AM, Mathias LANG wrote:
>> On Tuesday, 23 June 2020 at 05:26:21 UTC, Andrei Alexandrescu wrote:
>>> https://run.dlang.io/is/KgeosK
>>>
>>> Yah it's kinda surprising innit. For regular functions it's business as usual - exact match is attempted, conversions are considered etc. const(T[]) is convertible to const(T)[], nothing new about that.
>>>
>>> For template functions however, although templates always do exact match, arrays surreptitiously change their type from const(T[]) to const(T)[], without a conversion in sight.
>>>
>>> We need to offer that chance to other ranges if we want them to enjoy a similar treatment.
>>
>> So it's not on function call, but it's about the type being deduced. There's quite a big difference here.
> 
> Yes, I agree. It works because of the implicit conversion.

No, it doesn't work because of the implicit conversion, although implicit conversion is part of it. It works because there's a hack in the compiler that introduces the conversion from const(T[]) to const(T)[] automatically for template functions. It never happens for any other type - templates in D bind types "perfectly" otherwise. This is one exception.

> It's this case where it affects the result:
> 
> void fun(T)(T a) { pragma(msg, T.stringof); }
> 
> void main()
> {
>     const int[] a;
>     const(int)[] b;
>     fun(a);
>     fun(b);
> }
> 
> You only get one instantiation there.

Yes, this is illustrating the hack.

> The case makes complete sense, because the head is always a different memory location (a copy). What we lack is a way to make the head mutable of any type via a type modifier, so it's a "special case" in the sense that it's possible only on T[] and T*. If we change the special case to a general case, then it naturally makes things a lot more usable, and gets rid of a lot of the complaints about const ranges.

Correct, thanks. That's where T.opOnCall() would fit in - the function would effect a change of type when an object of type T is passed by value to a template function. It sounds awfully particular, but I'm encouraged by the fact that it has worked so well for so many years with arrays and pointers.
June 23, 2020
On 6/23/20 9:07 AM, Paul Backus wrote:
> On Tuesday, 23 June 2020 at 05:15:49 UTC, Andrei Alexandrescu wrote:
>>> Regardless, my broader point is that once we're open to the possibility of designing a new range API, all of this can be solved without any language changes by using an API that doesn't require mutation (i.e., tail()).
>>
>> Using only tail() makes iteration with mutable ranges inefficient
> 
> I don't think it's a given that tail() is less efficient than popFront(). Here's a program where both tail() and popFront() produce the same assembly with `ldc -O2`:
> 
> https://d.godbolt.org/z/-S_JoF

With arrays, no problem. Two words to copy, easy to track, enregister etc. With elaborate ranges, definitely a problem.

> Of course, if it does turn out to make a difference in some cases, mutable ranges are free to implement popFront as well. There will be a generic version in std.v2.algorithm to ensure that `mutableRange.popFront()` is always valid.
> 
>> and iteration with immutable ranges difficult (must use recursion everywhere). Hardly a winner of a design contest.
> 
> Even if we assume for the sake of argument that "recursion == difficult", I would still say that "difficult" is an improvement over "impossible".

There's "onerous" somewhere in there, too. I agree with Steve who said implementing recursive variants of most range algorithms in Phobos would be onerous.

The each() primitive I discussed may work nice actually because it allows recursion to be done in only itself, instead of every algorithm.
June 23, 2020
On 6/23/20 11:50 AM, Andrei Alexandrescu wrote:
>> E.g. the following code will only instantiate a single template:
>> ```
>> void fun2(T)(const T[] a) { pragma(msg, T.stringof); }
>>
>> void main() {
>>      immutable(int[]) a;
>>      fun2(a);
>>      int[] b;
>>      fun2(b);
>> }
>> ```
> 
> This example is actually not illustrative. The first call involves a conversion from immutable(int[]) to immutable(int)[].

Oops, I meant from immutable(int[]) to const(int[]).
June 23, 2020
On 6/23/20 11:54 AM, Andrei Alexandrescu wrote:
> No, it doesn't work because of the implicit conversion, although implicit conversion is part of it.

Whether the compiler hacks it by converting the type before invoking the type detection, or whether it hacks by adjusting the IFTI detection is not really important.

In the general case, though, there is no implicit conversion -- the conversion function must be used. So it would be important to call the conversion function before IFTI is invoked.

But another solution would be define implicit conversions that are preferred for IFTI/auto. Like an opCanonical or something. This I think sits better and is more consistent.

There are other cases aside from tail-const that could make IFTI more usable. For example: https://issues.dlang.org/show_bug.cgi?id=4998

This wouldn't be doable without some more help from the compiler during IFTI with possibly template constraints.

-Steve
June 23, 2020
On 6/23/20 1:01 PM, Steven Schveighoffer wrote:
> On 6/23/20 11:54 AM, Andrei Alexandrescu wrote:
>> No, it doesn't work because of the implicit conversion, although implicit conversion is part of it.
> 
> Whether the compiler hacks it by converting the type before invoking the type detection, or whether it hacks by adjusting the IFTI detection is not really important.

*nod*

> In the general case, though, there is no implicit conversion -- the conversion function must be used. So it would be important to call the conversion function before IFTI is invoked.
> 
> But another solution would be define implicit conversions that are preferred for IFTI/auto. Like an opCanonical or something. This I think sits better and is more consistent.

Yah, that's what I meant with opOnCall. opOnAutoOrIFTI...

> There are other cases aside from tail-const that could make IFTI more usable. For example: https://issues.dlang.org/show_bug.cgi?id=4998
> 
> This wouldn't be doable without some more help from the compiler during IFTI with possibly template constraints.

*nod*

June 23, 2020
On Tuesday, 23 June 2020 at 16:20:26 UTC, Andrei Alexandrescu wrote:
> On 6/23/20 9:07 AM, Paul Backus wrote:
>>
>> I don't think it's a given that tail() is less efficient than popFront(). Here's a program where both tail() and popFront() produce the same assembly with `ldc -O2`:
>> 
>> https://d.godbolt.org/z/-S_JoF
>
> With arrays, no problem. Two words to copy, easy to track, enregister etc. With elaborate ranges, definitely a problem.

Do you (or anyone else reading this--feel free to chime in) have an example in mind of such an "elaborate range"? If so, I'd be happy to do the legwork of running additional experiments. No need to settle for speculation here when we can have actual data.

>> Even if we assume for the sake of argument that "recursion == difficult", I would still say that "difficult" is an improvement over "impossible".
>
> There's "onerous" somewhere in there, too. I agree with Steve who said implementing recursive variants of most range algorithms in Phobos would be onerous.

Again, if you, Steve, or anyone else in this thread have any examples in mind, I'd be happy to run the experiment of writing recursive versions myself.

> The each() primitive I discussed may work nice actually because it allows recursion to be done in only itself, instead of every algorithm.

Even without making `each` a primitive, you can do this. Implement a few fundamental algorithms like `each`, `map`, `filter`, and `fold` using recursion, and many others can be built on top of them.
June 24, 2020
On Tuesday, 23 June 2020 at 21:19:39 UTC, Paul Backus wrote:
> Even without making `each` a primitive, you can do this. Implement a few fundamental algorithms like `each`, `map`, `filter`, and `fold` using recursion, and many others can be built on top of them.

Slightly off-topic, but are you familiar with Clojure's transducers?
https://clojure.org/reference/transducers
https://www.youtube.com/watch?v=6mTbuzafcII

You can also check this talk from CppCon, to see how their applied to a statically typed language:
https://www.youtube.com/watch?v=6mTbuzafcII

For me, the main advantage is that this design allows implementing algorithms like map, filter, etc. without caring at all about the exact range interface. This for me is a big deal because, in my day-to-day job, I can't use much range/iterator/iterable/enumerable pull-style algorithms, because most my code is reactive/ asynchronous (push-based). So most of what I do is based on either Futures/Promises or Reactive Extensions:
http://reactivex.io/
http://reactivex.io/documentation/operators.html
June 24, 2020
On Monday, 22 June 2020 at 16:03:54 UTC, Andrei Alexandrescu wrote:
> On 6/21/20 1:43 PM, Stanislav Blinov wrote:
>> Input ranges, by nature being one-pass, *should not be copyable*...

> Good arguments, no doubt, but a long experience with noncopyable C++ objects suggests that defining such types need to be approached with trepidation as they are very cumbersome to use.

In C++ - traditionally they were cumbersome, even to define, until move came along. In D though? The biggest annoying obstacle is Phobos making lots of redundant copies, needlessly prohibiting the use of non-copyables. That, however, is not a fault of non-copyables, but of Phobos.
The GC limits the use of non-copyables for sure, though that's largely irrelevant to ranges themselves, esp. considering that classes indeed shouldn't ever be ranges.

> (Any type containing a noncopyable type as a member becomes itself noncopyable;

And that is *good*.

> this is not something that we can inflict lightly on the casual users of input ranges.

That is exactly what we *should* inflict on "casual" users. Compile-time errors. For trying to casually write bugs.

> (To wit: no input streams or iterators in C++ are noncopyable, although the same argument would apply to them as well.)

An input range being non-copyable is immensely beneficial - it would statically catch a good deal of problems. It would *alleviate* the need of added complexity in implementations. Looking back at the ByLine again: RefCounted internals just to remain copyable. I don't think appeasing "casual users" by over-engineering is a good compromise.
Non-copyables in D are easy to write, easy to use *correctly*, and difficult to use *incorrectly* (won't compile, after all).

> Also, it is not unheard of to have two input ranges fed from the same source with the obvious semantics (whoever calls functions to get more data will get the data and push the cursor further). True, buffering is an issue (what if two copies have each their own buffers?) but that's an engineering problem, not one of principles.

That would be different instances of input ranges over the same source, not copies of the same range carrying unnecessary state and complexity (over-engineered solutions to the buffering issue). We're never consuming the same range *simultaneously* in different places. With an uncopyable, we'd just have to give it to a consumer and then have them give the remainder back to us.

> It is quite clear to me that we can't propose a design with noncopyable input ranges without effectively making them pariahs that everybody will take pains to use and do their best to avoid.

Then we should propose a design that is not painful to use:

// error: `input` cannot be copied
// auto data = input.filter!somehow.map!something.array;
// Ok:
auto data = input.move.filter!somehow.map!something.array;

If we need partial consumption, i.e. preservation of the remainder, terminal primitives can give it back to us (after all, wrapping range is holding onto it):

auto data = input.move.filter!somehow.take(someAmount).map!something.array(input);

The implementation of such `array` would look like this:

auto array(Range, Remainder)(Range range, ref Remainder remainder)
if (isInputRange!Range && is(typeof(range.source) == Remainder))
{
    import std.array : appender;
    auto builder = appender!(ElementType!Range[]);
    // this is assuming current API, substitute with whatever replacement API
    while (!range.empty)
    {
        builder.put(range.front);
        range.popFront();
    }
    move(range.source, remainder);
    return builder[];
}

I wouldn't say this qualifies as "painful", to author or use. Authors of ranges would be writing ranges, and not the machinery required to keep them copyable. Users of ranges would be using ranges, and not the hidden machinery required to keep them copyable.
June 24, 2020
On Wednesday, 24 June 2020 at 11:49:11 UTC, Stanislav Blinov wrote:
> On Monday, 22 June 2020 at 16:03:54 UTC, Andrei Alexandrescu wrote:
>> It is quite clear to me that we can't propose a design with noncopyable input ranges without effectively making them pariahs that everybody will take pains to use and do their best to avoid.
>
> Then we should propose a design that is not painful to use:
>
> // error: `input` cannot be copied
> // auto data = input.filter!somehow.map!something.array;
> // Ok:
> auto data = input.move.filter!somehow.map!something.array;
>
> If we need partial consumption, i.e. preservation of the remainder, terminal primitives can give it back to us (after all, wrapping range is holding onto it):
>
> auto data = input.move.filter!somehow.take(someAmount).map!something.array(input);

IMO if the user has to manually call `move`, you have already failed at usability.

It may seem "easy" or "obvious" to you and I, but for beginners, this is going to be a huge stumbling block. You write some code that looks like it should obviously work, you get a mysterious error message that "std.algorithm.whatever!(some, args).Result is not copyable because it is annotated with @disable", and the solution is that you have to add ".move" to your code? Can you imagine having to explain that to someone in the Learn forum? I don't think I could do it with a straight face--it's too absurd.

The only way non-copyable input ranges can work is if the compiler is able to implicitly move them in cases like the above. In other words, we would need something like Walter's "Copying, Moving, and Forwarding" DIP [1].

[1] https://github.com/WalterBright/DIPs/blob/13NNN-WGB.md/DIPs/13NNN-WGB.md
June 24, 2020
On Tuesday, 23 June 2020 at 16:20:26 UTC, Andrei Alexandrescu wrote:
> On 6/23/20 9:07 AM, Paul Backus wrote:
>> I don't think it's a given that tail() is less efficient than popFront(). Here's a program where both tail() and popFront() produce the same assembly with `ldc -O2`:
>> 
>> https://d.godbolt.org/z/-S_JoF
>
> With arrays, no problem. Two words to copy, easy to track, enregister etc. With elaborate ranges, definitely a problem.
>
> ...
>
> There's "onerous" somewhere in there, too. I agree with Steve who said implementing recursive variants of most range algorithms in Phobos would be onerous.
>
> The each() primitive I discussed may work nice actually because it allows recursion to be done in only itself, instead of every algorithm.

I'm struck that no one AFAICS has suggested the following alternative: instead of `tail`, to allow popFront() (and if implemented, popBack()) to return an `Option!(ElementType, None)`.

What is returned would be either the popped element, or None if no more elements remain (with the nice by-product of no longer getting assert failures if pop{Front,Back} are called when the range is empty).

This might allow some natural API evolution along the following lines:

  * an input range need only implement such an option-based popFront() and
    no other methods

    - for backwards compatibility, we could also support implementing `front`
      and a `void`-returning popFront()

  * a forward range would additionally implement `front` and `save` methods:

    - `front` because, since a forward range is deterministic, it should
      always be possible to know up front (pun intended...) what the first
      element should be

    - `save` for the usual reasons

  * all other range types should derive naturally from this

    - we might want to consider a more fine-grained approach to bidirectional
      ranges, e.g. allowing a bidirectional input range with option-returning
      popFront and popBack

  * over time we could/should deprecate void-returning pop{Front,Back}, and
    possibly also defining `front` or `back` if `save` is not also defined

The approach of having only option-returning popFront would be particularly useful for input ranges where the front cannot or should not be predefined on construction (e.g. RNGs, or stuff that wraps them like RandomSample and friends).

(Is there actually a good example of an input range where the first element can be predefined on construction in a way that isn't -- like pseudo-RNGs -- an implementation detail that one probably wants to hide from the user?)