June 20, 2020
On 6/20/20 3:03 PM, Steven Schveighoffer wrote:
> On 6/20/20 6:43 AM, Paul Backus wrote:
>> On Saturday, 20 June 2020 at 04:34:42 UTC, H. S. Teoh wrote:
>>> On Fri, Jun 19, 2020 at 09:14:30PM -0400, Andrei Alexandrescu via Digitalmars-d wrote: [...]
>>>> One good goal for std.v2020 would be to forego autodecoding throughout.
>>> [...]
>>>
>>> Another could be to fix up the range API -- i.e, reconsider the ugliness that is .save, now that D has copy ctors.
>>>
>>>
>>> T
>>
>> Also, switch from `void popFront()` to `typeof(this) rest`, so that we can have `const` and `immutable` ranges.
> 
> How does that work? You'd have to use recursion I guess? ugly. Why do we need ranges for something like this?

Think Haskell lists. They can implement tail easily (just return the next pointer) but can't define popFront.
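
A minimal sketch of that idea, assuming a hypothetical immutable singly linked list (`Node`, `List`, and `sum` are illustrative names, not Phobos API). Because `rest` returns a new value instead of mutating the range, it can be called on a `const` or `immutable` list, at the cost of writing the traversal recursively:

```d
// Hypothetical immutable singly linked list; all names are illustrative.
struct Node
{
    int value;
    immutable(Node)* next;
}

struct List
{
    immutable(Node)* head;

    bool empty() const { return head is null; }
    int front() const { return head.value; }

    // Returns the tail as a new value instead of mutating in place,
    // so it is callable on a const or immutable List.
    List rest() const { return List(head.next); }
}

// Recursion stands in for the usual popFront loop.
int sum(const List r)
{
    return r.empty ? 0 : r.front + sum(r.rest);
}

unittest
{
    immutable c = new immutable(Node)(3, null);
    immutable b = new immutable(Node)(2, c);
    immutable a = new immutable(Node)(1, b);
    immutable list = List(a);
    assert(sum(list) == 6);
}
```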

June 20, 2020
On 6/20/20 5:12 PM, Jon Degenhardt wrote:
> Nearly all the algorithms in std.algorithm.iteration (map, filter, fold, etc.) operate on input-only ranges.

Yah, one good experiment would be to implement these using alternate simpler APIs and see how they work.

You can infer a lot by just writing code and "smell" it. I know byLine didn't smell quite right.
June 21, 2020
On Sunday, 21 June 2020 at 00:05:01 UTC, Andrei Alexandrescu wrote:
> On 6/20/20 12:41 PM, Stanislav Blinov wrote:
>> Being able to call front() several times does not necessitate the *range* being buffering.
>
> If one calls front() twice without advancing the range, where does the range return the value from?

The most primitive example? The range is lazy and builds the element on every call. But I believe I've tried to make the distinction clear: *range* itself isn't necessarily buffering. Whatever it's iterating over may have to.

And, in case that wasn't clear either, I understand perfectly where you're coming from, since typical input ranges do indeed have to memoize that nasty first element upon initialization (and since they have to do that, they have to repeat that for every element thenceforth). Thing is, the input stream

`bool fetchNext(ref T);`

does not solve this problem, it merely shrugs it off unto the caller, with precarious consequences, as demonstrated.
June 21, 2020
On 6/20/20 8:38 PM, Stanislav Blinov wrote:
> On Sunday, 21 June 2020 at 00:05:01 UTC, Andrei Alexandrescu wrote:
>> On 6/20/20 12:41 PM, Stanislav Blinov wrote:
>>> Being able to call front() several times does not necessitate the *range* being buffering.
>>
>> If one calls front() twice without advancing the range, where does the range return the value from?
> 
> The most primitive example? The range is lazy and builds the element on every call. But I believe I've tried to make the distinction clear: *range* itself isn't necessarily buffering. Whatever it's iterating over may have to.
> 
> And, in case that wasn't clear either, I understand perfectly where you're coming from, since typical input ranges do indeed have to memoize that nasty first element upon initialization (and since they have to do that, they have to repeat that for every element thenceforth). Thing is, the input stream
> 
> `bool fetchNext(ref T);`
> 
> does not solve this problem, it merely shrugs it off unto the caller, with precarious consequences, as demonstrated.

I appreciate there's no shortage of people who teach my design and my code back to me. I honestly do, a lot.

It's unclear that much of anything was demonstrated. Was the better alternative to make input ranges noncopyable? It only takes a couple of hours of coding a bit of that up to see it's quite onerous.
June 21, 2020
On Sunday, 21 June 2020 at 00:15:35 UTC, Andrei Alexandrescu wrote:
> On 6/20/20 5:12 PM, Jon Degenhardt wrote:
>> Nearly all the algorithms in std.algorithm.iteration (map, filter, fold, etc.) operate on input-only ranges.
>
> Yah, one good experiment would be to implement these using alternate simpler APIs and see how they work.
>
> You can infer a lot by just writing code and "smell" it. I know byLine didn't smell quite right.

Yeah, that would be a nice focused set to try it out on.
June 21, 2020
On Sunday, 21 June 2020 at 14:47:55 UTC, Andrei Alexandrescu wrote:

> I appreciate there's no shortage of people who teach my design and my code back to me. I honestly do, a lot.

And I, in turn, appreciate that you nicked one paragraph and ignored (I assume dismissed) everything else. Very motivating.

> It's unclear that much of anything was demonstrated. Was the better alternative to make input ranges noncopyable?

No. No. No. It wasn't an alternative to your API. We weren't even discussing a completely alternative API until you threw it in.
I compared the two APIs in terms of operations. But "nothing was demonstrated". OK! Anyway, the whole discussion was about the existing API, and either:

- input ranges are non-copyable and forward ranges are copyable, or
- status quo (i.e. save() for forward ranges)

I didn't catch on to the distinction initially, when H.S. brought up save(); now I do.

TLDR for all of the below - under either API, input ranges being non-copyable is a much better choice.

Input ranges, by nature being one-pass, *should not be copyable*. You can't do anything (good) with a copy, and you have to invest in implementing a copy that won't bite. If you're giving such a range away - you're giving it *away*, to someone else to consume. It being copyable only means that you're leaving for yourself a mutable reference to state that you shouldn't touch again. When you need the remainder back - your callee will move it back.

You mentioned the "smell" of ByLine. A good deal of that "smell" emanates from its copy-ability. I mean, disabling a constructor versus reference-counted internals - which plate is heavier?... Using either API, that smell will stay if it is to remain copyable.

Most of the time you'd be using input ranges as rvalues. In the remaining cases the compiler will give you an error if you try to copy - and that's a good error.

> It only takes a couple of hours of coding a bit of that up to see it's quite onerous.

It isn't. Bugs in the language and Phobos aside - it's one `move` away.
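
A rough sketch of what that looks like, assuming a hypothetical one-pass source (`Countdown` and `skip` are made-up names for illustration). The caller gives the range away with `move`, and the callee moves the remainder back:

```d
import std.algorithm.mutation : move;

// Hypothetical one-pass source; copying is disabled on purpose.
struct Countdown
{
    int n;
    @disable this(this);

    bool empty() const { return n == 0; }
    int front() const { return n; }
    void popFront() { --n; }
}

// Takes ownership of the range, consumes up to `count` elements,
// and hands the remainder back to the caller.
Countdown skip(Countdown r, int count)
{
    while (count-- > 0 && !r.empty)
        r.popFront();
    return move(r); // explicit move; no copy is made
}

unittest
{
    // An accidental copy is a compile-time error.
    static assert(!__traits(compiles, { Countdown a; auto b = a; }));

    auto src = Countdown(5);
    auto remainder = skip(move(src), 2);
    assert(remainder.front == 3);
}
```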

Here, a few Phobos algorithms implemented with the `fetchNext` API:

https://gist.github.com/radcapricorn/d76d29c6df6fa822d7889e799937f39d

The good? An actual source range (ByLine) is dumb-as-a-cork-simple. No buffering, no ref-counted internals: read, slice, rinse and repeat. The bad? Well, you can see for yourself.
June 21, 2020
On Saturday, 20 June 2020 at 14:44:58 UTC, Andrei Alexandrescu wrote:
> On 6/20/20 12:34 AM, H. S. Teoh wrote:
>> On Fri, Jun 19, 2020 at 09:14:30PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
>> [...]
>>> One good goal for std.v2020 would be to forego autodecoding
>>> throughout.
>> [...]
>> 
>> Another could be to fix up the range API -- i.e, reconsider the ugliness
>> that is .save, now that D has copy ctors.
>
> Awesome. So we have a lil list already.

These two are high on my list as well.

Also high on my list is a hash table in the standard library that better manages memory for large numbers of entries. There are multi-second GC pauses when large numbers of entries are added to the current AAs. There was an attempt at addressing this a couple years ago, but the first try didn't pan out: https://github.com/dlang/druntime/pull/1929.

Might not require a new library version to address this.
June 21, 2020
On Saturday, 20 June 2020 at 20:13:26 UTC, Walter Bright wrote:
> On 6/14/2020 9:18 AM, Avrina wrote:
>> Size doesn't matter if it doesn't work.
>
> I use it all the time, it works fine. It does lack the features that make makefiles incomprehensible, though.

I use it none of the time, and loathe the fact that it is named "make" and makes its way into PATH. That's why it was replaced in DMD: it doesn't do what it needs to.

You mean comprehensible DM make files like this?

$G/nteh.obj : $C\rtlsym.h $C\nteh.c
	$(CC) -c -o$@ $(MFLAGS) $C\nteh

$G/os.obj : $C\os.c
	$(CC) -c -o$@ $(MFLAGS) $C\os

$G/out.obj : $C\out.c
	$(CC) -c -o$@ $(MFLAGS) $C\out

$G/outbuf.obj : $C\outbuf.h $C\outbuf.c
	$(CC) -c -o$@ $(MFLAGS) $C\outbuf

$G/pdata.obj : $C\pdata.c
	$(CC) -c -o$@ $(MFLAGS) $C\pdata

$G/ph2.obj : $C\ph2.c
	$(CC) -c -o$@ $(MFLAGS) $C\ph2

Yes, so clear.

June 22, 2020
On 20.06.20 16:51, Andrei Alexandrescu wrote:
> 
> Input ranges should have only one API:
> 
> bool fetchNext(T& target);
> 
> Fill the user-provided target with the next element and return true. At the end of the range, return false and leave the target alone.

What if the caller does not know how to construct a T?

Nullable!T fetchNext();

(I get that D's support for algebraic data types is subpar, but maybe that is something to look into.)
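
A small sketch of that signature, assuming a hypothetical slice-backed source (`SliceSource` is an illustrative name) and using `std.typecons.Nullable` as the stand-in for an option type. The caller never constructs a `T` up front; an empty `Nullable` marks the end:

```d
import std.typecons : Nullable, nullable;

// Hypothetical source over a slice; the name is illustrative only.
struct SliceSource(T)
{
    T[] data;

    // An empty Nullable signals end of range.
    Nullable!T fetchNext()
    {
        if (data.length == 0)
            return Nullable!T.init;
        auto item = nullable(data[0]);
        data = data[1 .. $];
        return item;
    }
}

unittest
{
    auto src = SliceSource!int([10, 20]);
    int total;
    while (true)
    {
        auto item = src.fetchNext();
        if (item.isNull)
            break;
        total += item.get;
    }
    assert(total == 30);
}
```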
June 21, 2020
On 6/21/20 10:47 AM, Andrei Alexandrescu wrote:
> Was the better alternative to make input ranges noncopyable? It only takes a couple of hours of coding a bit of that up to see it's quite onerous.

To recap/put forth a few possible APIs for input ranges (that are not forward):

bool fetchNext(ref T target);

Spec: As long as the range is nonempty, assigns the next element to target and returns true. At end of range returns false without touching target.

Pros: works naturally with actual input ranges such as files and sockets.

Cons: does not work if T is qualified. Creates a copy of each element for forward ranges that exist in memory. That means essentially the API is impaired and most algorithms that work with input ranges would need to specialize for forward ranges, too. (FWIW this interface was first discussed before D had qualifiers). Composes so-so, consider implementations of filter (not bad) and map (not so nice).
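
To make the composition remark concrete, here is a rough sketch with hypothetical adaptors (`SliceSource`, `Filter`, and `Map` are illustrative names, not Phobos types). Filter needs a scratch element so it can leave the target alone at end of range; map additionally has to default-construct an element of the source type itself:

```d
// Hypothetical source and adaptors; none of these names come from Phobos.
struct SliceSource(T)
{
    T[] data;

    bool fetchNext(ref T target)
    {
        if (data.length == 0)
            return false;        // end of range: target is left untouched
        target = data[0];
        data = data[1 .. $];
        return true;
    }
}

// filter: same element type in and out.
struct Filter(alias pred, R, T)
{
    R source;

    bool fetchNext(ref T target)
    {
        T candidate;             // scratch copy so rejected items never reach target
        while (source.fetchNext(candidate))
        {
            if (pred(candidate))
            {
                target = candidate;
                return true;
            }
        }
        return false;
    }
}

// map: must create a temporary of the source element type on its own.
struct Map(alias fun, R, In)
{
    R source;
    alias Out = typeof(fun(In.init));

    bool fetchNext(ref Out target)
    {
        In temp;
        if (!source.fetchNext(temp))
            return false;
        target = fun(temp);
        return true;
    }
}

unittest
{
    auto src = SliceSource!int([1, 2, 3, 4]);
    auto evens = Filter!(x => x % 2 == 0, typeof(src), int)(src);
    auto scaled = Map!(x => x * 10, typeof(evens), int)(evens);

    int[] result;
    int item;
    while (scaled.fetchNext(item))
        result ~= item;
    assert(result == [20, 40]);
}
```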

T* fetchNext();

Spec: As long as the range is nonempty, returns a pointer to the next element in the range. At end of range returns null.

Pros: Simple and efficient for many ranges.

Cons: Doesn't compose too well. Issues with escape analysis and safety. Sometimes the pointer is scope, sometimes it's not, depending on the range.

T fetchNext(ref bool done);
or
ref T fetchNext(ref bool done);

Spec: As long as the range is nonempty, sets done to false and returns the next element in the range. At end of range sets done to true and returns an arbitrary T.

Pros: works with qualified data.

Cons: inefficient or needs two versions depending on ref/value. Complicates life of clients.

Flag!"each" each(alias fun);

Spec: calls fun(x) for each element x in the range - an efficient generalization of opApply. If fun returns No.each, stops iteration (and also returns No.each). Otherwise, returns Yes.each. See https://dlang.org/library/std/algorithm/iteration/each.html.

Pros: works naturally and efficiently (assuming proper inlining) with many range types. Composes quite well, picture e.g. implementations of map and filter. Beautiful. Efficient implementation for forward ranges is immediate. Jibes well with Oleg's famous take on iteration (http://okmij.org/ftp/papers/LL3-collections-talk.pdf).

Cons: ranges such as map, filter etc. would need to expose both each() (for input ranges) and the existing interface (for better than input ranges). A number of algorithms would need to be redone to take advantage of the new interface. The language integration is not as nice as with the (sadly inefficient) existing opApply.
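
A minimal sketch of the each() shape, under a few assumptions: `SliceSource` is a hypothetical source, and each() is written here as a free function called through UFCS rather than as a member, purely to keep the example simple. The callback returns `No.each` to stop early:

```d
import std.typecons : Flag, Yes, No;

// Hypothetical source; the name is illustrative only.
struct SliceSource(T)
{
    T[] data;
}

// Free-function stand-in for a member each(): feeds every element to fun
// and stops as soon as fun returns No.each.
Flag!"each" each(alias fun, T)(ref SliceSource!T src)
{
    while (src.data.length)
    {
        auto item = src.data[0];
        src.data = src.data[1 .. $];
        if (fun(item) == No.each)
            return No.each;
    }
    return Yes.each;
}

unittest
{
    auto src = SliceSource!int([1, 2, 3, 4]);
    int total;
    // Stop as soon as an element greater than 2 is seen.
    auto status = src.each!((int x) {
        if (x > 2)
            return No.each;
        total += x;
        return Yes.each;
    });
    assert(status == No.each);
    assert(total == 3);
    assert(src.data == [4]);
}
```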

At a point Sebastian Wilzbach and I discussed that several ranges should get an each() member function, but that never got finalized. Once each() is a member of prominent input range algos (can't hurt because people already can call the less efficient each() global function) and we accumulate experience with it, we can sanction it as a legit optional primitive for ranges.

One more thing before I forget - we should drop classes that are ranges. They add too much complication. The functionality would still be present by wrapping a polymorphic implementation in a struct.
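
A rough sketch of that wrapping, with hypothetical names throughout (`IntSource`, `Counter`, `AnyIntRange`). The range users see is a struct; the class hierarchy behind it is an implementation detail:

```d
// Hypothetical polymorphic back end.
interface IntSource
{
    bool empty();
    int front();
    void popFront();
}

final class Counter : IntSource
{
    private int current, limit;

    this(int current, int limit)
    {
        this.current = current;
        this.limit = limit;
    }

    bool empty() { return current >= limit; }
    int front() { return current; }
    void popFront() { ++current; }
}

// Struct range that forwards to whichever implementation it holds.
struct AnyIntRange
{
    IntSource impl;

    bool empty() { return impl.empty; }
    int front() { return impl.front; }
    void popFront() { impl.popFront(); }
}

unittest
{
    auto r = AnyIntRange(new Counter(0, 3));
    int[] seen;
    foreach (x; r)
        seen ~= x;
    assert(seen == [0, 1, 2]);
}
```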