Range Redesign: Copy Semantics (page 6)

Settings

Help

Index » General » Range Redesign: Copy Semantics (page 6)

January 23

Re: Range Redesign: Copy Semantics

Posted by Quirin Schroll
in reply to Paul Backus

Permalink

Quirin Schroll

Posted in reply to Paul Backus

Permalink

On Monday, 22 January 2024 at 05:22:44 UTC, Paul Backus wrote:

You have N data structures and M algorithms, and by using this common interface you can implement all N*M combinations in O(N+M) code. Each data structure implements the range API once, and each algorithm is implemented once using the range API.

If you split the API in two, by making input streams (basic input ranges) and forward ranges completely disjoint, you undermine this goal. Now, each data structure has to implement two APIs, and each algorithm has to be implemented twice, once for input streams and once for forward ranges.

Not necessarily. First, any algorithm requiring a forward range wouldn’t be implemented twice. And, most importantly, a data structure would not offer input range iteration if it can offer forward range. Why would it?

In practice, what's going to happen is library authors will simply not bother to implement both, and we will end up with gaps where many (data structure, algorithm) pairs are not covered.

The only realistic concern I see with the proposal is algorithms being implemented requiring forward ranges when they could (also) work with basic input ranges. Overconstraining input is not specific to ranges, though.

What Jonathan never mentioned in his proposal, but something that, as far as I see, is true: One can wrap any forward range to be an input range. Heck, the pointer to any forward range is an input range, assuming there is a global next(ForwardRange*) function utilizing the forward range primitives. If you pass &forwardRange to an input range algorithm, it may copy it around as it pleases, it will change the underlying range just like an input range should; and you’d have the & for an indication that it does.

A problem I see with this is composition. Requiring the lifting to a pointer requires lvalues, and composition lives on not having those. The issue will generally be negligible, as most range algorithms returning a range will for the most part want to preserve as many of the abilities of the original range as possible. The difference in the algorithm would be that, instead of:

// implement input range
static if (/* base range is forward range */)
    // implement forward range
static if (/* base range is bidi range */
    // implement bidi range
// etc.

You would have:

static if (/* base range is input range */)
    // implement input range
else:
static if (/* base range is forward range */)
    // implement forward range
static if (/* base range is bidi range */)
    // implement bidi range
// etc.

January 23

Re: Range Redesign: Copy Semantics

Posted by Quirin Schroll
in reply to Ben Jones

Permalink

Quirin Schroll

Posted in reply to Ben Jones

Permalink

On Monday, 22 January 2024 at 05:54:59 UTC, Ben Jones wrote:

On Sunday, 21 January 2024 at 22:40:55 UTC, Paul Backus wrote:

If your concern is aesthetics, dereferencing a pointer or unboxing a Nullable is far less ugly than passing a callback.

I've been writing a fair amount of Kotlin (an a tiny bit of Swift) and the trailing lambda syntax is very nice for making these APIs look nice. Bascially, if the last argument to a function is a function, you can put it's curly braces after the parentheses: range.map{ it*2}.filter{ it < 10}

The sauce that Kotlin has that D doesn't is that 1-arg lambdas get an implicit parameter called it and explicit parameters come after the opening curly brace: { p1, p2 -> body }. I'm also not sure how it would work with templates. Generally, I think it's much easier to think about lifetime guarantees with functional APIs, so something like that might make them more palatable syntactically.

This is off-topic, but I want to mention that the implicit it confused the hell out of me when I learned Kotlin. The feature literally saves measly 6 keystrokes for it -> , 4 if you ignore spaces. In D, map!(it => it * 2) IMO, is pretty good (e.g. compared to C++). D’s lambdas have their flaws, but lack of brevity isn’t one.

January 23

Re: Range Redesign: Copy Semantics

Posted by Steven Schveighoffer
in reply to Paul Backus

Permalink

Steven Schveighoffer

Posted in reply to Paul Backus

Permalink

On Monday, 22 January 2024 at 17:18:56 UTC, Paul Backus wrote:

On Monday, 22 January 2024 at 15:41:35 UTC, Atila Neves wrote:

I think that .save was a historical mistake, and that ranges that can be copied are forward ranges. Something like a range reading from stdin or a socket would/should disable the copy/postblit constructors.

This was my initial thought too, but it turns out it's not quite that simple.

If we require basic input ranges to be non-copyable, then types which are inherently copyable (in particular, T[]) will be exclude from being basic input ranges.

You are thinking about this the wrong way. Copyability becomes a trait of forward ranges, instead of input ranges. An input range would not be copyable, therefore code which takes input ranges would have to take them by reference, or the caller would have to ensure no other copies exist. But you can still take forward ranges this way.

It is not hard to do -- you just remove the requirement for copying from isInputRange, and add it to isForwardRange.

In order to use a T[] with an algorithm that expects a basic input range, you would have to wrap it in a struct that has copying disabled.

A T[] can be used without copying it. So why can't you pass that into an algorithm that takes an input range?

It's also a big problem for composability. If half the algorithms require your range to be copyable, and the other half require it to be non-copyable, you have to keep track of [what color each algorithm is][1] whenever you write a range pipeline, or write a new algorithm that calls an existing one internally.

The assertion is that if the range is copyable, it's now a forward range and copying is allowed. It is on you as the range developer to ensure copying does not work if it's an input range. The algorithm that says isInputRange would have to avoid copying, but that also works for forward ranges as well. There would be no need for multiple colors.

Now, in practical terms, algorithms that are input-range compatible likely would need more careful attention to avoid copies than they currently do.

-Steve

January 23

Re: Range Redesign: Copy Semantics

Posted by Paul Backus
in reply to Steven Schveighoffer

Permalink

Paul Backus

Posted in reply to Steven Schveighoffer

Permalink

On Tuesday, 23 January 2024 at 17:25:47 UTC, Steven Schveighoffer wrote:

It is not hard to do -- you just remove the requirement for copying from isInputRange, and add it to isForwardRange.

[...]

The assertion is that if the range is copyable, it's now a forward range and copying is allowed. It is on you as the range developer to ensure copying does not work if it's an input range.

The only problem with this is that what you are asking the range developer to do is impossible. :)

Consider what would happen if the user creates a pointer to a non-copyable input range. The pointer (a) is inherently copyable and (b) implements the range interface (via implicit dereferencing), but (c) the copies are not independent.

(If your response to this is "ok, we'll just have isForwardRange explicitly reject pointers", keep in mind that this problem also applies to generic wrapper types that use alias this or opDispatch. Are we really going to tell our users that it's their fault if their code breaks because they put a range pointer inside one of these types?)

So, there has to be some other way of distinguishing forward ranges from input ranges, beyond just checking whether they can be copied. For example, by using next() for input ranges vs. head()/tail() for forward ranges. (As you pointed out earlier, reusing the empty/front/popFront API would not be a great idea.)

Once you have this other method of distinguishing input and forward ranges, requiring forward ranges to be copyable no longer implies that pure input ranges MUST be non-copyable. So we can allow "pointer-to-input-range" to be an input range itself, even though it supports copying, and nothing will break.

January 24

Re: Range Redesign: Copy Semantics

Posted by Nick Treleaven
in reply to Paul Backus

Permalink

Nick Treleaven

Posted in reply to Paul Backus

Permalink

On Tuesday, 23 January 2024 at 18:21:47 UTC, Paul Backus wrote:

isForwardRange could require the range type to have empty, front, popFront defined. Not sure if that solves your alias this/opDispatch points. This would mean arrays would need the range methods built-in rather than using UFCS.

January 24

Re: Range Redesign: Copy Semantics

Posted by Paul Backus
in reply to Nick Treleaven

Permalink

Paul Backus

Posted in reply to Nick Treleaven

Permalink

On Wednesday, 24 January 2024 at 12:26:43 UTC, Nick Treleaven wrote:

On Tuesday, 23 January 2024 at 18:21:47 UTC, Paul Backus wrote:

It could, but I don't think it's a great idea, because it would make ranges a special case. Along with all of the usual reasons special cases are bad, ranges are one of the first things new D programmers are going to encounter, and are pretty much the canonical example of introspection and structural typing in D. If we make ranges behave differently, users who learn from them are going to have a much harder time understanding every other use of introspection.

January 24

Re: Range Redesign: Copy Semantics

Posted by monkyyy
in reply to Paul Backus

Permalink

monkyyy

Posted in reply to Paul Backus

Permalink

On Tuesday, 23 January 2024 at 00:57:49 UTC, Paul Backus wrote:
> 
>     $ grep 'isForwardRange' std/algorithm/*.d | wc -l
>     141
>     $ grep 'isBidirectionalRange' std/algorithm/*.d | wc -l
>     42
>     $ grep 'isRandomAccessRange' std/algorithm/*.d | wc -l
>     62
>
> So, I would say that your hypothesis is not supported by the data, in this case.

Oh, I hope someone pays attension and makes arrays more important then doubly linked lists

January 24

Re: Range Redesign: Copy Semantics

Posted by Steven Schveighoffer
in reply to Paul Backus

Permalink

Steven Schveighoffer

Posted in reply to Paul Backus

Permalink

On Tuesday, 23 January 2024 at 18:21:47 UTC, Paul Backus wrote:

On Tuesday, 23 January 2024 at 17:25:47 UTC, Steven Schveighoffer wrote:

It is not hard to do -- you just remove the requirement for copying from isInputRange, and add it to isForwardRange.

[...]

The assertion is that if the range is copyable, it's now a forward range and copying is allowed. It is on you as the range developer to ensure copying does not work if it's an input range.

The only problem with this is that what you are asking the range developer to do is impossible. :)

Wouldn't that be on the person who wrapped the type? I wouldn't expect RefCounted to care that it's wrapping a range for instance.

We could explicitly forbid pointers and classes, because you can't obviously implement value semantics using those.

Indeed, since copying is defined by default, it's more onorous to properly create an input range. But so what? Can I tell you how many times people copy an input range without realizing the consequences today? At least if it's explicit that input ranges should not be copyable, then you will be more aware of the issue as the compiler complains about it.

So, there has to be some other way of distinguishing forward ranges from input ranges, beyond just checking whether they can be copied. For example, by using next() for input ranges vs. head()/tail() for forward ranges. (As [you pointed out earlier][1], reusing the empty/front/popFront API would not be a great idea.)

Yeah, I agree that if we change semantics, we have to change the naming, for the sake of all existing code. But the naming could be consistent between input/forward ranges as it is today. I'm not keen on the head/tail idea, as it seems way more clunky than I would like, and I'm not sold on the compiler making sure the x = x.tail operation is going to be efficient.

I don't know how bad it would be to deal with, it's not something I have put a lot of consideration into, but I had thought that this mechanism of preventing copying for things that shouldn't be used from multiple copies sounds like the proper way to prevent problems by default. Sometimes it's good to start with what is simple and correct, even if it's not what you want, and see if you can massage it into what makes sense.

-Steve

January 24

Re: Range Redesign: Copy Semantics

Posted by Paul Backus
in reply to Steven Schveighoffer

Permalink

Paul Backus

Posted in reply to Steven Schveighoffer

Permalink

On Wednesday, 24 January 2024 at 16:18:11 UTC, Steven Schveighoffer wrote:

On Tuesday, 23 January 2024 at 18:21:47 UTC, Paul Backus wrote:

Wouldn't that be on the person who wrapped the type? I wouldn't expect RefCounted to care that it's wrapping a range for instance.

Yes, it would--which is exactly what I'd like to avoid. I don't want us to put ourselves in a position where we have to tell end users "you're holding it wrong" if they try to combine two generic library features like this.

In general, we should not make assumptions in our library code unless we are capable of enforcing them. Since we are, unfortunately, not capable of enforcing the assumption that all copyable ranges have forward-range semantics, we should not allow ourselves make it.

(Perhaps there's an opportunity for a language feature here? A unique(T) qualifier would help with this, for example.)

I agree that permitting input ranges to be copyable puts more of a burden on the authors of range algorithms. However, given that there are edge cases where the burden of preventing copies must fall on someone other than the range's author, I would rather that person be the author of the generic algorithm than the end user.

My hope is that in practice, if Phobos itself contains some prominent examples of non-copyable ranges, these bugs will be found quickly during testing. That's obviously not as good as a reliable compile-time error, but it's not nothing.

For what it's worth, I agree that in practice, most input ranges should be non-copyable. That readf bug we discussed a little while ago would never have existed if LockingTextReader had been non-copyable from the start, for example.

January 24

Re: Range Redesign: Copy Semantics

Posted by Paul Backus
in reply to Paul Backus

Permalink

Paul Backus

Posted in reply to Paul Backus

Permalink

On Wednesday, 24 January 2024 at 17:35:32 UTC, Paul Backus wrote:

Having thought about it some more, this is a bad argument, and I withdraw it.

The entire range API is based on convention to begin with. Of course there's no enforcement mechanism!

Top | Forum index | About this forum

Forums