Jonathan M Davis
Posted in reply to Paul Backus
| On Monday, January 22, 2024 5:57:49 PM MST Paul Backus via Digitalmars-d wrote:
> On Monday, 22 January 2024 at 23:07:28 UTC, Jonathan M Davis
>
> wrote:
> > In practice, basic input ranges don't work with the vast majority of algorithms anyway. Some do (e.g. map), but pretty much anything that needs to do more than a simple transformation on the elements ends up needing a forward range. I really don't think that we're going to lose much by forcing basic input ranges to be separate.
>
> Here's a very rough approximation of the number of algorithms in Phobos that work with basic input ranges:
>
> $ grep 'isInputRange' std/algorithm/*.d | wc -l
> 153
>
> For comparison, here are the other range types:
>
> $ grep 'isForwardRange' std/algorithm/*.d | wc -l
> 141
> $ grep 'isBidirectionalRange' std/algorithm/*.d | wc -l
> 42
> $ grep 'isRandomAccessRange' std/algorithm/*.d | wc -l
> 62
>
> So, I would say that your hypothesis is not supported by the data, in this case.
I will admit that the number is higher than I would have expected, but most of those algorithms are on the simple side where the algorithm itself doesn't care whether the range is consumed, whereas the caller very much cares. They were written to accept basic input ranges, because the algorithm itself will work with a basic input range, but that doesn't mean that it can actually be used with basic input ranges to do anything useful. For instance, you can use startsWith on a basic input range, but you pretty much never would, because then you've consumed it - and actually, it's not even safe to use the rest of the range in generic code after passing it to startsWith, because it was copied, and the original could be in an invalid state. The same goes with any function that takes an argument and gives you a result that doesn't wrap the input range and return it or return a partially iterated range.
In my experience, it's very rare that it's possible to make anything that isn't incredibly simple work with a basic input range due to the fact that looking at it consumes it and makes it impossible to then do anything useful with it. And it's easy to write simple algorithms which will technically work with a basic input range but which aren't actually useful with a basic input range.
So, sure, a surprising number of the algorithms in std.algorithm work with isInputRange, but a number of them really shouldn't be used with basic input ranges. Now, a few of them might be more useful if you could rely on a basic input range having reference semantics, but their composibility in general is pretty poor.
Either way, I think that the gain from being able to require specific copy semantics for both forward ranges and basic input ranges is worth it. And since such a requirement couldn't actually be enforced at compile time unless we made it impossible to have basic input ranges that were structs (which would then make reference counting impossible), then a specific algorithm could choose to add a function to forward ranges via UFCS and support them if it didn't care about the copy semantics, but like with @trusted code, I think that it should then be up to the programmer to make sure that it does the right thing so that basic input ranges can be required to have reference semantics.
> Foreach works if the elements are rvalues, or the elements are lvalues and you use ref for the loop variable. This is the correct, expected behavior.
Well, I couldn't get it to work when I tried, but either way, it means that you can't use foreach on a range of non-copyable types in generic code, because the way that you use foreach would have to be the same regardless of the elment type. Of course, you could use static if and branch the code based on whether the types are copyable or not, but that's yet another complication to range implementations which are already often far too complicated with too many static if branches.
> > trying to support non-copyable types will actively make range-base code worse in many cases, because it will require calling front multiple times, and many range-based algorithms do real work in front rather than simply returning a value (e.g. map works that way).
>
> Have you ever actually written generic code that works with
> non-copyable types? I have, and it mostly involves (a) passing
> things by ref, (b) using 'auto ref' and core.lifetime.forward a
> lot, and (c) checking isCopyable!T before trying to copy things.
>
> I have no idea why you think this would require calling front multiple times, or doing extra work. Perhaps you could try writing out some example code to clarify?
If you can't rely on front returning something that you can copy, then you're going to have to call front multiple times any time that you need to access front multiple times - either that or call move on it, which you can't necessarily do safely, because that would alter the range. Algorithms that don't have to look at front more than once don't have that problem, but plenty of algorithms need to do multiple things with front. And often, it's much better to just store the result of front rather than call front multiple times, because it's often the case that front does actual work that you'd rather not do over and over again (and it's particularly bad when you have stuff like the result of map where the transformation function it was given allocates on the heap).
So the result of having to worry about non-copyable types is that algorithms that need to access front multiple times are either going to end up calling front multiple times, harming performance in many cases, or they're going to have static if branches to change what they're doing based on the return type of front, which is going to make them that much more complex to implement.
> > Personally, I think that we should make it very explicit that non-copyable types are not supported by ranges. They don't work with them now, and I don't think that it's even vaguely worth it to try to make ranges work with them. The cost is too high for too little benefit.
>
> First, as a general rule, I don't think that we, as standard-library authors, should be in the business of telling our users that their use-cases are illegitimate just because it makes our jobs more difficult.
>
> Second...how do you know the benefit is too little? You thought earlier that "the vast majority of algorithms" didn't work with basic input ranges, even though the actual data shows that isn't true. Unless you have concrete evidence, I think it is better to avoid making assumptions about which use-cases are important and which aren't.
IMHO, it's already way too complicated to write correct range-based code, and adding non-copyable types into the mix is very much a step too far. Obviously, there can be disagreement on that, but it's already very difficult to write range-based code that actually works properly with all types of ranges. You have to test it with a ridiculous number of types, and most people just don't do that. Phobos often doesn't do that. And every capability that we add to ranges means even more static if blocks and even more variations that need to be tested. The only way to fix that is to simplify things by putting greater restrictions and requirements on how ranges work, whereas making them work with a wider variety of things is going in the opposite direction of that.
Realistically, the best that even might happen here for folks who want ranges to work with non-copyable types is that a subset of the algorithms in Phobos will be made to work with them, but third-party libraries are not going to go to that level of effort (and I don't think that it's all reasonable to tell them that they should). And Phobos' range-based code will become even more complicated and that much harder to maintain and make sure that it works correctly. And we already have too much trouble making sure that Phobos' range-based code works correctly.
So, my take on it is that supporting non-copyable types is simply not worth the effort given that it's going to complicate the code and the tests that much more, and they're already too complex.
And so, if it's my choice, then we simply won't support non-copyable types with ranges. Obviously, I'm not the sole voice in this, and ultimately, the choice is going to be up to Walter and Atila (probably Atila given that he's supposed to be the one in charge of Phobos), so I could definitely be overruled on this, but I very much want to see it become easier to write correct range-based code, not have it become harder.
- Jonathan M Davis
|