November 04, 2021

On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu wrote:
> I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc).
>
> Overall this seems to create more problems than it solves.

Yeah, I think we should keep the front/popFront/empty scheme. Not because it's necessarily better, but because there's always a high risk for scope creep and second-system effect when doing projects like Phobos v2. Even discarding autodecoding and isForwardRange will be a lot of work already, let's not bite more than we can swallow.

November 04, 2021

On Thursday, 4 November 2021 at 10:45:25 UTC, Dukc wrote:
> On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu wrote:
>> I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc).
>>
>> Overall this seems to create more problems than it solves.
>
> Yeah, I think we should keep the front/popFront/empty scheme. Not because it's necessarily better, but because there's always a high risk for scope creep and second-system effect when doing projects like Phobos v2. Even discarding autodecoding and isForwardRange will be a lot of work already, let's not bite more than we can swallow.

I agree that this is definitely not a v2 proposal--more like v3 or v4. But I do think it should be on the roadmap.

November 04, 2021

On Thursday, 4 November 2021 at 12:59:52 UTC, Paul Backus wrote:
> I agree that this is definitely not a v2 proposal--more like v3 or v4. But I do think it should be on the roadmap.

Ah, that's more reasonable. Not saying I agree but at least I disagree much less.

November 04, 2021
On 11/4/21 12:43 AM, Paul Backus wrote:
> On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu wrote:
>> On 11/3/21 11:25 PM, Paul Backus wrote:
>>>
>>> If we want to avoid copying, we can have `next` return a `Ref!T` in the case where the forward range has lvalue elements:
>>>
>>> struct Ref(T)
>>> {
>>>      T* ptr;
>>>
>>>      ref inout(T) deref() inout
>>>      {
>>>          return *ptr;
>>>      }
>>>      alias deref this;
>>> }
>>>
>>> I've tested some simple uses of this wrapper on run.dlang.io, and it seems like DIP 1000 is good enough to make it work in @safe code.
>>>
>>> If "returns either `T` or `Ref!T`" sounds like a suspect design for an API, consider that it is basically the same thing as an `auto ref` return value--just with the distinction between ref and non-ref brought inside the type system.
>>
>> That was on the table, too, in the form of a raw pointer.
>>
>> I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc).
>>
>> Overall this seems to create more problems than it solves.
> 
> I'd be curious to see any examples of such problems you have in mind.
> 
> As far as I'm aware, no special effort should be required to make this @safe, aside from enabling -preview=dip1000 (which, granted, is still a work in progress).

Pointers are problematic because of aliasing and lifetime (what if the pointer survives the data structure it points into?). So the `Ref` struct needs to be qualified appropriately with `scope`. So yes, DIP1000 would need to be tight.

Usability is another matter that hasn't been quite looked at. Once you have a scoped pointer wrapper, what can and what can't you do with it easily? I'm not very sure.

Alias this is just poorly done. I think we shouldn't base a fundamental API on it.

Anyway, I'm cautiously optimistic. At the very least this should be explored.

Note that the whole thing still doesn't address unbuffered ranges. There must be a buffer of at least one element somewhere. That's... problematic.

November 04, 2021
On Thursday, 4 November 2021 at 15:29:59 UTC, Andrei Alexandrescu wrote:
> Usability is another matter that hasn't been quite looked at. Once you have a scoped pointer wrapper, what can and what can't you do with it easily? I'm not very sure.
>
> Alias this is just poorly done. I think we shouldn't base a fundamental API on it.

Both good points. It will take some experimentation to find out where the rough edges of this approach are, and whether they can be adequately sanded down.

> Anyway, I'm cautiously optimistic. At the very least this should be explored.
>
> Note that the whole thing still doesn't address unbuffered ranges. There must be a buffer of at least one element somewhere. That's... problematic.

Unbuffered ranges will return `Option!T` from `next`, rather than `Option!(Ref!T)`.

Again, this is the same distinction we already have between rvalue `front` and lvalue `front`, so I don't think the inconsistency is a problem, as long as we can make `Ref!T` function as a subtype of `T` (either via `alias this` or some more principled mechanism).
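
To make the two return shapes concrete, here is a minimal sketch of what such an API might look like. Nothing in it is an existing Phobos interface: `Option`, `Counter`, `ArrayIter`, `sumAll` and `doubleInPlace` are hypothetical names made up for illustration, and `Ref` is the wrapper from earlier in the thread with `alias this` left out so that every dereference is explicit. It is compiled as ordinary `@system` code and makes no claim about `@safe` or DIP 1000.

```d
// Hypothetical optional type; assumed for illustration, not part of Phobos.
struct Option(T)
{
    private T value;
    private bool hasValue;

    static Option some(T v) { return Option(v, true); }
    static Option none() { return Option.init; }

    bool isNone() const { return !hasValue; }
    T unwrap() { assert(hasValue); return value; }
}

// Pointer wrapper from the thread, without `alias this`.
struct Ref(T)
{
    T* ptr;
    ref inout(T) deref() inout { return *ptr; }
}

// Unbuffered range: elements are computed on demand, so `next` yields rvalues.
struct Counter
{
    int cur, limit;

    Option!int next()
    {
        if (cur >= limit) return Option!int.none();
        return Option!int.some(cur++);
    }
}

// Range over existing storage: `next` yields references to the lvalue elements.
struct ArrayIter(T)
{
    T[] data;

    Option!(Ref!T) next()
    {
        if (data.length == 0) return Option!(Ref!T).none();
        auto r = Ref!T(&data[0]);
        data = data[1 .. $];
        return Option!(Ref!T).some(r);
    }
}

// Consumes an rvalue-yielding range.
int sumAll(Counter c)
{
    int sum;
    for (auto o = c.next(); !o.isNone(); o = c.next())
        sum += o.unwrap();
    return sum;
}

// Mutates the underlying array through the Ref wrapper.
void doubleInPlace(int[] arr)
{
    auto it = ArrayIter!int(arr);
    for (auto o = it.next(); !o.isNone(); o = it.next())
        o.unwrap().deref() *= 2;
}

void main()
{
    assert(sumAll(Counter(0, 4)) == 6);

    int[] arr = [1, 2, 3];
    doubleInPlace(arr);
    assert(arr == [2, 4, 6]);
}
```

The caller's loop has the same shape in both cases; only the extra `deref()` step differs, which is the part a subtype relationship between `Ref!T` and `T` would be meant to hide.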
November 04, 2021
On 2021-11-04 12:39, Paul Backus wrote:
> Again, this is the same distinction we already have between rvalue `front` and lvalue `front`

That reminds me, we should drop that like a bad habit too :o).

Currently ranges have all sorts of weird, random genericity. Recalling from memory (perhaps/hopefully some of these have been fixed):

- At least at some point `empty` did not have to return bool, just something convertible to bool. Like immutable(bool).

- For a while we had a lively discussion about length returning ulong instead of size_t (relevant on 32-bit).

- front could return pretty much what it damn well pleased, including qualified data, rvalues vs lvalues, noncopyable stuff, etc.

- Thinking about how inout interacts with everything in ranges is just depressing.

- I seem to recall there was at least one popFront that returned something meaningful. (Maybe that's not too disruptive.)

Based on past experience we could and should simplify the range interface in places where genericity has little value and the implementation effort is high.
November 04, 2021
On Thu, Nov 04, 2021 at 06:38:30PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
> On 2021-11-04 12:39, Paul Backus wrote:
> > Again, this is the same distinction we already have between rvalue `front` and lvalue `front`
> 
> That reminds me, we should drop that like a bad habit too :o).
> 
> Currently ranges have all sorts of weird, random genericity. Recalling from memory (perhaps/hopefully some of these have been fixed):

Yeah, we need to get rid of useless genericity, and also exactly what is expected of range operations should be stated clearly and unambiguously in the API docs.  The current range API suffers from insufficient clarity, so many such cases went "under the radar" and inevitably ended up being implemented when some kind soul decided that it would be nice to support this or that niche case.


> - At least at some point `empty` did not have to return bool, just something convertible to bool. Like immutable(bool).

Yeah, .empty should return bool, and only bool.  Not immutable(bool), not something that alias this to bool, none of that sort.

Also, the spec should specify precisely whether .empty must be a function (and whether it should be a member function, a free function, or both), or whether it's allowed to be a member variable.  Currently in my own code I have a few cases where .empty is a variable rather than a function. I haven't run into any problems so far, but things like this must be explicitly stated, otherwise somebody will inevitably write code that assumes one way or the other, and break things for no good reason.
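
As a small illustration of the ambiguity, the sketch below declares `.empty` three ways: as a member function, as a member variable, and as a compile-time constant. The struct names and the `walk` helper are invented for this example; only `isInputRange` comes from Phobos. Generic code that writes `r.empty` without parentheses accepts all three, whereas code that writes `r.empty()` would reject the field version.

```d
import std.range.primitives : isInputRange;

// .empty as a member function.
struct WithMethod
{
    int[] data;
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// .empty as a plain member variable, updated by popFront.
struct WithField
{
    int cur, limit;
    bool empty;
    int front() const { return cur; }
    void popFront() { if (++cur >= limit) empty = true; }
}

// .empty as a compile-time constant (an infinite range).
struct WithEnum
{
    int cur;
    enum bool empty = false;
    int front() const { return cur; }
    void popFront() { ++cur; }
}

static assert(isInputRange!WithMethod);
static assert(isInputRange!WithField);
static assert(isInputRange!WithEnum);

// Calling .empty with parentheses only works for the function form.
static assert( __traits(compiles, WithMethod.init.empty()));
static assert(!__traits(compiles, WithField.init.empty()));

// Generic consumer that never writes parentheses after .empty.
size_t walk(R)(R r) if (isInputRange!R)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

void main()
{
    assert(walk(WithMethod([1, 2, 3])) == 3);
    assert(walk(WithField(0, 3)) == 3);
}
```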


> - For a while we had a lively discussion about length returning ulong instead of size_t (relevant on 32-bit).

Whichever way we decide, this should be specified clearly and not left up to interpretation.


> - front could return pretty much what it damn well pleased, including qualified data, rvalues vs lvalues, noncopyable stuff, etc.

Yeah, this has been especially troublesome.  I think we should specify exactly what type(s) and qualifier(s) are permitted to be returned from .front.

Don't forget transient values returned by .front that are invalidated by the next call to .popFront (e.g., std.stdio.File.byLine, which reuses the line buffer).  The range API needs to explicitly state whether .popFront is allowed to do this, and if it is allowed, range algorithms that attempt to cache .front past the next invocation of .popFront must be rewritten.  (This used to be a pretty big problem, but I think we've fixed most of the cases in Phobos by now. But it still rears its ugly head every now and then in user code that makes wrong assumptions about the lifetime of the value returned by .front.)
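
For readers who haven't been bitten by this yet, here is a self-contained sketch of the failure mode. It does not use byLine itself; `ReusingLines` is a made-up range that imitates the buffer reuse, and `cacheNaively` / `cacheWithDup` are hypothetical helpers showing the wrong and the right way to hold on to elements.

```d
// Toy range whose front is a slice of one shared buffer, overwritten by popFront.
struct ReusingLines
{
    private string[] lines;   // remaining input, current line first
    private char[] buf;       // single buffer shared by every element

    this(string[] lines)
    {
        this.lines = lines;
        size_t longest;
        foreach (l; lines)
            if (l.length > longest) longest = l.length;
        buf = new char[longest];
        load();
    }

    // Copies the current line into the shared buffer.
    private void load()
    {
        if (lines.length)
            buf[0 .. lines[0].length] = lines[0][];
    }

    bool empty() const { return lines.length == 0; }
    char[] front() { return buf[0 .. lines[0].length]; }
    void popFront() { lines = lines[1 .. $]; load(); }
}

// Wrong: keeps slices of the shared buffer past the next popFront.
char[][] cacheNaively(string[] input)
{
    char[][] kept;
    foreach (line; ReusingLines(input))
        kept ~= line;
    return kept;
}

// Right: copies each element before advancing.
char[][] cacheWithDup(string[] input)
{
    char[][] kept;
    foreach (line; ReusingLines(input))
        kept ~= line.dup;
    return kept;
}

void main()
{
    auto bad = cacheNaively(["ab", "cd"]);
    assert(bad[0] == "cd" && bad[1] == "cd");   // first element was overwritten

    auto good = cacheWithDup(["ab", "cd"]);
    assert(good[0] == "ab" && good[1] == "cd");
}
```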


> - Thinking about how inout interacts with everything in ranges is just depressing.

inout is the source of all kinds of nastiness in the language. It's a cute hack that works for the trivial cases, but once you combine it with other language features it's a mess. Consider this:

	inout T myFunc(T)(inout T delegate(inout T t) dg, inout T u) {...}

Does inout apply to the return value of dg, dg itself, or both? How does it interact with the inout on the function's return value?  How exactly does inout on t interact with the delegate's inout return, and how do they correlate with the inout of the outer function?  This is just one of many cases of ambiguity; it's not hard to construct other examples. In short, it's a mess.

And don't forget that inout behaves like const inside the function body, but when passed as a template argument triggers a different instantiation (template bloat).

And trying to work with inout in generic code where you have to deal with arbitrary incoming type qualifiers is an exercise in pain.

I think we should just flat out *not* support inout in ranges.
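
For contrast, this is the trivial case where inout does what it was designed for: one function body serves mutable, const and immutable callers, and the qualifier of the argument is carried through to the result. `tail` is a made-up helper for illustration only; the problems described above start once delegates, templates or nested qualifiers enter the picture.

```d
// One body for all qualifiers; inside it, the data is treated as read-only.
inout(int)[] tail(inout(int)[] a)
{
    // a[0] = 1;   // would not compile: inout data cannot be modified here
    return a.length ? a[1 .. $] : a;
}

void main()
{
    int[] m = [1, 2, 3];
    const(int)[] c = m;
    immutable(int)[] i = [1, 2, 3];

    // The result carries the same qualifier as the argument.
    static assert(is(typeof(tail(m)) == int[]));
    static assert(is(typeof(tail(c)) == const(int)[]));
    static assert(is(typeof(tail(i)) == immutable(int)[]));

    // A mutable argument gives back a mutable slice of the same memory.
    tail(m)[0] = 42;
    assert(m == [1, 42, 3]);
}
```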


> - I seem to recall there was at least one popFront that returned something meaningful. (Maybe that's not too disruptive.)

It should be mandated by spec to return void.


> Based on past experience we could and should simplify the range interface in places where genericity has little value and the implementation effort is high.

+1.

Plus, the *exact* expectations of the various range functions should be spelled out in clear, unambiguous terms.  Such as ref or non-ref, const or mutable, function or member variable (or free function), transient .front or not, copyable or not, what exactly .popFront returns, etc. There must be no room left for interpretation except where explicitly allowed.  Leave any small detail unspecified, and we are almost guaranteed to be bitten by it later.

Best spell out the exact permitted function signatures and types, with a list of allowed qualifiers, to leave no room for misinterpretation.


T

-- 
I am not young enough to know everything. -- Oscar Wilde
November 05, 2021
On Thursday, 4 November 2021 at 15:29:59 UTC, Andrei Alexandrescu wrote:
> On 11/4/21 12:43 AM, Paul Backus wrote:
>> On Thursday, 4 November 2021 at 04:06:15 UTC, Andrei Alexandrescu wrote:
>>> On 11/3/21 11:25 PM, Paul Backus wrote:
>>>> [...]
>>>
>>> That was on the table, too, in the form of a raw pointer.
>>>
>>> I think it can be made to work, but for lvalue ranges only, and it will be difficult to make safe (scoped etc).
>>>
>>> Overall this seems to create more problems than it solves.
>> 
>> I'd be curious to see any examples of such problems you have in mind.
>> 
>> As far as I'm aware, no special effort should be required to make this @safe, aside from enabling -preview=dip1000 (which, granted, is still a work in progress).
>
> Pointers are problematic because of aliasing and lifetime (what if the pointer survives the data structure it points into).

T* should mean infinite lifetime by default in @safe code: where did you get that pointer to begin with? If a struct contains a T* within it, then scoped struct variables solve the lifetime issue that way. Aliasing, however, is a problem we still have. Which is why we can't currently write a @safe vector.

> Note that the whole thing still doesn't address unbuffered ranges. There must be a buffer of at least one element somewhere. That's... problematic.

Yeah, I'm still wondering how to fix that.

November 05, 2021
On Thursday, 4 November 2021 at 23:30:05 UTC, H. S. Teoh wrote:
> On Thu, Nov 04, 2021 at 06:38:30PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
>> On 2021-11-04 12:39, Paul Backus wrote:
>> > Again, this is the same distinction we already have between rvalue `front` and lvalue `front`
>> 
>> That reminds me, we should drop that like a bad habit too :o).
>> 
>> Currently ranges have all sorts of weird, random genericity. Recalling from memory (perhaps/hopefully some of these have been fixed):
>
> Yeah, we need to get rid of useless genericity, and also exactly what is expected of range operations should be stated clearly and unambiguously in the API docs.  The current range API suffers from insufficient clarity, so many such cases went "under the radar" and inevitably ended up being implemented when some kind soul decided that it would be nice to support this or that niche case.

Sometimes genericity is a good thing. Take C++, where range for was originally specified in C++11 such that the begin and end iterators had to be the same type, which on the face of it seems to make sense. But then that was found to be overly constraining, and to accommodate ranges, C++17 had to change the definition of the range for loop so that end only had to be comparable to begin and could be a different type.

>> - At least at some point `empty` did not have to return bool, just something convertible to bool. Like immutable(bool).
>
> Yeah, .empty should return bool, and only bool.  Not immutable(bool), not something that alias this to bool, none of that sort.
>
> Also, the spec should specify precisely whether .empty must be a function (and whether it should be a member function, a free function, or both), or it's allowed to be a member variable.

Similarly to what I said above, I don't think the spec should do this at all. Plasticity is what D is good at, and leaving it to "range.empty is a bool" is, IMHO, far better. I *love* not using parens for functions with no args and being able to use a function/variable/enum, then being able to change that and not have to touch the rest of the code at all.


November 09, 2021
On Fri, Nov 05, 2021 at 11:43:01AM +0000, Atila Neves via Digitalmars-d wrote:
> On Thursday, 4 November 2021 at 23:30:05 UTC, H. S. Teoh wrote:
[...]
> > Yeah, we need to get rid of useless genericity, and also exactly what is expected of range operations should be stated clearly and unambiguously in the API docs.  The current range API suffers from insufficient clarity, so many such cases went "under the radar" and inevitably ended up being implemented when some kind soul decided that it would be nice to support this or that niche case.
> 
> Sometimes genericity is a good thing. Take C++, where range for was originally specified in C++11 such that the begin and end iterators had to be the same type, which on the face of it seems to make sense. But then that was found to be overly constraining, and to accommodate ranges, C++17 had to change the definition of the range for loop so that end only had to be comparable to begin and could be a different type.

Genericity is definitely a good thing -- when it doesn't lead to the slippery slope of ever-more-complicated convolutions in the code as a result of trying to cater to every unnatural use case.  The whole point of the range abstraction is to *simplify* code; if simplicity and clarity of code are compromised because of genericity, then we have failed.


[...]
> > Yeah, .empty should return bool, and only bool.  Not immutable(bool), not something that alias this to bool, none of that sort.
> > 
> > Also, the spec should specify precisely whether .empty must be a function (and whether it should be a member function, a free function, or both), or it's allowed to be a member variable.
> 
> Similarly to what I said above, I don't think the spec should do this at all. Plasticity is what D is good at, and leaving it to "range.empty is a bool" is, IMHO, far better. I *love* not using parens for functions with no args and being able to use a function/variable/enum, then being able to change that and not have to touch the rest of the code at all.

I disagree. The spec *should* explicitly state what .empty (or any other range method/identifier) is allowed to be.  If you want more genericity, simply have the spec say ".empty may be either a method or a member field".

This may seem trivial, but it's necessary to prevent things like some Phobos code assuming that .empty is always a method, and then failing when somebody passes in a range that has a field instead.  Also, on the user-facing side, it prevents spurious bug reports like "how come my custom-made range with non-copyable .empty masqueraded from a nested struct via alias this doesn't pass isInputRange?", which then prompts some well-meaning soul to implement support for this obscure case, thereby adding all kinds of weird fluff to Phobos that really doesn't belong there.

We want to be able to say to such bug reports, "the spec says .empty can only be a method or a bool field, sorry we don't support stuff where .empty is a non-copyable wrapper object that uses alias this to implicitly convert to a value wrapper with an .opCast!bool that returns an immutable(bool) which can then be value-copied onto a bool".

Andrei has said many times that these kinds of obscure cases don't belong in Phobos. If some user wants static arrays to work with ranges, then just write `[]` and be done with it, instead of adding yet another useless feature to Phobos (which inevitably will cause some unexpected poor interaction with another obscure case, and we're stuck in the endless churn of accreting features in Phobos that make it harder to maintain yet do not actually make any progress in improving D code). If somebody wants .empty to be a wrapper struct that uses alias this and .opCast!bool to return an immutable(bool), just have them write a wrapper that uses a function .empty to return a bool.

The fact that user code ended up in such a tangled mess is a sign that something is wrong on *their* side; we should not be promoting bad code practices by supporting such monstrosities in Phobos; we should instead be triggering a compile error so that the user cleans up his act and writes better code.


T

-- 
Insanity is doing the same thing over and over again and expecting different results.