May 31, 2016
On Tuesday, 31 May 2016 at 18:11:34 UTC, Steven Schveighoffer wrote:
>> 1) Current definition of input range (most importantly, the fact `front`
>> has to be @property-like) implies `front` to always return the same
>> result until `popFront` is called.
>
> Regardless of property-like or not, this should be the case. Otherwise, popFront makes no sense.

Except it isn't in many cases you call "bugs" :(

>> 2) For ranges that call predicates on elements to evaluate next element
>> this can only be achieved by caching - predicates are never required to
>> be pure.
>
> Or, by not returning different things from your predicate.

It is perfectly legal for predicate to be non-pure and that would be hugely annoying if anyone decided to prohibit it. Also even pure predicates may be simply very expensive to evaluate which can make `front` a silent pessimization.

> This is like saying RedBlackTree is broken when I give it a predicate of "a == b".

RBL at least makes certain demands about valid predicate can be. This is not case for ranges in general.

>> 3) But caching is sub-optimal performance wise and thus bunch of Phobos
>> algorithms violate `front` consistency / cheapness expectation
>> evaluating predicates each time it is called (liked map).
>
> I don't think anything defensively caches front in case the next call to front is different, unless that's specifically the reason for the range.

And that makes input ranges violate implication #1 (front stability) casually to the point it can't be relied at all and one has to always make sure it is only evaluated once (make stack local copy or something like that).


> I think we should be aware that the range API doesn't prevent bugs of all kinds. There's only so much analysis the compiler can do.

This is a totally valid code I want to actually work and not be discarded as "bug".
May 31, 2016
On 31.05.2016 22:59, Dicebot wrote:
>
>
>
>> I think we should be aware that the range API doesn't prevent bugs of
>> all kinds. There's only so much analysis the compiler can do.
>
> This is a totally valid code I want to actually work and not be
> discarded as "bug".

map often allows random access. Do you suggest it should cache opIndex too?
May 31, 2016
On Tuesday, 31 May 2016 at 21:25:12 UTC, Timon Gehr wrote:
> On 31.05.2016 22:59, Dicebot wrote:
>>
>>
>>
>>> I think we should be aware that the range API doesn't prevent bugs of
>>> all kinds. There's only so much analysis the compiler can do.
>>
>> This is a totally valid code I want to actually work and not be
>> discarded as "bug".
>
> map often allows random access. Do you suggest it should cache opIndex too?

Random access map must store all already evaluated items in memory in mu opinion.
May 31, 2016
On 5/31/16 4:59 PM, Dicebot wrote:
> On Tuesday, 31 May 2016 at 18:11:34 UTC, Steven Schveighoffer wrote:
>>> 1) Current definition of input range (most importantly, the fact `front`
>>> has to be @property-like) implies `front` to always return the same
>>> result until `popFront` is called.
>>
>> Regardless of property-like or not, this should be the case.
>> Otherwise, popFront makes no sense.
>
> Except it isn't in many cases you call "bugs" :(

If you want to use such "ranges", the compiler will not stop you. Just don't expect any help from Phobos.

>>> 2) For ranges that call predicates on elements to evaluate next element
>>> this can only be achieved by caching - predicates are never required to
>>> be pure.
>>
>> Or, by not returning different things from your predicate.
>
> It is perfectly legal for predicate to be non-pure and that would be
> hugely annoying if anyone decided to prohibit it. Also even pure
> predicates may be simply very expensive to evaluate which can make
> `front` a silent pessimization.

There's no requirement or need to prevent it. Just a) don't do it, or b) deal with the consequences.

>
>> This is like saying RedBlackTree is broken when I give it a predicate
>> of "a == b".
>
> RBL at least makes certain demands about valid predicate can be. This is
> not case for ranges in general.

RedBlackTree with "a == b" will compile and operate. It just won't do much red-black-tree-like things.

>>> 3) But caching is sub-optimal performance wise and thus bunch of Phobos
>>> algorithms violate `front` consistency / cheapness expectation
>>> evaluating predicates each time it is called (liked map).
>>
>> I don't think anything defensively caches front in case the next call
>> to front is different, unless that's specifically the reason for the
>> range.
>
> And that makes input ranges violate implication #1 (front stability)
> casually to the point it can't be relied at all and one has to always
> make sure it is only evaluated once (make stack local copy or something
> like that).

That's a little much. If you expect such things, you can construct them. There's no way for the functions to ascertain what your lambda is going to do (halting problem) and determine to cache or not based on that.

>> I think we should be aware that the range API doesn't prevent bugs of
>> all kinds. There's only so much analysis the compiler can do.
>
> This is a totally valid code I want to actually work and not be
> discarded as "bug".

Then it's not a bug? It's going to work just fine how you specified it. I just don't consider it a valid "range" for general purposes.

You can do this if you want caching:

only(0).map!(x => uniform(0, 10)).cache

-Steve
June 01, 2016
On Tuesday, 31 May 2016 at 18:31:05 UTC, Steven Schveighoffer wrote:
> On 5/31/16 11:45 AM, Jonathan M Davis via Digitalmars-d wrote:
>> On Monday, May 30, 2016 09:57:29 H. S. Teoh via Digitalmars-d wrote:
>>> I'd argue that range-based generic code that assumes non-transience is
>>> inherently buggy, because generic code ought not to make any
>>> assumptions beyond what the range API guarantees. Currently, the range
>>> API does not guarantee non-transience, therefore code that assumes so is
>>> broken by definition.  Just because they happen to work most of the time
>>> does not change the fact that they're written wrongly.
>>
>> Technically, the range API doesn't even require that front return the same
>> value every time that it's called, because isInputRange can't possibly test
>> for it.
>
> The API doesn't require it mechanically, but the API does require it semantically (what does popFront mean if front changes automatically?). If front returns different things, I'd say that's a bug in your range construction.

The `Generator` range is an eager violator of this requirement:
https://github.com/dlang/phobos/blob/ca292ff78cd825f642eb58d586e2723ba14ae448/std/range/package.d#L3075-L3080

... although I'd agree that's an implementation error.
June 01, 2016
On Tuesday, 31 May 2016 at 12:42:23 UTC, Steven Schveighoffer wrote:

> There are 2 main issues with FILE *:
>
> 1) it does not provide buffer access, so you must rely on things like getline if they exist. But these have their own problems (i.e. do not support unicode, require C-malloc'd buffer)
>
You can cheat by using setvbuf() and imposing your own buffer to the FILE* routine. What and how the underlying implementation put in that buffer is of course not documented but not very difficult to guess (for example fseek()/fread() will always fill the buffer from 0 to the end (of buffer or file depending what come first).
June 01, 2016
On 6/1/16 8:49 AM, Joseph Rushton Wakeling wrote:
> On Tuesday, 31 May 2016 at 18:31:05 UTC, Steven Schveighoffer wrote:
>> On 5/31/16 11:45 AM, Jonathan M Davis via Digitalmars-d wrote:
>>> On Monday, May 30, 2016 09:57:29 H. S. Teoh via Digitalmars-d wrote:
>>>> I'd argue that range-based generic code that assumes non-transience is
>>>> inherently buggy, because generic code ought not to make any
>>>> assumptions beyond what the range API guarantees. Currently, the range
>>>> API does not guarantee non-transience, therefore code that assumes
>>>> so is
>>>> broken by definition.  Just because they happen to work most of the
>>>> time
>>>> does not change the fact that they're written wrongly.
>>>
>>> Technically, the range API doesn't even require that front return the
>>> same
>>> value every time that it's called, because isInputRange can't
>>> possibly test
>>> for it.
>>
>> The API doesn't require it mechanically, but the API does require it
>> semantically (what does popFront mean if front changes
>> automatically?). If front returns different things, I'd say that's a
>> bug in your range construction.
>
> The `Generator` range is an eager violator of this requirement:
> https://github.com/dlang/phobos/blob/ca292ff78cd825f642eb58d586e2723ba14ae448/std/range/package.d#L3075-L3080
>
>
> .... although I'd agree that's an implementation error.

Yeah, that seems like a bug.

-Steve
June 01, 2016
On 6/1/16 10:05 AM, Patrick Schluter wrote:
> On Tuesday, 31 May 2016 at 12:42:23 UTC, Steven Schveighoffer wrote:
>
>> There are 2 main issues with FILE *:
>>
>> 1) it does not provide buffer access, so you must rely on things like
>> getline if they exist. But these have their own problems (i.e. do not
>> support unicode, require C-malloc'd buffer)
>>
> You can cheat by using setvbuf() and imposing your own buffer to the
> FILE* routine. What and how the underlying implementation put in that
> buffer is of course not documented but not very difficult to guess (for
> example fseek()/fread() will always fill the buffer from 0 to the end
> (of buffer or file depending what come first).

But there is no mechanism to determine where the current file pointer is inside the buffer. One could use ftell vs. tell on the file descriptor, but that is going to perform quite poorly.

-Steve
June 02, 2016
On 6/1/16 8:37 PM, Steven Schveighoffer wrote:
> On 6/1/16 8:49 AM, Joseph Rushton Wakeling wrote:
>> The `Generator` range is an eager violator of this requirement:
>> https://github.com/dlang/phobos/blob/ca292ff78cd825f642eb58d586e2723ba14ae448/std/range/package.d#L3075-L3080
>>
>>
>>
>> .... although I'd agree that's an implementation error.
>
> Yeah, that seems like a bug.

I guess this controversy was hashed out when this function was added. So this is very much intentional:

https://github.com/dlang/phobos/pull/2606
https://github.com/dlang/phobos/pull/3002

I'll start a new thread to discuss it. I think it needs to be changed.

-Steve
June 03, 2016
On Wednesday, 1 June 2016 at 01:31:53 UTC, Steven Schveighoffer wrote:
> If you want to use such "ranges", the compiler will not stop you. Just don't expect any help from Phobos.

It only strengthens my opinion that Phobos is not a standard library I want. Really, many of those issue would have been solved if basic input range was defined as `empty` + `ElementType popFront()` instead. Some more - if algorithms didn't try to preserve original range kind unless they can do it with no overhead (i.e. arrray.map should not be a random access range out of the box).

>> This is a totally valid code I want to actually work and not be
>> discarded as "bug".
>
> Then it's not a bug? It's going to work just fine how you specified it. I just don't consider it a valid "range" for general purposes.
>
> You can do this if you want caching:
>
> only(0).map!(x => uniform(0, 10)).cache

Good advice. Don't want bugs with non-stable results and accidental double I/O in your long idiomatic range pipeline? Just put "cache" calls everywhere just to be safe, defensive programming for the win!

Existing situation is bug prone exactly because you have no idea if you need to cache or not unless you try out specific combination of algorithms / predicates / input and carefully check what it does.