Transient ranges (page 7) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Transient ranges (page 7)

June 03, 2016

Re: Transient ranges

Posted by Andrei Alexandrescu
in reply to Dicebot

Andrei Alexandrescu

Posted in reply to Dicebot

On 6/3/16 4:37 AM, Dicebot wrote:
> Really, many of those issue would have been solved if basic input range
> was defined as `empty` + `ElementType popFront()` instead.

FYI that was on the table, along with many other possible designs. (popFront() should be allowed to return by reference, too.) The problem with this one is that it is inefficient with arrays when the same element is accessed multiple times - the user is forced to create a copy or to use other contortions. Arrays are a prime use case for the range interface. -- Andrei

June 03, 2016

Re: Transient ranges

Posted by Dicebot
in reply to Andrei Alexandrescu

Dicebot

Posted in reply to Andrei Alexandrescu

On 06/03/2016 01:49 PM, Andrei Alexandrescu wrote:
> On 6/3/16 4:37 AM, Dicebot wrote:
>> Really, many of those issue would have been solved if basic input range was defined as `empty` + `ElementType popFront()` instead.
> 
> FYI that was on the table, along with many other possible designs. (popFront() should be allowed to return by reference, too.) The problem with this one is that it is inefficient with arrays when the same element is accessed multiple times - the user is forced to create a copy or to use other contortions. Arrays are a prime use case for the range interface. -- Andrei

Has it actually been measured as inefficient? Ranges are very
inline/optimization friendly because of the templated nature of
algorithms - I would expect at least GDC/LDC to take advantage of that
for simple case and for more complicated ones mutiple evaluations of
`front` predicates will dominate performance gain from direct array access.

I guess I need to benchmark it myself and see :)

June 03, 2016

Re: Transient ranges

Posted by Steven Schveighoffer
in reply to Dicebot

Steven Schveighoffer

Posted in reply to Dicebot

On 6/3/16 4:37 AM, Dicebot wrote:
> On Wednesday, 1 June 2016 at 01:31:53 UTC, Steven Schveighoffer wrote:
>> If you want to use such "ranges", the compiler will not stop you. Just
>> don't expect any help from Phobos.
>
> It only strengthens my opinion that Phobos is not a standard library I
> want. Really, many of those issue would have been solved if basic input
> range was defined as `empty` + `ElementType popFront()` instead.

This doesn't solve the problem. empty may not be knowable until you try to fetch the next element (for instance, i/o). A better interface would be bool getNext(ref ElementType), or Nullable!ElementType getNext(), or tuple!(bool, "eof", ElementType, "value") getNext().

I've said many times, i/o is not conducive to ranges. You have to shoehorn it, which is fine if you need compatibility, but it shouldn't be a mechanism of low-level implementation. And you should expect some cruft here because of that.

> Some
> more - if algorithms didn't try to preserve original range kind unless
> they can do it with no overhead (i.e. arrray.map should not be a random
> access range out of the box).

Yes, I can see good reason why you would want this. Hm... a set of adapters which reduce a range to its lesser API would be useful here:

array.asInput.map

But an expectation for map should be that you want it to be exactly the same as the original, but with a transformation applied to each fetch of an element. I think it needs to provide random access if random access is provided to it.

>> Then it's not a bug? It's going to work just fine how you specified
>> it. I just don't consider it a valid "range" for general purposes.
>>
>> You can do this if you want caching:
>>
>> only(0).map!(x => uniform(0, 10)).cache
>
> Good advice. Don't want bugs with non-stable results and accidental
> double I/O in your long idiomatic range pipeline? Just put "cache" calls
> everywhere just to be safe, defensive programming for the win!

Strawman, not what I said.

> Existing situation is bug prone exactly because you have no idea if you
> need to cache or not unless you try out specific combination of
> algorithms / predicates / input and carefully check what it does.

Again, yes, it's still possible to write code with bugs. The compiler or range library can only hold your hand so much.

I don't know of a library that's impossible to use incorrectly. Especially ones that let you execute arbitrary code internally.

-Steve

June 12, 2016

Re: Transient ranges

Posted by Dicebot
in reply to Steven Schveighoffer

Dicebot

Posted in reply to Steven Schveighoffer

On Friday, 3 June 2016 at 12:03:26 UTC, Steven Schveighoffer wrote:
>> It only strengthens my opinion that Phobos is not a standard library I
>> want. Really, many of those issue would have been solved if basic input
>> range was defined as `empty` + `ElementType popFront()` instead.
>
> This doesn't solve the problem. empty may not be knowable until you try to fetch the next element (for instance, i/o). A better interface would be bool getNext(ref ElementType), or Nullable!ElementType getNext(), or tuple!(bool, "eof", ElementType, "value") getNext().

Not necessarily, one can simply define that range processing requires checking `empty` both before and after getting next element (and that returned value is undefined if empty == true).

Though I do agree that returning `Optional!ElementType` would be even better.

> Yes, I can see good reason why you would want this. Hm... a set of adapters which reduce a range to its lesser API would be useful here:
>
> array.asInput.map
>
> But an expectation for map should be that you want it to be exactly the same as the original, but with a transformation applied to each fetch of an element. I think it needs to provide random access if random access is provided to it.

That is matter of design philosophy. For me such basic library primitives warrant C++ attitude of "don't pay for what you don't ask for" - and everything else, including actual feature completeness, is of negligible importance compared to that. If keeping random access for map requires either caching or multiple evaluations of predicate, I want it to be banned by default and enabled explicitly (not sure what could be good API for that though).

Phobos indeed doesn't seem to make such priorities right now and that is one of reasons I am growing increasingly unhappy with it.

>>> Then it's not a bug? It's going to work just fine how you specified
>>> it. I just don't consider it a valid "range" for general purposes.
>>>
>>> You can do this if you want caching:
>>>
>>> only(0).map!(x => uniform(0, 10)).cache
>>
>> Good advice. Don't want bugs with non-stable results and accidental
>> double I/O in your long idiomatic range pipeline? Just put "cache" calls
>> everywhere just to be safe, defensive programming for the win!
>
> Strawman, not what I said.

But that is what your answer meant in context of my original question :) My complaint was about existing semantics are being error-prone and hard to spot - you proposed it by adding more _manual_ fixup, which kills the whole point.

June 13, 2016

Re: Transient ranges

Posted by Steven Schveighoffer
in reply to Dicebot

Steven Schveighoffer

Posted in reply to Dicebot

On 6/12/16 4:46 AM, Dicebot wrote:
> On Friday, 3 June 2016 at 12:03:26 UTC, Steven Schveighoffer wrote:

>> Yes, I can see good reason why you would want this. Hm... a set of
>> adapters which reduce a range to its lesser API would be useful here:
>>
>> array.asInput.map
>>
>> But an expectation for map should be that you want it to be exactly
>> the same as the original, but with a transformation applied to each
>> fetch of an element. I think it needs to provide random access if
>> random access is provided to it.
>
> That is matter of design philosophy. For me such basic library
> primitives warrant C++ attitude of "don't pay for what you don't ask
> for" - and everything else, including actual feature completeness, is of
> negligible importance compared to that. If keeping random access for map
> requires either caching or multiple evaluations of predicate, I want it
> to be banned by default and enabled explicitly (not sure what could be
> good API for that though).

So what it seems like you want is somerange.map(...).front to evaluate the lambda only once, without caching. This doesn't fit into the definition of ranges, which require the ability to access the same front more than once.

What you really want is a new kind of construct. It's not a range.

Given that front must be accessible more than once, map has to make a choice, caching or reevaluating. Not doing one of those two things is not an option for ranges.

Note that this has nothing to do with random access.

> Phobos indeed doesn't seem to make such priorities right now and that is
> one of reasons I am growing increasingly unhappy with it.

It's not so much that Phobos attempts to give you kitchen sink support at all costs, it's that if the sink being passed in has all the faucets, it'll forward them for free if possible.

>>>> Then it's not a bug? It's going to work just fine how you specified
>>>> it. I just don't consider it a valid "range" for general purposes.
>>>>
>>>> You can do this if you want caching:
>>>>
>>>> only(0).map!(x => uniform(0, 10)).cache
>>>
>>> Good advice. Don't want bugs with non-stable results and accidental
>>> double I/O in your long idiomatic range pipeline? Just put "cache" calls
>>> everywhere just to be safe, defensive programming for the win!
>>
>> Strawman, not what I said.
>
> But that is what your answer meant in context of my original question :)
> My complaint was about existing semantics are being error-prone and hard
> to spot - you proposed it by adding more _manual_ fixup, which kills the
> whole point.

No, it's not. My advice is to understand the limitations and expectations of the range wrappers you are using (i.e. read the docs [1]). If you need caching for your purposes, then do that by plopping cache at the end of your use case. The user must have to understand that providing a mapping function that does random things (or even varying things on each call) is going to result in unexpected behavior. The compiler cannot foresee the purpose of your lambda code.

You just said only pay for what you ask for. I don't want to pay for caching for map if I don't need it. I'd be pissed if map cached for something like map!(x => x * 2). If you want caching, then ask for it.

There was a talk by Herb Sutter on atomic operations on various platforms. His general rule was, as long as you don't write race conditions, and use only provided atomics, your code will do what you think. We can apply a similar rule here: as long as you write a lambda which for a given input will provide the same result, map will do what you expect. When you try odd things like your random call, we don't guarantee things will work like you expect. You may utilize the knowledge of how map works, and how the things that are using the map work to write something clever that breaks the rules. But no guarantees.

-Steve

[1] I admit, the map docs aren't explicit about this, and should be.

June 13, 2016

Re: Transient ranges

Posted by John Colvin
in reply to Steven Schveighoffer

John Colvin

Posted in reply to Steven Schveighoffer

On Monday, 13 June 2016 at 13:59:06 UTC, Steven Schveighoffer wrote:
> No, it's not. My advice is to understand the limitations and expectations of the range wrappers you are using (i.e. read the docs [1]). If you need caching for your purposes, then do that by plopping cache at the end of your use case. The user must have to understand that providing a mapping function that does random things (or even varying things on each call) is going to result in unexpected behavior. The compiler cannot foresee the purpose of your lambda code.
>
> You just said only pay for what you ask for. I don't want to pay for caching for map if I don't need it. I'd be pissed if map cached for something like map!(x => x * 2). If you want caching, then ask for it.
>
> There was a talk by Herb Sutter on atomic operations on various platforms. His general rule was, as long as you don't write race conditions, and use only provided atomics, your code will do what you think. We can apply a similar rule here: as long as you write a lambda which for a given input will provide the same result, map will do what you expect. When you try odd things like your random call, we don't guarantee things will work like you expect. You may utilize the knowledge of how map works, and how the things that are using the map work to write something clever that breaks the rules. But no guarantees.
>
> -Steve
>
> [1] I admit, the map docs aren't explicit about this, and should be.

I see three levels of function used with map:

pure: everything is as expected, consider caching if expensive.

idempotent side-effects: remember map is lazy and you'll be ok.

non-idempotent side-effects: probably not what you want.

June 13, 2016

Re: Transient ranges

Posted by Andrei Alexandrescu
in reply to Dicebot

Andrei Alexandrescu

Posted in reply to Dicebot

On 06/12/2016 04:46 AM, Dicebot wrote:
> That is matter of design philosophy. For me such basic library
> primitives warrant C++ attitude of "don't pay for what you don't ask
> for" - and everything else, including actual feature completeness, is of
> negligible importance compared to that.

C++'s input iterator model also predicates multiple accesses to the current value by means of *it.

> If keeping random access for map requires either caching or multiple
> evaluations of predicate, I want it to be banned by default and
> enabled explicitly (not sure what could be good API for that though).

In the initial implementation, map did cache the current element. That made some folks unhappy, so it was removed. In lieu, a function "cache" was introduced. Apparently this setup makes other folks unhappy.

It seems to me, sometimes if I went by what this forum agrees upon I'd write one thousand lines of code one day and remove it the next.

> Phobos indeed doesn't seem to make such priorities right now and that is
> one of reasons I am growing increasingly unhappy with it.

What steps do you think we could take with Phobos to make it better without breaking backward compatibility?

Andrei

June 14, 2016

Re: Transient ranges

Posted by Dicebot
in reply to Steven Schveighoffer

Dicebot

Posted in reply to Steven Schveighoffer

On Monday, 13 June 2016 at 13:59:06 UTC, Steven Schveighoffer wrote:
> So what it seems like you want is somerange.map(...).front to evaluate the lambda only once, without caching. This doesn't fit into the definition of ranges, which require the ability to access the same front more than once.
>
> What you really want is a new kind of construct. It's not a range.

I think you are completely right here, though it is not very comforting :) Indeed, looking now through all uses of ranges in my code, it has always been same specific subset of their functionality - input range based pipeline. Chunk input, build up filters/transformations, write output. Never random access or even `.save`, and each original input chunk is supposed to be processed only once.

Every of my complaints is simply a natural outcome of this usage pattern - I'd really prefer to have something more stupid but also more fool-proof.

June 14, 2016

Re: Transient ranges

Posted by Dicebot
in reply to Andrei Alexandrescu

Dicebot

Posted in reply to Andrei Alexandrescu

On Tuesday, 14 June 2016 at 00:19:54 UTC, Andrei Alexandrescu wrote:
> On 06/12/2016 04:46 AM, Dicebot wrote:
>> That is matter of design philosophy. For me such basic library
>> primitives warrant C++ attitude of "don't pay for what you don't ask
>> for" - and everything else, including actual feature completeness, is of
>> negligible importance compared to that.
>
> C++'s input iterator model also predicates multiple accesses to the current value by means of *it.

Yeah, I don't say C++ iterator model is any better for that kind of design - sorry if my wording implied that. I wonder how they would address it if pipeline approach would ever become popular in C++.

See my response to Steven - probably my main issue is wanting some very different concept, dedicated stream/pipeline abstraction that would still be usable with all the same std.algorithm

> It seems to me, sometimes if I went by what this forum agrees upon I'd write one thousand lines of code one day and remove it the next.

Please don't take my rants in the NG as a serious call to action. I am simply sharing my concerns in a relevant topic - if there was anything pragmatical I could propose, I'd take it to the mail list instead or even right a DIP. To be honest, I'd even prefer if you ignored _any_ request/proposal which is not submitted as such. NG can't be taken as a base for reasoning.

>> Phobos indeed doesn't seem to make such priorities right now and that is
>> one of reasons I am growing increasingly unhappy with it.
>
> What steps do you think we could take with Phobos to make it better without breaking backward compatibility?

Introducing `std2` package is technically not breaking backwards compatibility but I presume this is not what you have meant ;) I don't know - as I have already said, otherwise I'd actually taken some more constructive actions instead of throwing complaints.

If I'd have a say in designing brand new Phobos, it would be very different - async by default, with a strict separation of lightweight @nogc algorithm and API core (fitting even for embedded env) and all the feature rich additional modules built on top. This is not kind of stuff one simply "fixes" in existing mature library.

At the same time, I am not even sure if it is even important to fix. It is good enough for any kind of general purpose development, good for productivity. And bigger projects with hard performance concerns and maintenance costs tend to grow their own "standard" libraries anyway.

July 06, 2016

Re: Transient ranges

Posted by Andrei Alexandrescu
in reply to John Colvin

Andrei Alexandrescu

Posted in reply to John Colvin

Related: just made a few comments to https://github.com/dlang/phobos/pull/4511. More work on range formalization is welcome. Thanks! -- Andrei

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation