May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to ZombineDev | On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote: > On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: >> On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote: >>> On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote: >>>> On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: >>>>> So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? >>>> >>>> Honestly, I don't think that supporting transient ranges is worth it. >>> >>> I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`. >>> >>> `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value. >> >> I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. >> >> I believe this is biggest issue in D ranges design right now, by large margin. > > +1 > I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. Scratch that: > Another option is to make popFront return a new range, ala slice[1..$] (like std.range.dropOne) which would have the benefit of allowing const/immutable ranges to work. This won't work safely, because the compiler would need to disallow access to the previous instance of the range (sort of Rust moved-from objects), but it's currently no possible. I proposed that idea because I have other uses for immutable ranges, unrelated to this discussion. |
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to ZombineDev | On Sunday, 29 May 2016 at 11:36:37 UTC, ZombineDev wrote:
> On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote:
>> On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote:
>>> On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote:
>>>> On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote:
>>>>> On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote:
>>>>>> So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change?
>>>>>
>>>>> Honestly, I don't think that supporting transient ranges is worth it.
>>>>
>>>> I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`.
>>>>
>>>> `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value.
>>>
>>> I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once.
>>>
>>> I believe this is biggest issue in D ranges design right now, by large margin.
>>
>> +1
>> I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach.
>
> Scratch that:
>
>> Another option is to make popFront return a new range, ala slice[1..$] (like std.range.dropOne) which would have the benefit of allowing const/immutable ranges to work.
>
> This won't work safely, because the compiler would need to disallow access to the previous instance of the range (sort of Rust moved-from objects), but it's currently no possible.
>
> I proposed that idea because I have other uses for immutable ranges, unrelated to this discussion.
Isnt the idea to communicate "hey! You may not possibly cache whatever value you get in an iteration"? If yes, just removing .front seems odd as, well, I can still obviously just cache the result of .popFront if I'm inexperienced with ranges. Maybe instead simply rename .front to .buffer for transient ranges? That name would more accurately convey that this is just a buffer the range uses to dump the current front value in, but that if you want to cache it you will need to duplicate it (because, hey, this is the buffer of the range, not yours).
Not sure this is a great idea or a great name, but simply removing .front and returning the current iteration value from .popFront imho does not convey "you should not expect to be able to simply cache this by assignment" to me - arguably this is not a problem since how transient ranges work would be a documented thing, but thats just my 2 cents.
|
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to ZombineDev | On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote: > On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: >> On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote: >>> On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote: >>>> On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: >>>>> So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? >>>> >>>> Honestly, I don't think that supporting transient ranges is worth it. >>> >>> I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`. >>> >>> `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value. >> >> I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. >> >> I believe this is biggest issue in D ranges design right now, by large margin. > > +1 > I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. I don't think that should be a huge problem, but after having looked at the compiler code [1]: we should name it neither front nor popFront, because how would the compiler know that it is supposed to be transient and not a normal InputRange without front or popFront for which it should throw an error? Idea 1: New name that will make it easier to distinguish that transient ranges are something completly different to normal ranges. How about next? Problem 1: One can't use algorithms that work on transient ranges (map, reduce) anymore Idea 2: Help the compiler with @Transient or `enum transient = true` Problem 2: How would the "transientivity" be automatically forwarded to ranges that work on it. Btw thinking longer about it - transient ranges aren't bad per se. They objey the InputRange contract and e.g. the following works just fine. It's just impossible to distinguish between a transient and a non-transient InputRange. ``` // input: 1\n2\n\3\n4\n... void main() { import std.stdio, std.conv, std.algorithm; stdin .byLine .map!((a) => a.to!int) .sum .writeln; } ``` https://github.com/dlang/dmd/blob/master/src/statement.d#L2596 |
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to ZombineDev | On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote:
> On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote:
>> I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once.
>>
>> I believe this is biggest issue in D ranges design right now, by large margin.
>
> +1
> I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach.
I don't follow your reasoning here. In the proposal I put forward, if a range doesn't define `popFront()`, it's not an InputRange, it's a TransientRange.
Conversely, if it _does_ define `popFront()`, it _is_ an InputRange.
What's the problem with introspecting that?
|
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | On Sunday, 29 May 2016 at 15:45:14 UTC, Joseph Rushton Wakeling wrote:
> What's the problem with introspecting that?
There is none :)
it could be implemented today.
|
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | On Sunday, 29 May 2016 at 15:45:14 UTC, Joseph Rushton Wakeling wrote:
> On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote:
>> On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote:
>>> I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once.
>>>
>>> I believe this is biggest issue in D ranges design right now, by large margin.
>>
>> +1
>> I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach.
>
> I don't follow your reasoning here. In the proposal I put forward, if a range doesn't define `popFront()`, it's not an InputRange, it's a TransientRange.
>
> Conversely, if it _does_ define `popFront()`, it _is_ an InputRange.
>
> What's the problem with introspecting that?
Nothing. Just that it could lead to a lot of surprising mistakes, currently the following yields an error:
struct A
{
auto front(){ ...}
enum empty = false;
}
A as;
foreach (a; as) {} // ERROR: no method popFront found
with the proposed change it would compile.
|
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | On Sunday, 29 May 2016 at 15:45:14 UTC, Joseph Rushton Wakeling wrote:
> On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote:
>> On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote:
>>> I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once.
>>>
>>> I believe this is biggest issue in D ranges design right now, by large margin.
>>
>> +1
>> I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach.
>
> I don't follow your reasoning here. In the proposal I put forward, if a range doesn't define `popFront()`, it's not an InputRange, it's a TransientRange.
>
> Conversely, if it _does_ define `popFront()`, it _is_ an InputRange.
>
> What's the problem with introspecting that?
Yes it can be introspected in library, but it breaks the expectations of the unsuspecting user. For example:
auto r = getSomeRange();
assert (r.front == r.front);
So, under your suggestion, the above could fail just because an implementation detail was changed like modifying some input range to become a transient range and somehow the code still compiled.
Now I'm leaning more towards the buffer + popFront + empty trio, because it's harder to misuse.
An alternative would be tagging such ranges with an enum or UDA, but we would need to verify/change too many places in phobos for this to work.
|
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dicebot | On 5/29/16 7:15 AM, Dicebot wrote:
> On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote:
>> On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote:
>>> On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote:
>>>> So what about the convention to explicitely declare a `.transient`
>>>> enum member on a range, if the front element value can change?
>>>
>>> Honestly, I don't think that supporting transient ranges is worth it.
>>
>> I have personally wondered if there was a case for a TransientRange
>> concept where the only primitives defined are `empty` and `front`.
>>
>> `popFront()` is not defined because the whole point is that every single
>> call to `front` will produce a different value.
>
> I would prefer such ranges to not have `front` and return new item from
> `popFront` instead but yes, I would much prefer it to existing form,
> transient or not. It is impossible to correctly define input range
> without caching front which may not be always possible and may have
> negative performance impact. Because of that, a lot of Phobos ranges
> compromise `front` consistency in favor of speed up - which only seems
> to work because most algorithms need to access `front` once.
What problems are solvable only by not caching the front element? I can't think of any.
And there is no way to define "transient" ranges in a way other than explicitly declaring they are transient. There isn't anything inherent or introspectable about such ranges.
-Steve
|
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to ZombineDev | On 5/29/16 7:28 AM, ZombineDev wrote:
> On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote:
>> On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote:
>>> On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote:
>>>> On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote:
>>>>> So what about the convention to explicitely declare a `.transient`
>>>>> enum member on a range, if the front element value can change?
>>>>
>>>> Honestly, I don't think that supporting transient ranges is worth it.
>>>
>>> I have personally wondered if there was a case for a TransientRange
>>> concept where the only primitives defined are `empty` and `front`.
>>>
>>> `popFront()` is not defined because the whole point is that every
>>> single call to `front` will produce a different value.
>>
>> I would prefer such ranges to not have `front` and return new item
>> from `popFront` instead but yes, I would much prefer it to existing
>> form, transient or not. It is impossible to correctly define input
>> range without caching front which may not be always possible and may
>> have negative performance impact. Because of that, a lot of Phobos
>> ranges compromise `front` consistency in favor of speed up - which
>> only seems to work because most algorithms need to access `front` once.
>>
>> I believe this is biggest issue in D ranges design right now, by large
>> margin.
>
> +1
> I think making popFront return a value for transient ranges is a sound
> idea. It would allow to easily distinguish between InputRange and
> TransientRange with very simple CT introspection. The biggest blocker is
> to teach the compiler to recognize TransientRange types in foreach.
This doesn't help at all. I can still make a "transient" range with all three range primitives.
There seems to be a misunderstanding about what a transient range is.
byLine is a transient range that requires the front element be cacheable (I have to build the line somewhere, and reusing that buffer provides performance). Shoehorning into a popFront-only style "range" does not solve the problem. Not only that, but now I would have to cache BOTH the front element and the next one.
-Steve
|
May 29, 2016 Re: Transient ranges | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 5/27/16 9:48 PM, Jonathan M Davis via Digitalmars-d wrote: > On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: >> So what about the convention to explicitely declare a >> `.transient` enum member on a range, if the front element value >> can change? > > Honestly, I don't think that supporting transient ranges is worth it. Every > single range-based function would have to either test that the "transient" > enum wasn't there or take transient ranges into account, and realistically, > that isn't going to happen. For better or worse, we do have byLine in > std.stdio, which has a transient front, but aside from the performance > benefits, it's been a disaster. Wholly disagree. If we didn't cache the element, D would be a laughingstock of performance-minded tests. > It's way too error-prone. We now have > byLineCopy to combat that, but of course, byLine is the more obvious > function and thus more likely to be used (plus it's been around longer), so > a _lot_ of code is going to end up using it, and a good chunk of that code > really should be using byLineCopy. There's nothing actually wrong with using byLine, and copying on demand. Why such a negative connotation? > I'm of the opinion that if you want a transient front, you should just use > opApply and skip ranges entirely. So you want to make this code invalid? Why? foreach(i; map!(a => a.to!int)(stdin.byLine)) { // process each integer ... } You want to make me copy each line to a heap-allocated string so I can parse it?!! > Allowing for front to be transient - > whether you can check for it or not - simply is not worth the extra > complications. I'd love it if we deprecated byLine's range functions, and > made it use opApply instead and just declare transient ranges to be > completely unsupported. If you want to write your code to have a transient > front, you can obviously take that risk, but you're on your own. There is no way to disallow front from being transient. In fact, it should be assumed that it is the default unless it's wholly a value-type. -Steve |
Copyright © 1999-2021 by the D Language Foundation