range.save

Does anyone else think range.save is a hack? I often find myself forgetting to call range.save in my generic code with my unittests working fine. Also, working with a range of ranges may forget to call range.save.(Ex: [1,2,4].repeat)

On 20/11/15 10:30 AM, Freddy wrote: > Does anyone else think range.save is a hack? I often find myself > forgetting to call range.save in my generic code with my unittests > working fine. Also, working with a range of ranges may forget to call > range.save.(Ex: [1,2,4].repeat) Not really. If you forget it, then things won't work since it'll be empty ;)

On Friday, 20 November 2015 at 02:47:17 UTC, Rikki Cattermole wrote: > On 20/11/15 10:30 AM, Freddy wrote: >> Does anyone else think range.save is a hack? I often find myself >> forgetting to call range.save in my generic code with my unittests >> working fine. Also, working with a range of ranges may forget to call >> range.save.(Ex: [1,2,4].repeat) > > Not really. > If you forget it, then things won't work since it'll be empty ;) If range is class. For most of structs like Iota range.save returns just itself.

Am 19.11.2015 um 22:30 schrieb Freddy: > Does anyone else think range.save is a hack? I often find myself > forgetting to call range.save in my generic code with my unittests > working fine. Also, working with a range of ranges may forget to call > range.save.(Ex: [1,2,4].repeat) I think that .save is a serious design flaw in the range API and there must me countless theoretical or future bugs lurking in todays code bases, only working by chance due to undefined algorithm behavior. The problem is that all of the alternatives that were discussed all had different trade-offs and solving the .save issue would lead to other issues/limitations (or at least AFAIR there hasn't been a suggestion where that wouldn't be the case). And since we now already have an API, that wouldn't help anyway. Maybe we could inject code into existing ranges that, in debug mode, at least causes assertions to trigger whenever an operation is done on an outdated copy of an input range (or of a non '.save'd forward range). That could then be advertised as a standard pattern for user defined ranges ("mixin RangeSemantics;").

Am 20.11.2015 um 16:37 schrieb Andrei Alexandrescu: > On 11/20/2015 04:42 AM, Sönke Ludwig wrote: >> I think that .save is a serious design flaw in the range API > > How would you redo it? -- Andrei That's why I wrote that the alternatives had their own issues - I unfortunately don't have a solution to this either. Making usage errors fail loudly at runtime is the only one improvement I had in mind that wouldn't result in unacceptable code breakage. Now if we could exclude reference type ranges from the picture* and ignore the fact that this would break tons of code, I think making input ranges non-copyable and using the postblit constructor to do the job of save() would be the right approach. Recently this was mentioned somewhere and the counter argument was that this wouldn't work for wrapper ranges: struct FilterRange(R) { private R _input; this(R input) { _input = input; // error: R not copyable } // ... } But it does work! struct FilterRange(R) { private R _input; this(R input) { swap(_input, input); // fine! } // ... } Passing a range into such a wrapper range can still be done directly in case of rvalues: FilterRange!MyRange(MyRange(...)) L-values require an explicit "move" call (which is a good thing): MyRange r; auto fr = FilterRange(r.move); The lowering of "foreach" to "for" would also have to be adjusted accordingly. The main drawback of using postblit for forward ranges is that it might happen that it gets invoked accidentally, resulting in hidden performance issues (IMO better than correctness issues, but anyway). That could either be mitigated by performing substantial work lazily instead of directly in this(this), or by keeping support for save() in addition to this(this), so that a valid forward range could still disable this(this) if it's costly and expose the copy functionality through save() instead. * To still support reference type ranges, we could turn this around and define a copyref() method that creates a new range that references the same internal range object. The advantage is that this way around failure to call copyref() will result in an immediate error.

November 20, 2015

Re: range.save

Posted by Jonathan M Davis
in reply to Sönke Ludwig

Permalink

Jonathan M Davis

Posted in reply to Sönke Ludwig

Permalink

On Friday, 20 November 2015 at 16:35:01 UTC, Sönke Ludwig wrote:
> The lowering of "foreach" to "for" would also have to be adjusted accordingly.

I'm actually wondering if we should change it even to support the way that we currently do things. The problem is that the semantics of copying a range are undefined. They depend entirely on how the range is implemented (value types, reference types, and pseudo-reference types all act differently when copied). The result is that if you want generic range-based code to work with all range types, you can never use a range after it's been copied unless it's assigned a new value - and foreach does a copy!

foreach(e; range)
{
    // do stuff
}

->

for(auto __c = range; !__c.empty; __c.popFront())
{
    auto e = __c.front;
    // do stuff
}

So, in generic code, once you use foreach on a range, you can't use it again. Something like

foreach(e; range)
{
   if(cond)
       break;
}
// do something with range again

is inherently broken. You're pretty much forced to use a naked for loop, e.g.

for(; !range.empty; range.popFront())
{
    auto e = range.front;
    if(cond)
        break;
}
// do something with range

So, what we really want is probably something more like

foreach(e; range)
{
    // do stuff
}

->

for(; !range.empty; range.popFront())
{
    auto e = range.front;
    // do stuff
}

Unfortunately, that _does_ risk breaking some code with forward ranges - but only because they're being implicitly saved, which is arguably a bug. So, I'd be inclined to argue that the change should be made - if nothing else, because any attempts to break out of the loop are screwed with the currently lowering, and pure input ranges can't be saved to work around the problem, meaning that they're doubly screwed.

Now, that does have the downside of making foreach work differently for arrays, since they don't use the normal range lowering (rather, they're effectively saved), which means that we're still pretty much screwed...

So, maybe what we need to do is have the foreach lowering check for save and just call it if it exists so that pure input ranges get

for(; !range.empty; range.popFront())
{
    auto e = range.front;
    // do stuff
}

whereas forward ranges get

for(auto __c = range.save; !__c.empty; __c.popFront())
{
    auto e = __c.front;
    // do stuff
}

That's more complicated, but it avoids breaking code while still making it so that input ranges and foreach aren't quite so broken - though it means that input ranges and forward ranges act differently with foreach. But the alternative is basically tell folks that they need to call save explicitly when using foreach on forward ranges in generic code and to tell them that they should never use foreach on pure input ranges unless they intend to iterate over the whole range.

save tries to solve the copying problem with ranges, but the overall situation is _far_ worse than just trying to make it so that reference type ranges have a way to be copied properly.

- Jonathan M Davis

On Thursday, 19 November 2015 at 21:30:23 UTC, Freddy wrote: > ... Another problem I noticed with ranges is that all functionality is unionized. Ranges are expected to be able to popFront,Index,popBack, randomly possibly forcing ranges to carry unneeded buffers if indexing or popBack in never used. On possible solution is to have .retroRange and .indexRange On forward/input ranges.

On 11/23/15 7:10 PM, Freddy wrote: > On Thursday, 19 November 2015 at 21:30:23 UTC, Freddy wrote: >> ... > Another problem I noticed with ranges is that all functionality is > unionized. Ranges are expected to be able to popFront,Index,popBack, > randomly possibly forcing ranges to carry unneeded buffers if indexing > or popBack in never used. surely you mean opIndex? Note that ranges are required to implement front, popFront, and empty. That's it, then it is a range. Even save isn't required unless you want it to be a forward range. But yes, a fundamental requirement is to be able to get the front element repeatedly. This necessitates a buffer or "saving of state". -Steve

On Tuesday, 24 November 2015 at 01:03:36 UTC, Steven Schveighoffer wrote: > But yes, a fundamental requirement is to be able to get the front element repeatedly. This necessitates a buffer or "saving of state". Either that or the same operation/calculation is done every time you call front, and it results in the same value (e.g. the result of map calls its predicate every time that you cal front on it). In any case, regardless of how it's done, front needs to always return the same value until popFront is called, and how that's done can vary. - Jonathan M Davis

Forums