range.save (page 2) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » range.save (page 2)

November 24, 2015

Posted by Freddy
in reply to Steven Schveighoffer

Freddy

Posted in reply to Steven Schveighoffer

On Tuesday, 24 November 2015 at 01:03:36 UTC, Steven Schveighoffer wrote:
> surely you mean opIndex? Note that ranges are required to implement front, popFront, and empty. That's it, then it is a range. Even save isn't required unless you want it to be a forward range.

I meant .indexableRange, random access may require extra buffers or stack space that take space even when random access isn't used, and I'm asking for optional members(.retroRange, .indexableRange)

> But yes, a fundamental requirement is to be able to get the front element repeatedly. This necessitates a buffer or "saving of state".

Not quite what I was thinking. I was saying that ranges that implement back,popBack may need to implement a backwards buffer along a forward buffer even if the backwards buffer is never used.

November 24, 2015

Posted by Steven Schveighoffer
in reply to Freddy

Steven Schveighoffer

Posted in reply to Freddy

On 11/23/15 8:38 PM, Freddy wrote:

> Not quite what I was thinking. I was saying that ranges that implement
> back,popBack may need to implement a backwards buffer along a forward
> buffer even if the backwards buffer is never used.

Maybe you are saying that if you want to implement indexing, you must also implement back and popBack? Note that if you don't implement something, it just doesn't get qualified as that type of range, so it's definitely possible to have indexing, but not back and popBack (although if range[$-1] is possible, then wouldn't that easily qualify as back?). So I don't really understand still.

I think your issue may be better described with an example.

-Steve

November 24, 2015

Posted by Freddy
in reply to Steven Schveighoffer

Freddy

Posted in reply to Steven Schveighoffer

On Tuesday, 24 November 2015 at 01:53:39 UTC, Steven Schveighoffer wrote:
> ...
I can't quite think of an example right now but there was a thread about this a few years ago. http://forum.dlang.org/thread/twnymbxfdmqupsfjfjul@forum.dlang.org

November 26, 2015

Posted by Andrei Alexandrescu
in reply to Sönke Ludwig

Andrei Alexandrescu

Posted in reply to Sönke Ludwig

On 11/20/2015 11:34 AM, Sönke Ludwig wrote:
> Am 20.11.2015 um 16:37 schrieb Andrei Alexandrescu:
>> On 11/20/2015 04:42 AM, Sönke Ludwig wrote:
>>> I think that .save is a serious design flaw in the range API
>>
>> How would you redo it? -- Andrei
>
> That's why I wrote that the alternatives had their own issues - I
> unfortunately don't have a solution to this either. Making usage errors
> fail loudly at runtime is the only one improvement I had in mind that
> wouldn't result in unacceptable code breakage.
>
> Now if we could exclude reference type ranges from the picture* and
> ignore the fact that this would break tons of code, I think making input
> ranges non-copyable and using the postblit constructor to do the job of
> save() would be the right approach.

(had this in my other's laptop outbox for a while)

The baseline was STL's input iterators, which were quite bad - there's no syntactic distinction between input and forward iterators, which makes the entire matter a convention. That's not the case for any other iterators - code using their capabilities won't compile with weaker iterators (nice).

So initially I thought a simple solution would be this:

struct MyForwardRange
{
    enum bool isForward = true;
    ... empty, front, popFront ...
}

enum isForwardRange(R) =
  is(typeof(R.isForward)) && R.isForward;

Then there's no need for .save(), and isForward!R can be used for constraints etc. Reference forward ranges are not supported but then that's liberating (fewer complications), rare, and easy to work around by wrapping.

Looking back, the only reason for which I didn't go that way was fear; I simply hadn't seen such an idiom and I was afraid it'd be too nonconformist. Introspection was new and back then quite arcane, and the idioms were unclear. So I went with a more conventional solution of adding one method.

Today that's the way I'd do it, and in fact it is the way the new collections will work exactly that way - if a type exposes isStdCollection then it commits to implementing the collection primitives as specified.

As of right now the situation with ranges is suboptimal - you need to use .save() but if you don't the sheer copy works most of the time, just not always. It may be nice to improve on that a bit, for example to require input ranges that are not forward ranges to indeed have reference semantics. Or require forward ranges to define .save() but only with the body { return this; }. Either of these would break code.

Andrei

November 26, 2015

Posted by Jonathan M Davis
in reply to Andrei Alexandrescu

Jonathan M Davis

Posted in reply to Andrei Alexandrescu

On Thursday, 26 November 2015 at 17:23:07 UTC, Andrei Alexandrescu wrote:
> As of right now the situation with ranges is suboptimal - you need to use .save() but if you don't the sheer copy works most of the time, just not always. It may be nice to improve on that a bit, for example to require input ranges that are not forward ranges to indeed have reference semantics. Or require forward ranges to define .save() but only with the body { return this; }. Either of these would break code.

As long as a range has semantics very similar to a dynamic array, it works great, but as soon as it doesn't, it tends to fall apart in subtle ways. The current situation is simply too error-prone and unwieldy IMHO. It works great in the common case but definitely falls apart at the edges. And even if you're diligent about getting it right, the number of calls to save that are required is ridiculous. So, while I completely agree that we want to avoid breaking code if we can, I'm also increasingly convinced that we should look at what would need to be done to make ranges clean to use without being error-prone and see if we can get there with minimal breaking changes. Regardless, if we _do_ make breaking changes with regards to ranges, we need to be very sure that we want to make them.

- Jonathan M Davis

November 27, 2015

Posted by Joseph Rushton Wakeling
in reply to Andrei Alexandrescu

Joseph Rushton Wakeling

Posted in reply to Andrei Alexandrescu

On Thursday, 26 November 2015 at 17:23:07 UTC, Andrei Alexandrescu wrote:
> So initially I thought a simple solution would be this:
>
> struct MyForwardRange
> {
>     enum bool isForward = true;
>     ... empty, front, popFront ...
> }
>
> enum isForwardRange(R) =
>   is(typeof(R.isForward)) && R.isForward;
>
> Then there's no need for .save(), and isForward!R can be used for constraints etc. Reference forward ranges are not supported but then that's liberating (fewer complications), rare, and easy to work around by wrapping.

By "reference forward range" do you mean a reference type (per se) or any range that contains an internal reference type (e.g. a range whose private data includes a dynamic array)?

November 27, 2015

Posted by Jonathan M Davis
in reply to Joseph Rushton Wakeling

Jonathan M Davis

Posted in reply to Joseph Rushton Wakeling

On Friday, 27 November 2015 at 08:53:26 UTC, Joseph Rushton Wakeling wrote:
> On Thursday, 26 November 2015 at 17:23:07 UTC, Andrei Alexandrescu wrote:
>> So initially I thought a simple solution would be this:
>>
>> struct MyForwardRange
>> {
>>     enum bool isForward = true;
>>     ... empty, front, popFront ...
>> }
>>
>> enum isForwardRange(R) =
>>   is(typeof(R.isForward)) && R.isForward;
>>
>> Then there's no need for .save(), and isForward!R can be used for constraints etc. Reference forward ranges are not supported but then that's liberating (fewer complications), rare, and easy to work around by wrapping.
>
> By "reference forward range" do you mean a reference type (per se) or any range that contains an internal reference type (e.g. a range whose private data includes a dynamic array)?

Obviously, Andrei will have to answer to know what he meant, but with regards to ranges, I would consider a reference type to be one where in

auto copy = range;

doing anything to copy or range does the exact same thing to the other, because they refer to the exact same state. Something like save is required to get a separate range where popping elements from one will not affect the other.

Contrast that with a value type where copy and range are completely separate. And then there are ranges where the copy shares _some_ state with the original but not all, which as far as save goes is pretty much the same as a reference type but means that you can't rely on mutating one having the same effect on the other one either (the easiest example would be a range that would otherwise be a reference type but caches front).

Based on that view of things, dynamic arrays definitely are value types. Obviously, if you start doing stuff that would mutate their elements, then they aren't really value types, but if all you're doing is iterating over them, then they're essentially value types.

What we lose without save (or something else to replace it) is the ability to have any range types that aren't value types (or which essentially behave like them). So, a range which is a dynamic array or a simple wrapper around a dynamic array is fine, because copying the range is enough, but anything where a copy is not the same as save would have been will no longer work. Using postblit constructors like Sonke suggested solves _some_ of that problem, but not all of it, since that doesn't work with classes, and const doesn't work with postblit constructors, whereas we could make it work with save. The loss of classes poses the problem that non-templated functions can't really be made to work with ranges. And the loss of reference type ranges in general definitely is a problem for some uses cases.

But the more I look at the semantics involved, the more inclined I am to argue that trying to have the same code work for both value types and reference types is questionable at best. On top of that, pure input ranges almost need to be reference types (though they can currently work as pseudo-reference types at least some of the time), and forward ranges really need to be value types (at least as much as dynamic arrays are anyway) if we don't want to have to call save everywhere.

I'm starting to think that it would be better to have pure input ranges have to be reference types and forward ranges have to be value types and then be very careful about which functions work with both rather than simply treating all forward ranges like pure input ranges that can also be copied via save.

- Jonathan M Davis

November 27, 2015

Posted by Joseph Rushton Wakeling
in reply to Jonathan M Davis

Joseph Rushton Wakeling

Posted in reply to Jonathan M Davis

On Friday, 27 November 2015 at 09:20:23 UTC, Jonathan M Davis wrote:
> Obviously, Andrei will have to answer to know what he meant, but with regards to ranges, I would consider a reference type to be one where in
>
> auto copy = range;
>
> doing anything to copy or range does the exact same thing to the other, because they refer to the exact same state. Something like save is required to get a separate range where popping elements from one will not affect the other.

Unfortunately it's a bit more complicated than that, because it's readily possible to have ranges where

    auto copy = range;

... will copy _some_ of the internals by value, and some by reference -- e.g. a range whose private data includes some integer values and a dynamic array.

That's not necessarily a problem if the reference-type data does not influence the range's behaviour (e.g. you're doing forward iteration over a container accessed by ref), but it's readily possible to imagine a range design where

    auto copy = range;
    copy.popFront();

... will affect range's state without updating it to the _same_ state as copy.

November 27, 2015

Posted by Jonathan M Davis
in reply to Joseph Rushton Wakeling

Jonathan M Davis

Posted in reply to Joseph Rushton Wakeling

On Friday, 27 November 2015 at 10:17:46 UTC, Joseph Rushton Wakeling wrote:
> On Friday, 27 November 2015 at 09:20:23 UTC, Jonathan M Davis wrote:
>> Obviously, Andrei will have to answer to know what he meant, but with regards to ranges, I would consider a reference type to be one where in
>>
>> auto copy = range;
>>
>> doing anything to copy or range does the exact same thing to the other, because they refer to the exact same state. Something like save is required to get a separate range where popping elements from one will not affect the other.
>
> Unfortunately it's a bit more complicated than that, because it's readily possible to have ranges where
>
>     auto copy = range;
>
> ... will copy _some_ of the internals by value, and some by reference -- e.g. a range whose private data includes some integer values and a dynamic array.
>
> That's not necessarily a problem if the reference-type data does not influence the range's behaviour (e.g. you're doing forward iteration over a container accessed by ref), but it's readily possible to imagine a range design where
>
>     auto copy = range;
>     copy.popFront();
>
> ... will affect range's state without updating it to the _same_ state as copy.

Yes. I discussed that in my post, though maybe I wasn't clear enough. But such a range requires save just as much as a full-on reference type does, so ultimately it's pretty much in the same boat as far as save goes. It's also a serious problem for pure input ranges to such ranges exist, since then even if you can assume that any range which can implement save does implement save, you still can't guarantee that an input range is a reference type, which becomes a serious problem when an input range is copied. e.g.

foreach(e; range)
{
    if(cond)
        break;
}
range.popFront();

could be fine if you can guarantee that the range is a reference type, but if you can't guarantee that (as is currently the case), then as soon as you use a pure input range with a foreach loop, you can't use it anymore, because it's copied as part of the lowering. To get around that, you can use a for loop directly, but the fact that pure input ranges aren't currently guaranteed to be reference types definitely makes it harder to write consistent, correct code with pure input ranges.

As it stands, forward ranges have the same problem, but by calling save, you can ensure that the copy is separate from the original and ensure that iterating the original after the loop is fine (though if you want reference semantics, you would still have to use a for loop), but pure input ranges don't have a way to ensure consistency like that other than maybe wrapping the range in another type which you can guarantee is a reference type.

Regardless, even if we require that pure input ranges be reference types and that forward ranges be value types (at least insomuch as copying them results in a separate range with the same state like with save), we then still have the problem that pure input ranges and forward ranges have different semantics. But as long as dynamic arrays are ranges, it's not like we could force all ranges to be reference types, and that wouldn't be good for efficiency anyway, since it would almost certainly mean more heap allocations just so that they can be reference types.

- Jonathan M Davis

November 27, 2015

Posted by Jonathan M Davis
in reply to Jonathan M Davis

Jonathan M Davis

Posted in reply to Jonathan M Davis

On Friday, 27 November 2015 at 09:20:23 UTC, Jonathan M Davis wrote:
> I'm starting to think that it would be better to have pure input ranges have to be reference types and forward ranges have to be value types and then be very careful about which functions work with both rather than simply treating all forward ranges like pure input ranges that can also be copied via save.

Another piece of this puzzle to consider is that unless a range is a value type (or at least acts like a value type as long as you don't mutate its elements) or has save called on it, then it fundamentally doesn't work with lazy ranges. So, at minimum, we need to consider making it so that lazy ranges require forward ranges (and then, assuming that we continue to have save, the lazy ranges need to always call save on the range that they're given).

- Jonathan M Davis

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation