Thread overview
More range woes: std.array.save is invalid
Dec 20, 2012
H. S. Teoh
Dec 20, 2012
monarch_dodra
Dec 20, 2012
monarch_dodra
December 20, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8061

After being sidetracked by the stack-allocated buffer fiasco, I finally found the root problem in the above bug.

The problem is std.array.save:

@property T[] save(T)(T[] a) @safe pure nothrow
{
    return a;
}

This implementation is only correct for certain values of T. It is wrong when T is a forward range, for example, because if you save a T[] and iterate over its elements, it will consume the T's in the array, so that the original range has elements that are now consumed.

The correct implementation when T is a forward range is to return a copy of the array where the elements are obtained by calling T.save.

The problem is, now .save is a potentially very expensive operation.

At a deeper level, it can be argued that actually, std.array.save is not wrong, because you asked it to save the array, not the elements within the array. So in that sense, there's nothing wrong with the implementation.

What is wrong here is that many algorithms in std.range and std.algorithm wrongly assume that just because the outer range (that is, T[]) is a forward range, *and* the inner ranges (that is, T) are also forward ranges, that therefore T[].save will save the current position of the *range of ranges*.  Actually, the only thing that is guaranteed to be saved is the current position of the *containing range*, it does NOT save the position of the subranges at all (at least, not according to the current definition of .save, which is NOT transitive). This means that many wrapper ranges are broken, because they export a .save method that does not work correctly under certain circumstances.

The correct way to wrap a range of ranges as a forward range, is if the wrapping range's .save method *copies* the outer range and populates it with .save'd subranges. Otherwise, it must be treated only as an input range. However, since there's no general way to copy a range, all of these wrapper ranges cannot be more than *input* ranges.

That is, unless we redefine .save to be transitive. But then, it becomes just a variant of a general .dup function for ranges. And so we come back to a fundamental issue in D, that there's no generic way to duplicate a value. This is causing us a lot of grief in generic code (the whole transient .front issue is another instance of this problem, since transience wouldn't be a problem if there was a generic .dup method), and, as I see it, will continue to cause us more grief in the future, because there's no way you can assume that assigning anything to anything else will not magically become invalid when you subsequently perform a seemingly-unrelated operation. That makes writing generic code a veritable minefield.


T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL
December 20, 2012
On Thursday, 20 December 2012 at 07:05:08 UTC, H. S. Teoh wrote:
> The problem is std.array.save:
>
> @property T[] save(T)(T[] a) @safe pure nothrow
> {
>     return a;
> }
>
> This implementation is only correct for certain values of T. It is wrong
> when T is a forward range, for example, because if you save a T[] and
> iterate over its elements, it will consume the T's in the array, so that
> the original range has elements that are now consumed.
>
> T

I'd say that's the correct behavior: A range is just a way to iterate over contents, and "save" makes no promise (and *should* make no promise) to make copies of the range's elements.

If you modify one of the elements in your range, then you'd better hope that change is seen by all other ranges iterating over the same data.

What I'm basically saying is:
//----
auto a = [1, 2, 3]
auto b = a.save;
b[0] = 5;
assert(a[0] == 5); //Correct behavior
//----

The fact that you have ints or forwards ranges as elements is irrelevant.

---------------------------
BTW: Just the same way, "dup" will not transitively duplicate the contents of an array, or of std.container.array.

---------------------------
To get your described behavior, then you should wrap your array into your own range, and make the array's elements part of your iteration definition. This way you'll surprise no one.
December 20, 2012
On Thursday, 20 December 2012 at 07:05:08 UTC, H. S. Teoh wrote:
> What is wrong here is that many algorithms in std.range and
> std.algorithm wrongly assume that just because the outer range (that is,
> T[]) is a forward range, *and* the inner ranges (that is, T) are also
> forward ranges, that therefore T[].save will save the current position
> of the *range of ranges*.

Most probably, algorithm is still not quite great about using save. This is especially true in regards to RoR, but that's a special case, and we are fixing those one by one.
December 20, 2012
On 12/20/12 2:03 AM, H. S. Teoh wrote:
> http://d.puremagic.com/issues/show_bug.cgi?id=8061
>
> After being sidetracked by the stack-allocated buffer fiasco, I finally
> found the root problem in the above bug.
>
> The problem is std.array.save:
>
> @property T[] save(T)(T[] a) @safe pure nothrow
> {
>      return a;
> }
>
> This implementation is only correct for certain values of T. It is wrong
> when T is a forward range, for example, because if you save a T[] and
> iterate over its elements, it will consume the T's in the array, so that
> the original range has elements that are now consumed.
>
> The correct implementation when T is a forward range is to return a copy
> of the array where the elements are obtained by calling T.save.

I think you're overthinking this. Save saves the position in the current range without any promise about its contents.

Andrei