Jump to page: 1 2
Thread overview
Re: More range woes: std.array.save is invalid
Dec 20, 2012
H. S. Teoh
Dec 20, 2012
Jonathan M Davis
Dec 20, 2012
H. S. Teoh
Dec 20, 2012
monarch_dodra
Dec 20, 2012
H. S. Teoh
Dec 20, 2012
monarch_dodra
Dec 20, 2012
H. S. Teoh
Dec 20, 2012
Jonathan M Davis
Dec 20, 2012
H. S. Teoh
December 20, 2012
On Thu, Dec 20, 2012 at 02:36:18AM -0500, Andrei Alexandrescu wrote:
> On 12/20/12 2:03 AM, H. S. Teoh wrote:
> >http://d.puremagic.com/issues/show_bug.cgi?id=8061
> >
> >After being sidetracked by the stack-allocated buffer fiasco, I finally found the root problem in the above bug.
> >
> >The problem is std.array.save:
> >
> >@property T[] save(T)(T[] a) @safe pure nothrow
> >{
> >     return a;
> >}
> >
> >This implementation is only correct for certain values of T. It is wrong when T is a forward range, for example, because if you save a T[] and iterate over its elements, it will consume the T's in the array, so that the original range has elements that are now consumed.
> >
> >The correct implementation when T is a forward range is to return a copy of the array where the elements are obtained by calling T.save.
> 
> I think you're overthinking this. Save saves the position in the current range without any promise about its contents.
[...]

Yes, which means many current algorithms that take a range of ranges and returns a wrapper range are wrong, because they assume that the wrapper range can be a forward range when both the container and the subranges are forward ranges, but this is not a sufficient condition in the general case.


T

-- 
Doubt is a self-fulfilling prophecy.
December 20, 2012
On Wednesday, December 19, 2012 23:44:12 H. S. Teoh wrote:
> Yes, which means many current algorithms that take a range of ranges and returns a wrapper range are wrong, because they assume that the wrapper range can be a forward range when both the container and the subranges are forward ranges, but this is not a sufficient condition in the general case.

Then they have bugs in their implementation which need to be fixed. There's nothing wrong with std.array.save. It's doing the right thing.

- Jonathan M Davis
December 20, 2012
On Thu, Dec 20, 2012 at 12:08:24AM -0800, Jonathan M Davis wrote:
> On Wednesday, December 19, 2012 23:44:12 H. S. Teoh wrote:
> > Yes, which means many current algorithms that take a range of ranges and returns a wrapper range are wrong, because they assume that the wrapper range can be a forward range when both the container and the subranges are forward ranges, but this is not a sufficient condition in the general case.
> 
> Then they have bugs in their implementation which need to be fixed. There's nothing wrong with std.array.save. It's doing the right thing.
[...]

OK, so the subject line is misleading.

But the bigger issue is that wrappers of forward ranges of forward ranges can only be input ranges, in spite of the fact that, in theory, you can save each component of the given range of ranges.


T

-- 
Nobody is perfect.  I am Nobody. -- pepoluan, GKC forum
December 20, 2012
On Thursday, 20 December 2012 at 15:25:27 UTC, H. S. Teoh wrote:
> On Thu, Dec 20, 2012 at 12:08:24AM -0800, Jonathan M Davis wrote:
>> On Wednesday, December 19, 2012 23:44:12 H. S. Teoh wrote:
>> > Yes, which means many current algorithms that take a range of ranges
>> > and returns a wrapper range are wrong, because they assume that the
>> > wrapper range can be a forward range when both the container and the
>> > subranges are forward ranges, but this is not a sufficient condition
>> > in the general case.
>> 
>> Then they have bugs in their implementation which need to be fixed.
>> There's nothing wrong with std.array.save. It's doing the right thing.
> [...]
>
> OK, so the subject line is misleading.
>
> But the bigger issue is that wrappers of forward ranges of forward
> ranges can only be input ranges, in spite of the fact that, in theory,
> you can save each component of the given range of ranges.
>
>
> T

Why is that?

If the wrapper is *designed* as being a RoR, then the "elements"
of the range are defined as being part of the iteration scheme.
Once you have defined it that way, you can have the wrapper save
the top Range, as well as each sub-range individually, and then
the wrapper is Forward... No?
December 20, 2012
On Thu, Dec 20, 2012 at 04:47:10PM +0100, monarch_dodra wrote:
> On Thursday, 20 December 2012 at 15:25:27 UTC, H. S. Teoh wrote:
> >On Thu, Dec 20, 2012 at 12:08:24AM -0800, Jonathan M Davis wrote:
> >>On Wednesday, December 19, 2012 23:44:12 H. S. Teoh wrote:
> >>> Yes, which means many current algorithms that take a range of ranges and returns a wrapper range are wrong, because they assume that the wrapper range can be a forward range when both the container and the subranges are forward ranges, but this is not a sufficient condition in the general case.
> >>
> >>Then they have bugs in their implementation which need to be fixed. There's nothing wrong with std.array.save. It's doing the right thing.
> >[...]
> >
> >OK, so the subject line is misleading.
> >
> >But the bigger issue is that wrappers of forward ranges of forward ranges can only be input ranges, in spite of the fact that, in theory, you can save each component of the given range of ranges.
[...]
> Why is that?

Because the definition of .save only requires that the state of the outer range is saved. Nothing is guaranteed about the state of the inner ranges.


> If the wrapper is *designed* as being a RoR, then the "elements" of the range are defined as being part of the iteration scheme.  Once you have defined it that way, you can have the wrapper save the top Range, as well as each sub-range individually, and then the wrapper is Forward... No?

In *theory*, yes, you can do this. But currently, it's not possible to do this generically, because .save has to return the same type as the outer range. The outer range's .save method does NOT call .save on the inner ranges (e.g., an array of forward ranges: std.array.save simply returns a shallow copy of the original array). So the inner ranges will be invalidated by iterating either the .save'd range, or the original range.

You have to return an outer range in which all of its elements have been replaced with .save'd copies. But there is no method in the outer range that does this, so there is no way to do this in generic code.

*If* .save is allowed to return a different type, then yes, you can return a wrapper of the outer range that lets you iterate through .save'd copies of the inner ranges. But AFAIK this isn't allowed by the current definition of .save.


T

-- 
Why waste time learning, when ignorance is instantaneous? -- Hobbes, from Calvin & Hobbes
December 20, 2012
On Thursday, 20 December 2012 at 16:04:43 UTC, H. S. Teoh wrote:
> On Thu, Dec 20, 2012 at 04:47:10PM +0100, monarch_dodra wrote:
> [...]
>> Why is that?
>
> Because the definition of .save only requires that the state of the
> outer range is saved. Nothing is guaranteed about the state of the inner
> ranges.
>
>
>> If the wrapper is *designed* as being a RoR, then the "elements" of
>> the range are defined as being part of the iteration scheme.  Once you
>> have defined it that way, you can have the wrapper save the top Range,
>> as well as each sub-range individually, and then the wrapper is
>> Forward... No?
>
> In *theory*, yes, you can do this. But currently, it's not possible to
> do this generically, because .save has to return the same type as the
> outer range. The outer range's .save method does NOT call .save on the
> inner ranges (e.g., an array of forward ranges: std.array.save simply
> returns a shallow copy of the original array). So the inner ranges will
> be invalidated by iterating either the .save'd range, or the original
> range.
>
> You have to return an outer range in which all of its elements have been
> replaced with .save'd copies. But there is no method in the outer range
> that does this, so there is no way to do this in generic code.
>
> *If* .save is allowed to return a different type, then yes, you can
> return a wrapper of the outer range that lets you iterate through
> .save'd copies of the inner ranges. But AFAIK this isn't allowed by the
> current definition of .save.
>
>
> T

Yeah..., I think I get it now. Thanks for insisting.

While you _can_ save the outer range, if you try to replace the individual elements of the saved range with saved copies, at the end of the day, both your outer ranges will still reference the same ranges, saved or not. Ok. Now we can move forward.

One of the solutions I see would be that the RoR wrapper (we're talking about Joiner, right?), never mutate the elements of the RoR directly, but first store a *saved* copy of the element into a _current.

This way, you only need to save the *outer* range, but not *ALL* of the inner ranges individually. This is good because it keeps things in check performance wise.

I commented on your fix to joiner:
https://github.com/D-Programming-Language/phobos/pull/987/files#r2478956
This may solve the problem actually.
December 20, 2012
On Thu, Dec 20, 2012 at 05:30:39PM +0100, monarch_dodra wrote:
> On Thursday, 20 December 2012 at 16:04:43 UTC, H. S. Teoh wrote:
> >On Thu, Dec 20, 2012 at 04:47:10PM +0100, monarch_dodra wrote: [...]
> >>Why is that?
> >
> >Because the definition of .save only requires that the state of the outer range is saved. Nothing is guaranteed about the state of the inner ranges.
> >
> >
> >>If the wrapper is *designed* as being a RoR, then the "elements" of the range are defined as being part of the iteration scheme.  Once you have defined it that way, you can have the wrapper save the top Range, as well as each sub-range individually, and then the wrapper is Forward... No?
> >
> >In *theory*, yes, you can do this. But currently, it's not possible to do this generically, because .save has to return the same type as the outer range. The outer range's .save method does NOT call .save on the inner ranges (e.g., an array of forward ranges: std.array.save simply returns a shallow copy of the original array). So the inner ranges will be invalidated by iterating either the .save'd range, or the original range.
> >
> >You have to return an outer range in which all of its elements have been replaced with .save'd copies. But there is no method in the outer range that does this, so there is no way to do this in generic code.
[...]
> Yeah..., I think I get it now. Thanks for insisting.
> 
> While you _can_ save the outer range, if you try to replace the individual elements of the saved range with saved copies, at the end of the day, both your outer ranges will still reference the same ranges, saved or not. Ok. Now we can move forward.
> 
> One of the solutions I see would be that the RoR wrapper (we're talking about Joiner, right?), never mutate the elements of the RoR directly, but first store a *saved* copy of the element into a _current.
> 
> This way, you only need to save the *outer* range, but not *ALL* of the inner ranges individually. This is good because it keeps things in check performance wise.
> 
> I commented on your fix to joiner: https://github.com/D-Programming-Language/phobos/pull/987/files#r2478956 This may solve the problem actually.

You're right, if joiner only iterates through .save'd copies of the subranges, then there will be no problem. Any external modifications is not our responsibility, because as long as you only iterate over the joined range (without mutation), it will not cause unexpected modifications of the original range.


T

-- 
One reason that few people are aware there are programs running the internet is that they never crash in any significant way: the free software underlying the internet is reliable to the point of invisibility. -- Glyn Moody, from the article "Giving it all away"
December 20, 2012
On Thursday, December 20, 2012 08:03:04 H. S. Teoh wrote:
> Because the definition of .save only requires that the state of the outer range is saved. Nothing is guaranteed about the state of the inner ranges.

That just means that that anything involving ranges of ranges needs to be written with the understanding that saving the outer range doesn't save the inner ones. So, we may have bugs with regards to this (similar to how we have bugs related to not calling save when we should), but there's still no problem with save itself - only with how it's used.

So much of this would have been easier though if reference type ranges had been disallowed. Too late now though. We'll just have to fix such bugs as we find them and strengthen the unit tests so that they get caught like they should be.

- Jonathan M Davis
December 20, 2012
On 12/20/12 12:48 PM, Jonathan M Davis wrote:
> On Thursday, December 20, 2012 08:03:04 H. S. Teoh wrote:
>> Because the definition of .save only requires that the state of the
>> outer range is saved. Nothing is guaranteed about the state of the inner
>> ranges.
>
> That just means that that anything involving ranges of ranges needs to be
> written with the understanding that saving the outer range doesn't save the
> inner ones. So, we may have bugs with regards to this (similar to how we have
> bugs related to not calling save when we should), but there's still no problem
> with save itself - only with how it's used.
>
> So much of this would have been easier though if reference type ranges had
> been disallowed. Too late now though. We'll just have to fix such bugs as we
> find them and strengthen the unit tests so that they get caught like they
> should be.

I think this whole issue is akin to the "transitory" discussion of a while ago.

Andrei


December 20, 2012
On Thu, Dec 20, 2012 at 01:47:16PM -0500, Andrei Alexandrescu wrote:
> On 12/20/12 12:48 PM, Jonathan M Davis wrote:
> >On Thursday, December 20, 2012 08:03:04 H. S. Teoh wrote:
> >>Because the definition of .save only requires that the state of the outer range is saved. Nothing is guaranteed about the state of the inner ranges.
> >
> >That just means that that anything involving ranges of ranges needs to be written with the understanding that saving the outer range doesn't save the inner ones. So, we may have bugs with regards to this (similar to how we have bugs related to not calling save when we should), but there's still no problem with save itself - only with how it's used.

Yes, I just submitted a pull request for joiner (the no-separator variant), and am working on another pull for the with-separator variant of joiner.

We will also need to review all algorithms/wrapper ranges that take a range-of-range parameter for correctness.


> >So much of this would have been easier though if reference type ranges had been disallowed. Too late now though. We'll just have to fix such bugs as we find them and strengthen the unit tests so that they get caught like they should be.
> 
> I think this whole issue is akin to the "transitory" discussion of a while ago.
[...]

Which has still not be resolved. :-/

Can we at the very least agree that any Phobos algorithm that *can* be made to work with transient ranges (without incurring performance or other problems), should? I can work through the list I posted some time ago and submit pulls for them.

It would be better than doing nothing at all, though this is IMO still not fully satisfactory. If nothing else, I think we should update the documentation to warn users of potential pitfalls with using transient ranges, and indicate clearly which algorithms have been vetted for safe operation with transient ranges.


T

-- 
Why are you blatanly misspelling "blatant"? -- Branden Robinson
« First   ‹ Prev
1 2