Jump to page: 1 25  
Page
Thread overview
Re: Tricky semantics of ranges & potentially numerous Phobos bugs
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
Tommi
Oct 16, 2012
H. S. Teoh
Oct 16, 2012
deadalnix
Oct 16, 2012
Peter Alexander
Oct 17, 2012
H. S. Teoh
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
Tommi
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
H. S. Teoh
Oct 17, 2012
Timon Gehr
Oct 16, 2012
Philippe Sigaud
Oct 16, 2012
H. S. Teoh
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
Mehrdad
Oct 16, 2012
bearophile
Oct 17, 2012
H. S. Teoh
Oct 17, 2012
Jonathan M Davis
Oct 16, 2012
H. S. Teoh
Oct 16, 2012
bearophile
Oct 16, 2012
deadalnix
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
Era Scarecrow
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
H. S. Teoh
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
David Gileadi
Oct 16, 2012
Jonathan M Davis
Oct 16, 2012
Jens Mueller
Oct 17, 2012
H. S. Teoh
Oct 17, 2012
Jonathan M Davis
Oct 17, 2012
monarch_dodra
Oct 16, 2012
Jonathan M Davis
Oct 17, 2012
monarch_dodra
Oct 17, 2012
bearophile
Oct 17, 2012
bearophile
Oct 17, 2012
monarch_dodra
Oct 19, 2012
Marco Leise
October 16, 2012
On Monday, October 15, 2012 22:09:05 H. S. Teoh wrote:
> The scary thing is, I see similar code like this all over Phobos. Does this mean that most of std.algorithm may need to be revised to address this issue? At the very least, it would seem that a code audit is in order to weed out this particular issue.

Probably related to http://d.puremagic.com/issues/show_bug.cgi?id=6495

This was discussed in http://d.puremagic.com/issues/show_bug.cgi?id=8084

The thing is that realistically, it's going to be a big problem to not be able to rely on front returning the same value on every call or not being able to rely on the result of front still being around after the call to popFront. Ranges such as ByLine are incredibly abnormal, and it's arguably reasonable that they must be treated specially by the programmer using them. But that could pose a usability problem as well.

So, I don't really know what the right answer is, but I _really_ don't like the idea of having to worry about the result of front changing after a call to popFront in every single function that ever uses front. In the general case, I just don't see how that's tenable. I'd _much_ rather that it be up to the programmer using abnormal ranges such as ByLine to use them correctly.

- Jonathan M Davis
October 16, 2012
On Monday, October 15, 2012 22:48:15 Jonathan M Davis wrote:
> So, I don't really know what the right answer is, but I _really_ don't like the idea of having to worry about the result of front changing after a call to popFront in every single function that ever uses front. In the general case, I just don't see how that's tenable. I'd _much_ rather that it be up to the programmer using abnormal ranges such as ByLine to use them correctly.

And actually, it seems to me that issues like this make it look like it was a mistake to make ranges like ByLine ranges in the first place. They should have just defined opApply so that they'd work nicely in foreach but not with range- based functions. They're clearly not going to work with a _lot_ of range-based functions.

- Jonathan M Davis
October 16, 2012
On Tuesday, 16 October 2012 at 06:03:56 UTC, Jonathan M Davis wrote:
> And actually, it seems to me that issues like this make it look like it was a
> mistake to make ranges like ByLine ranges in the first place. They should have
> just defined opApply so that they'd work nicely in foreach but not with range-
> based functions. They're clearly not going to work with a _lot_ of range-based
> functions.
>
> - Jonathan M Davis

+1
October 16, 2012
On Mon, Oct 15, 2012 at 10:59:26PM -0700, Jonathan M Davis wrote:
> On Monday, October 15, 2012 22:48:15 Jonathan M Davis wrote:
> > So, I don't really know what the right answer is, but I _really_ don't like the idea of having to worry about the result of front changing after a call to popFront in every single function that ever uses front. In the general case, I just don't see how that's tenable. I'd _much_ rather that it be up to the programmer using abnormal ranges such as ByLine to use them correctly.
> 
> And actually, it seems to me that issues like this make it look like it was a mistake to make ranges like ByLine ranges in the first place. They should have just defined opApply so that they'd work nicely in foreach but not with range- based functions. They're clearly not going to work with a _lot_ of range-based functions.
[...]

But nothing about the definition of a range, as currently defined in std.range, guarantees that whatever value was returned by .front is still valid after popFront() is called.

I'm not saying that this should be the "correct" behaviour, but the current definition does not prohibit a range from, say, reusing an array to compute its next element. For example, one may have a range that returns the permutations of a given array, in which popFront() permutes the elements in-place. In this case, .front will become invalid once popFront() is called. Many of the current range-based functions will not work correctly in this case.

Of course, it's arguable whether such ranges should be admissible, but as far as the current definition goes, I don't see anything that states otherwise. If we don't make such things clear, then we're bound to run into pathological cases where bugs will creep in because of unstated assumptions in the code.


T

-- 
Fact is stranger than fiction.
October 16, 2012
Jonathan M Davis:

> And actually, it seems to me that issues like this make it look like it was a
> mistake to make ranges like ByLine ranges in the first place. They should have
> just defined opApply so that they'd work nicely in foreach but not with range-
> based functions. They're clearly not going to work with a _lot_ of range-based
> functions.

I like byLine() to be a range, so it's compatible with std.algorithm stuff. But a short time after the creation of byLine() I suggested to make it copy (dup) lines on default and not copy them on request, this means:

auto lineBuff = "foo.txt".File().byLine().array();
==> good result


auto lineBuff = "foo.txt".File().byLine!false().array();
==> doesn't copy, garbage result.

I design my ranges like that. It's safe because on default (or if you don't know what you are doing) it copies, and it's a bit slower. When you know what you are doing and you want more speed, you disable the copy with a compile-time argument.

Bye,
bearophile
October 16, 2012
H. S. Teoh:

> If we don't make such things clear, then we're bound to run
> into pathological cases where bugs will creep in because of unstated assumptions in the code.

A good language/std library has to protect the ignorant, where possible. And this is a place where it's easy to do this.

Bye,
bearophile
October 16, 2012
On Tuesday, 16 October 2012 at 14:23:28 UTC, bearophile wrote:
> I design my ranges like that. It's safe because on default (or if you don't know what you are doing) it copies, and it's a bit slower. When you know what you are doing and you want more speed, you disable the copy with a compile-time argument.

Have to say that in my (admittedly not so extensive) experience of byLine, it's slow enough anyway that I can't imagine the extra performance hit you describe would be that onerous.  Certainly worth it for making it "safe" as a range.
October 16, 2012
On Tuesday, 16 October 2012 at 05:49:11 UTC, Jonathan M Davis wrote:
> So, I don't really know what the right answer is, but I _really_ don't like the idea of having to worry about
> the result of front changing after a call to popFront
> in every single function that ever uses front.

Isn't the deeper underlying problem here the fact that there's no generic way to make a deep copy in this language (does any language?) Making a deep copy of front would be a clean solution. I don't know how to implement this generic deep copying functionality though. For example: what would be the semantics of deep-copying a range?
October 16, 2012
On Tue, Oct 16, 2012 at 05:01:57PM +0200, Tommi wrote:
> On Tuesday, 16 October 2012 at 05:49:11 UTC, Jonathan M Davis wrote:
> >So, I don't really know what the right answer is, but I _really_ don't like the idea of having to worry about the result of front changing after a call to popFront in every single function that ever uses front.
> 
> Isn't the deeper underlying problem here the fact that there's no generic way to make a deep copy in this language (does any language?) Making a deep copy of front would be a clean solution. I don't know how to implement this generic deep copying functionality though. For example: what would be the semantics of deep-copying a range?

I don't think the problem here is whether we have a deep copy or not, but that the semantics of ranges are not defined clearly enough. If we clearly state, for example, that whatever is returned by .front must persist beyond popFront(), then it would be up to the range to implement the deep copy, e.g., return the .dup or .idup'd array instead of a reference to an internal buffer.

OTOH, if we clearly state that .front may not persist past the next
popFront(), then it would be up to the caller to .dup the return value,
or otherwise make a safe copy of it (or use the value before calling
popFront()).

The current ambiguous situation leads to one range doing one thing, and another range doing something else, and either way, either this code will break or that code will break.


T

-- 
"Holy war is an oxymoron." -- Lazarus Long
October 16, 2012
Le 16/10/2012 15:30, H. S. Teoh a écrit :
> On Mon, Oct 15, 2012 at 10:59:26PM -0700, Jonathan M Davis wrote:
>> On Monday, October 15, 2012 22:48:15 Jonathan M Davis wrote:
>>> So, I don't really know what the right answer is, but I _really_
>>> don't like the idea of having to worry about the result of front
>>> changing after a call to popFront in every single function that ever
>>> uses front. In the general case, I just don't see how that's
>>> tenable. I'd _much_ rather that it be up to the programmer using
>>> abnormal ranges such as ByLine to use them correctly.
>>
>> And actually, it seems to me that issues like this make it look like
>> it was a mistake to make ranges like ByLine ranges in the first place.
>> They should have just defined opApply so that they'd work nicely in
>> foreach but not with range- based functions. They're clearly not going
>> to work with a _lot_ of range-based functions.
> [...]
>
> But nothing about the definition of a range, as currently defined in
> std.range, guarantees that whatever value was returned by .front is
> still valid after popFront() is called.
>
> I'm not saying that this should be the "correct" behaviour, but the
> current definition does not prohibit a range from, say, reusing an array
> to compute its next element. For example, one may have a range that
> returns the permutations of a given array, in which popFront() permutes
> the elements in-place. In this case, .front will become invalid once
> popFront() is called. Many of the current range-based functions will not
> work correctly in this case.
>
> Of course, it's arguable whether such ranges should be admissible, but
> as far as the current definition goes, I don't see anything that states
> otherwise. If we don't make such things clear, then we're bound to run
> into pathological cases where bugs will creep in because of unstated
> assumptions in the code.
>
>
> T
>

The default semantic should be the safe one.

Providing backdoor for experienced developer is useful, but it does arm as default behavior.
« First   ‹ Prev
1 2 3 4 5