Thread overview | ||||||
---|---|---|---|---|---|---|
|
October 16, 2012 Tricky semantics of ranges & potentially numerous Phobos bugs | ||||
---|---|---|---|---|
| ||||
In an effort to locate the suspected Phobos bug that I ran into in my new Cartesian product implementation, I looked over the code for std.algorithm joiner more carefully, and noticed the following:
Given a range of ranges ror, it assigns ror.front to a struct member and then calls ror.popFront() immediately. Then as the user calls the joined range's popFront, it successively pops its local copy of ror.front until it's empty, whereupon it assigns the new ror.front to the local copy again, and so on.
While this works for array of arrays, it *doesn't* work for ranges that overwrite ror.front when ror.popFront() is called. For example, it wouldn't work for stdin.byLine(), because the underlying buffer it reused. Here's the proof:
Code:
// Filename: test2.d
import std.algorithm;
import std.stdio;
void main() {
auto lines = stdin.byLine();
writeln(lines.joiner);
}
Compile & test:
# Compile
dmd test2.d
# Echo three lines, "abc", "def", and "ghi", and feed it to
# the program.
(echo abc; echo def; echo ghi) | ./test2
# This is the output:
defghighi
As you can see, the output is mangled, because byLine() has reused the line buffer before joiner.Result got to it.
Long story short: saving a local copy of range.front and then calling range.popFront() may _invalidate_ the copy. So basically, either you need a forward range and use .save to keep track of old values of range.front, or you have to refrain from calling popFront() until you're well and truly done with the current value of .front. While not respecting this will work with many common ranges, it will also fail in subtle ways when given other ranges.
The scary thing is, I see similar code like this all over Phobos. Does this mean that most of std.algorithm may need to be revised to address this issue? At the very least, it would seem that a code audit is in order to weed out this particular issue.
:-/
T
--
A linguistics professor was lecturing to his class one day. "In
English," he said, "A double negative forms a positive. In some
languages, though, such as Russian, a double negative is still a
negative. However, there is no language wherein a double positive can
form a negative."
A voice from the back of the room piped up, "Yeah, yeah."
|
October 17, 2012 Re: Tricky semantics of ranges & potentially numerous Phobos bugs | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On 10/16/12 1:09 AM, H. S. Teoh wrote:
> The scary thing is, I see similar code like this all over Phobos. Does
> this mean that most of std.algorithm may need to be revised to address
> this issue? At the very least, it would seem that a code audit is in
> order to weed out this particular issue.
Such issues do happen, are relatively rare, and are virtually untested because there's been no unittests with a deliberately "bad" input range. Although of course we do need to add the appropriate fixes and unittests, I'm not worried about it at all systemically.
Andrei
|
October 17, 2012 Re: Tricky semantics of ranges & potentially numerous Phobos bugs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Wednesday, 17 October 2012 at 15:14:44 UTC, Andrei Alexandrescu wrote:
> Such issues do happen, are relatively rare, and are virtually untested because there's been no unittests with a deliberately "bad" input range. Although of course we do need to add the appropriate fixes and unittests, I'm not worried about it at all systemically.
If an abstraction encourages use in a way which leads to hard-to-detect logic bugs that do not occur with the most common test cases, this is _exactly_ the thing we should be worried about!
David
|
October 17, 2012 Re: Tricky semantics of ranges & potentially numerous Phobos bugs | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On 10/17/12 12:57 PM, David Nadlinger wrote:
> On Wednesday, 17 October 2012 at 15:14:44 UTC, Andrei Alexandrescu wrote:
>> Such issues do happen, are relatively rare, and are virtually untested
>> because there's been no unittests with a deliberately "bad" input
>> range. Although of course we do need to add the appropriate fixes and
>> unittests, I'm not worried about it at all systemically.
>
> If an abstraction encourages use in a way which leads to hard-to-detect
> logic bugs that do not occur with the most common test cases, this is
> _exactly_ the thing we should be worried about!
When I designed input ranges vs. forward ranges there were long discussion on how to distinguish them. If you have a better design it would probably not influence the state of the affair, but it would be a good discussion to have. FWIW I can already think of a couple but all rely on UFCS.
Andrei
|
Copyright © 1999-2021 by the D Language Foundation