[Issue 9599] New: File.byLine doesn't function properly with take
February 27, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9599

           Summary: File.byLine doesn't function properly with take
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: zshazz@gmail.com


--- Comment #0 from Chris Cain <zshazz@gmail.com> 2013-02-26 23:46:15 PST ---
Using 2.062, regarding the following code:

---
import std.stdio, std.range;

void main() {
    auto file = File.tmpfile();
    file.write("1\n2\n3\n");
    file.rewind();

    auto fbl = file.byLine();
    foreach(line; fbl.take(1)) writeln(line);
    foreach(line; fbl) writeln(line);
}
---

The expected output for this would be:
---
1
2
3

---

but actual output:
---
1
3

---

Generalized observation: when take is used on a ByLine range, it yields the requested number of elements and then consumes one additional element, so that line is lost to anything that uses the range afterwards.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
February 27, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9599


monarchdodra@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra@gmail.com


--- Comment #1 from monarchdodra@gmail.com 2013-02-27 03:35:11 PST ---
(In reply to comment #0)
> Using 2.062, Regarding the following code:

The bug is actually inside byLine itself, so we can remove take from the
equation. The problem is that byLine is over-eager:
1) Creating a front element eagerly pops that element.
2) Popping an element eagerly parses the next, effectively popping it off too
if it is never read:

Reduced test showing this:

//----
import std.stdio;

void main()
{
   auto file = File.tmpfile();
   file.write("1\n2\n3\n4\n5");
   file.rewind();

   auto fbl1 = file.byLine();
   writeln(fbl1.front); //prints 1.

   auto fbl2 = file.byLine();
   writeln(fbl2.front); //prints 2... Wait. Who popped off 1?
   fbl2.popFront();     //pops off 2, and consumes 3.

   auto fbl3 = file.byLine();
   writeln(fbl3.front); //prints 4.
}
//----

Ideally, byLine should be reworked to be a little more lazy, and better
preserve the integrity of its underlying stream:
- "front means do NOT modify the referenced container"
- "pop means remove the CURRENT element, and stop there"

byLine is obviously not doing that. The fact that it is *just* an input range does not mean it gets to bypass standard rules.
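
As a rough illustration of the two rules quoted above (this is not Phobos code), here is a minimal sketch of an input range over a shared source. The names LineSource and LazyLines are hypothetical, and an in-memory array stands in for the file stream, so the sketch only shows the intended contract, not how byLine could implement it.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a stream: a shared list of pending lines.
struct LineSource
{
    string[] lines;
}

// Illustrative input range (not Phobos code) that follows both rules.
struct LazyLines
{
    LineSource* src;

    bool empty() { return src.lines.length == 0; }

    // front only looks at the current element; the source is left untouched.
    string front() { return src.lines[0]; }

    // popFront removes exactly the current element and stops there.
    void popFront() { src.lines = src.lines[1 .. $]; }
}

void main()
{
    auto source = LineSource(["1", "2", "3", "4", "5"]);

    auto r1 = LazyLines(&source);
    assert(r1.front == "1");

    auto r2 = LazyLines(&source);
    assert(r2.front == "1"); // still 1: reading front consumed nothing
    r2.popFront();           // removes 1 only

    auto r3 = LazyLines(&source);
    assert(r3.front == "2"); // 2 was not skipped
    writeln("ok");
}
```

With a real file the current line still has to be read at some point, so the sketch says nothing about where that read should happen.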

February 27, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9599



--- Comment #2 from monarchdodra@gmail.com 2013-02-27 10:09:30 PST ---
(In reply to comment #1)
> (In reply to comment #0)
> > Using 2.062, Regarding the following code:
> 
> The bug is actually inside byLine itself

byChunk is subject to the exact same issue.

The fact that they don't behave according to normal range semantics could be a serious problem when they are not used linearly.

March 13, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9599


monarchdodra@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|nobody@puremagic.com        |monarchdodra@gmail.com


--- Comment #3 from monarchdodra@gmail.com 2013-03-13 00:20:19 PDT ---
(In reply to comment #1)
> Ideally, byLine should be reworked to be a little more lazy, and better
> preserve the integrity of its underlying stream:
> - "front means do NOT modify the referenced container"
> - "pop means remove the CURRENT element, and stop there"

Either that, or take the

July 22, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9599


Nick Treleaven <ntrel-public@yahoo.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull
                 CC|                            |ntrel-public@yahoo.co.uk


--- Comment #4 from Nick Treleaven <ntrel-public@yahoo.co.uk> 2013-07-22 08:32:06 PDT ---
(In reply to comment #1)
> The bug is actually inside byLine itself, so we can remove take from the
> equation. The problem is that byLine is over-eager:
> 1) Creating a front element eagerly pops that element.
> 2) Popping an element eagerly parses the next, effectively popping it off too
> if it is never read:

https://github.com/D-Programming-Language/phobos/pull/1433

>    auto file = File.tmpfile();
>    file.write("1\n2\n3\n4\n5");
>    file.rewind();
> 
>    auto fbl1 = file.byLine();
>    writeln(fbl1.front); //prints 1.
> 
>    auto fbl2 = file.byLine();
>    writeln(fbl2.front); //prints 2... Wait. Who popped off 1?

I think the above behaviour is understandable for a range like ByLine.

August 19, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9599



--- Comment #5 from github-bugzilla@puremagic.com 2013-08-19 10:50:18 PDT ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/4c2a8bea355e2a980b21d41c5454fe7a34de1777 Add unittest for issue 9599, plus some other byLine cases

https://github.com/D-Programming-Language/phobos/commit/ec1f0fdb9d3f4b9ffd3acd444d27195ffc6a15fb Fix Issue 9599 - File.byLine doesn't function properly with take

Calling take could wrongly pop an extra line from the range. Solved by making ByLine use reference-counting.

Note: Just changing ByLine not to eagerly read the next line was not
sufficient to handle all cases properly (plus that makes empty() less
efficient).

Note: ByLine was documented until recently.

https://github.com/D-Programming-Language/phobos/commit/7bc6e8153921b10eb61179ec318e01b825ff94c5 Merge pull request #1433 from ntrel/byLine-take

Fix Issue 9599 - File.byLine doesn't function properly with take
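
A minimal sketch of the behaviour the fix is meant to restore, assuming a Phobos release that contains the commits above. The helper name restAfterTake is hypothetical and exists only for this illustration: it consumes n lines through take and returns whatever the original range still yields afterwards.

```d
import std.algorithm : map;
import std.array : array;
import std.range : take;
import std.stdio : File;

// Hypothetical helper: consume `n` lines through take, then return the rest.
string[] restAfterTake(string contents, size_t n)
{
    auto file = File.tmpfile();
    file.write(contents);
    file.rewind();

    auto fbl = file.byLine();
    foreach (line; fbl.take(n)) {}      // take works on a copy of fbl
    // byLine reuses its buffer, so each line is duplicated before storing it.
    return fbl.map!(l => l.idup).array;
}

void main()
{
    // With the fix, the line after the taken one is no longer lost.
    assert(restAfterTake("1\n2\n3\n", 1) == ["2", "3"]);
}
```

On a release without the fix (such as the 2.062 from the original report), the same call would be expected to drop "2", matching the output shown in comment #0.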

August 19, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9599


monarchdodra@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

