Thread overview | |||||||||
---|---|---|---|---|---|---|---|---|---|
|
June 04, 2010 File.byLine should return dups? | ||||
---|---|---|---|---|
| ||||
Hi folks, I expected this program to print the lines of a file in reverse order: // bad.d import std.stdio; import std.range; void main() { auto f = File("bad.d"); foreach(char[] line; retro(array(f.byLine()))) writeln(line); } ...but instead it prints a jumble of line fragments. I suspect that it's because byLine() is not dup'ing the arrays it's reading from the file? This version works: // good.d import std.stdio; import std.range; Retro!(char[][]) retroLines(File f) { char[][] lines; foreach(line; f.byLine()) lines ~= line.dup; // note the .dup! return retro(lines); } void main() { auto f = File("good.d"); foreach(line; retroLines(f)) writeln(line); } ..but if you remove the '.dup' then it prints a similar mess. So, is there a bug in byLine(), or am I just using it wrong? Best, Graham |
June 04, 2010 Re: File.byLine should return dups? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham Fawcett | On Fri, 04 Jun 2010 15:35:06 -0400, Graham Fawcett <fawcett@uwindsor.ca> wrote: > Hi folks, > > I expected this program to print the lines of a file in reverse order: > > // bad.d > import std.stdio; > import std.range; > void main() { > auto f = File("bad.d"); > foreach(char[] line; retro(array(f.byLine()))) > writeln(line); > } > > ...but instead it prints a jumble of line fragments. I suspect that > it's because byLine() is not dup'ing the arrays it's reading from the > file? > > This version works: > > // good.d > import std.stdio; > import std.range; > Retro!(char[][]) retroLines(File f) { > char[][] lines; > foreach(line; f.byLine()) > lines ~= line.dup; // note the .dup! > return retro(lines); > } > void main() { > auto f = File("good.d"); > foreach(line; retroLines(f)) > writeln(line); > } > > ..but if you remove the '.dup' then it prints a similar mess. > > So, is there a bug in byLine(), or am I just using it wrong? The latter. File is re-using the buffer for each line, so you are seeing the data get overwritten. This is for performance reasons. Not everyone wants incur heap allocations for every line of a file As you showed, it's possible to get the desired behavior if you need it. The reverse would be impossible. Now, that being said, a nice addition would be to create a duper range that allows you to do one expression: foreach(char[] line; retro(array(duper(f.byLine())))) -Steve Copied from my .announce reply :) |
June 04, 2010 Re: File.byLine should return dups? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Hi Steven, On Fri, 04 Jun 2010 15:43:18 -0400, Steven Schveighoffer wrote: > On Fri, 04 Jun 2010 15:35:06 -0400, Graham Fawcett <fawcett@uwindsor.ca> wrote: > >> Hi folks, >> >> I expected this program to print the lines of a file in reverse order: >> >> // bad.d >> import std.stdio; >> import std.range; >> void main() { >> auto f = File("bad.d"); >> foreach(char[] line; retro(array(f.byLine()))) >> writeln(line); >> } >> >> ...but instead it prints a jumble of line fragments. I suspect that it's because byLine() is not dup'ing the arrays it's reading from the file? >> >> This version works: >> >> // good.d >> import std.stdio; >> import std.range; >> Retro!(char[][]) retroLines(File f) { >> char[][] lines; >> foreach(line; f.byLine()) >> lines ~= line.dup; // note the .dup! >> return retro(lines); >> } >> void main() { >> auto f = File("good.d"); >> foreach(line; retroLines(f)) >> writeln(line); >> } >> >> ..but if you remove the '.dup' then it prints a similar mess. >> >> So, is there a bug in byLine(), or am I just using it wrong? > > The latter. File is re-using the buffer for each line, so you are seeing the data get overwritten. This is for performance reasons. Not everyone wants incur heap allocations for every line of a file As you showed, it's possible to get the desired behavior if you need it. The reverse would be impossible. > Thanks for the response. Yes, it makes mode to favour performance for the common case, and as you say it's not hard to resolve this issue. > Now, that being said, a nice addition would be to create a duper range that allows you to do one expression: > > foreach(char[] line; retro(array(duper(f.byLine())))) Yes -- duper (and iduper) for the win! Great idea. > -Steve > > Copied from my .announce reply :) Sorry again for the .announce posting. :P Graham |
June 04, 2010 Re: File.byLine should return dups? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham Fawcett | On 06/04/2010 02:58 PM, Graham Fawcett wrote:
[snip]
> Thanks for the response. Yes, it makes mode to favour performance for
> the common case, and as you say it's not hard to resolve this issue.
>
>> Now, that being said, a nice addition would be to create a duper range
>> that allows you to do one expression:
>>
>> foreach(char[] line; retro(array(duper(f.byLine()))))
>
> Yes -- duper (and iduper) for the win! Great idea.
I wonder what we can do to automatically reject such programs.
Andrei
|
June 04, 2010 Re: File.byLine should return dups? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Graham Fawcett | On Fri, 04 Jun 2010 19:58:43 +0000, Graham Fawcett wrote: > Hi Steven, > > On Fri, 04 Jun 2010 15:43:18 -0400, Steven Schveighoffer wrote: > >> On Fri, 04 Jun 2010 15:35:06 -0400, Graham Fawcett <fawcett@uwindsor.ca> wrote: >> >>> Hi folks, >>> >>> I expected this program to print the lines of a file in reverse order: >>> >>> // bad.d >>> import std.stdio; >>> import std.range; >>> void main() { >>> auto f = File("bad.d"); >>> foreach(char[] line; retro(array(f.byLine()))) >>> writeln(line); >>> } >>> >>> ...but instead it prints a jumble of line fragments. I suspect that it's because byLine() is not dup'ing the arrays it's reading from the file? >>> >>> This version works: >>> >>> // good.d >>> import std.stdio; >>> import std.range; >>> Retro!(char[][]) retroLines(File f) { >>> char[][] lines; >>> foreach(line; f.byLine()) >>> lines ~= line.dup; // note the .dup! >>> return retro(lines); >>> } >>> void main() { >>> auto f = File("good.d"); >>> foreach(line; retroLines(f)) >>> writeln(line); >>> } >>> >>> ..but if you remove the '.dup' then it prints a similar mess. >>> >>> So, is there a bug in byLine(), or am I just using it wrong? >> >> The latter. File is re-using the buffer for each line, so you are seeing the data get overwritten. This is for performance reasons. Not everyone wants incur heap allocations for every line of a file As you showed, it's possible to get the desired behavior if you need it. The reverse would be impossible. >> >> > Thanks for the response. Yes, it makes mode to favour performance for the common case, and as you say it's not hard to resolve this issue. > >> Now, that being said, a nice addition would be to create a duper range that allows you to do one expression: >> >> foreach(char[] line; retro(array(duper(f.byLine())))) > > Yes -- duper (and iduper) for the win! Great idea. It just occurred to me that duper/iduper could be defined as alias map!("a.dup") duper; alias map!("a.idup") iduper; Thanks for suggesting duper, I like that idea very much. :) Graham > >> -Steve >> >> Copied from my .announce reply :) > > Sorry again for the .announce posting. :P > > Graham |
June 04, 2010 Re: File.byLine should return dups? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, 04 Jun 2010 15:04:56 -0500, Andrei Alexandrescu wrote:
> On 06/04/2010 02:58 PM, Graham Fawcett wrote: [snip]
>> Thanks for the response. Yes, it makes mode to favour performance for the common case, and as you say it's not hard to resolve this issue.
>>
>>> Now, that being said, a nice addition would be to create a duper range that allows you to do one expression:
>>>
>>> foreach(char[] line; retro(array(duper(f.byLine()))))
>>
>> Yes -- duper (and iduper) for the win! Great idea.
>
> I wonder what we can do to automatically reject such programs.
>
> Andrei
isForwardPersistentRange vs. isForwardVolatileRange?
Graham
|
June 06, 2010 Re: File.byLine should return dups? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Fri, 04 Jun 2010 16:04:56 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote: > On 06/04/2010 02:58 PM, Graham Fawcett wrote: > [snip] >> Thanks for the response. Yes, it makes mode to favour performance for >> the common case, and as you say it's not hard to resolve this issue. >> >>> Now, that being said, a nice addition would be to create a duper range >>> that allows you to do one expression: >>> >>> foreach(char[] line; retro(array(duper(f.byLine())))) >> >> Yes -- duper (and iduper) for the win! Great idea. > > I wonder what we can do to automatically reject such programs. Train people that being given a mutable buffer does not mean you own it? :) Seriously though, it's just something you learn. I don't think there's anything the compiler can do to help. -Steve |
Copyright © 1999-2021 by the D Language Foundation