std.range.chunk without length

Thread overview
Oct 30, 2013: Stephan Schiffels
Oct 30, 2013: qznc
Oct 31, 2013: Stephan Schiffels
Feb 13, 2014: Stephan Schiffels
Feb 13, 2014: bearophile
Feb 13, 2014: monarch_dodra
Feb 18, 2014: Stephan Schiffels

October 30, 2013
Hi,

I'd like a version of std.range.chunks that does not require the range to have the "length" property.

As an example, consider a file that you would like to parse line by line, always grouping four lines together, e.g.

import std.range; // needed for chunks
import std.stdio;

void main() {
  auto range = File("test.txt", "r").byLine();
  foreach (c; range.chunks(4)) { // doesn't compile
    writefln("%s %s", c[0], c[1]);
  }
}

Thanks,
Stephan
October 30, 2013
Your wish was granted. Monarchdodra was sent back in time [0], so it is already fixed in HEAD. You could try the dmd beta [1].


[0] https://github.com/D-Programming-Language/phobos/pull/992
[1] http://forum.dlang.org/thread/526DD8C5.2040402@digitalmars.com
October 31, 2013
On Wednesday, 30 October 2013 at 20:43:54 UTC, qznc wrote:
> Your wish was granted. Monarchdodra was sent back in time [0], so it is already fixed in HEAD. You could try the dmd beta [1].
>
>
> [0] https://github.com/D-Programming-Language/phobos/pull/992
> [1] http://forum.dlang.org/thread/526DD8C5.2040402@digitalmars.com

Ah, awesome! Should have updated my github clone then.
Thanks,
Stephan

February 13, 2014
On Thursday, 31 October 2013 at 10:35:54 UTC, Stephan Schiffels wrote:
> Ah, awesome! Should have updated my github clone then.
> Thanks,
> Stephan

Sorry for the late follow-up, but it turns out that std.range.chunks needs a ForwardRange and hence does not work on File.byLine(). The comments on the referenced pull request claim that it does, but the current implementation needs a "save()" function, which the byLine range does not have.
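
The range categories involved can be checked at compile time. The following is a minimal sketch using the traits from std.range.primitives; the alias name Lines is only illustrative. It compiles on its own with `dmd -main`.

```d
import std.range.primitives : hasLength, isForwardRange, isInputRange;
import std.stdio : File;

// Type of the range returned by File.byLine().
// typeof does not evaluate the call, so no file is opened.
alias Lines = typeof(File.init.byLine());

static assert( isInputRange!Lines);   // can be iterated, but only once
static assert(!isForwardRange!Lines); // no save(), so no independent copies
static assert(!hasLength!Lines);      // the number of lines is not known up front
```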

It would actually be easy to implement chunks without the "save" function by using an internal buffer, although this would make the algorithm's memory use linear in the chunk size. Would that be acceptable? If so, I'd be happy to make that change and push it.
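
A rough sketch of the buffered idea, to make the memory trade-off concrete. The names BufferedChunks and bufferedChunks are hypothetical and not part of Phobos; this is an illustration under those assumptions, not a proposed patch. Each chunk is copied into a fresh array, so memory use is proportional to the chunk size.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

/// Hypothetical helper (not in Phobos): groups any input range into
/// arrays of up to `chunkSize` elements, buffering one chunk at a time.
struct BufferedChunks(R) if (isInputRange!R)
{
    private R source;
    private size_t chunkSize;
    private ElementType!R[] buffer;

    this(R source, size_t chunkSize)
    {
        assert(chunkSize > 0, "chunkSize must be positive");
        this.source = source;
        this.chunkSize = chunkSize;
        fill();
    }

    // Reads the next chunk from the source into a newly allocated array.
    private void fill()
    {
        buffer = null; // fresh array, so chunks handed out earlier stay intact
        while (buffer.length < chunkSize && !source.empty)
        {
            buffer ~= source.front;
            source.popFront();
        }
    }

    @property bool empty() const { return buffer.length == 0; }
    @property ElementType!R[] front() { return buffer; }
    void popFront() { fill(); }
}

/// Convenience constructor for the hypothetical BufferedChunks.
auto bufferedChunks(R)(R source, size_t chunkSize) if (isInputRange!R)
{
    return BufferedChunks!R(source, chunkSize);
}

unittest
{
    int[][] seen;
    foreach (chunk; bufferedChunks([1, 2, 3, 4, 5], 2))
        seen ~= chunk;
    assert(seen == [[1, 2], [3, 4], [5]]);
}
```

Note that File.byLine() reuses its line buffer between iterations, so with this sketch the lines would have to be copied first (for example by using File.byLineCopy() instead), otherwise all entries of a chunk could end up referring to the same memory.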

Stephan
February 13, 2014
Stephan Schiffels:

> It would actually be easy to implement chunks without the "save" function by using an internal buffer, although this would make the algorithm's memory use linear in the chunk size. Would that be acceptable?

I think it's acceptable. But perhaps you need to add one more optional argument for the buffer :-)

Bye,
bearophile
February 13, 2014
On Thursday, 13 February 2014 at 14:45:44 UTC, bearophile wrote:
> I think it's acceptable. But perhaps you need to add one more optional argument for the buffer :-)
>
> Bye,
> bearophile

Users andralex (https://github.com/D-Programming-Language/phobos/pull/1186) and quickfur (https://github.com/D-Programming-Language/phobos/pull/1453) have submitted different algorithms for a similar problem, basically by being "2-dimensional lazy" (each subrange is itself a lazy range). However, both come with their own pitfalls.

Andrei's still requires forward ranges; quickfur's doesn't and, arguably, has a simpler design. However, if I remember correctly, it is also less efficient (it does double work). Implementing quickfur's solution in Chunks for input ranges only could be a good idea.
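
To illustrate what "2-dimensional lazy" over a plain input range could look like, here is a small sketch. It is not the code from either pull request; LazyChunks and lazyChunks are hypothetical names, and the design (a class holding the shared state, with each chunk reading directly from the source) is just one assumption about how it might be done.

```d
import std.range.primitives : empty, front, isInputRange, popFront;

/// Hypothetical sketch (not taken from either pull request): the outer
/// range and every chunk it hands out consume the same underlying source.
final class LazyChunks(R) if (isInputRange!R)
{
    private R source;
    private size_t chunkSize;
    private size_t left; // elements not yet read from the current chunk

    /// Lazy view of the current chunk; nothing is buffered.
    static struct Chunk
    {
        private LazyChunks outer;

        @property bool empty() { return outer.left == 0 || outer.source.empty; }
        @property auto front() { return outer.source.front; }
        void popFront() { outer.source.popFront(); --outer.left; }
    }

    this(R source, size_t chunkSize)
    {
        assert(chunkSize > 0, "chunkSize must be positive");
        this.source = source;
        this.chunkSize = chunkSize;
        this.left = chunkSize;
    }

    @property bool empty() { return source.empty; }
    @property Chunk front() { return Chunk(this); }

    void popFront()
    {
        // Skip whatever the caller did not read from the current chunk.
        while (left > 0 && !source.empty)
        {
            source.popFront();
            --left;
        }
        left = chunkSize;
    }
}

/// Convenience constructor for the hypothetical LazyChunks.
auto lazyChunks(R)(R source, size_t chunkSize) if (isInputRange!R)
{
    return new LazyChunks!R(source, chunkSize);
}

unittest
{
    import std.array : array;

    // Consuming every chunk completely.
    int[][] seen;
    foreach (chunk; lazyChunks([1, 2, 3, 4, 5], 2))
        seen ~= chunk.array;
    assert(seen == [[1, 2], [3, 4], [5]]);

    // Reading only the first element; the rest of each chunk is skipped.
    int[] firsts;
    foreach (chunk; lazyChunks([1, 2, 3, 4, 5], 2))
        firsts ~= chunk.front;
    assert(firsts == [1, 3, 5]);
}
```

One pitfall is visible even in this sketch: a chunk is only valid until the outer range advances, because there is no "save" to give it an independent position in the source.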

It *is* extra work: more code, and more code to cover (code that is difficult to cover). I'm not sure we have the manpower to support such complexity: I was able to make chunks work with forward ranges, but I still haven't even fixed Splitter yet! I think that should take precedence.
February 18, 2014
On Thursday, 13 February 2014 at 17:41:37 UTC, monarch_dodra wrote:
> Users andralex (https://github.com/D-Programming-Language/phobos/pull/1186) and quickfur (https://github.com/D-Programming-Language/phobos/pull/1453) have submitted different algorithms for a similar problem, basically by being "2-dimensional lazy" (each subrange is itself a lazy range). However, both come with their own pitfalls.
>
> Andrei's still requires forward ranges. quickfur's doesn't, and, arguably, has a simpler design. However, if I remember correctly, it is also less efficient (it does double work). Implementing Quickfur's solution in Chunks for input ranges only could be a good idea.
>
> It *is* extra work: more code, and more code to cover (code that is difficult to cover). I'm not sure we have the manpower to support such complexity: I was able to make chunks work with forward ranges, but I still haven't even fixed Splitter yet! I think that should take precedence.

Yeah, never mind, I won't do it. I realised that you had good reasons to require a ForwardRange: chunking really needs some sort of "save" to be implemented. What I had in mind to make it work on File.byLine with a buffer is actually a hack that effectively adds "save" functionality to the InputRange… so I agree that it is not logically reasonable to do it here.

Thanks anyway.
Stephan