Adding a read primitive to ranges

Would it be a bad idea to add a read primitive to ranges for streaming?
----
struct ReadRange(T){
    size_t read(T[] buffer);
    //and | or
    T[] read(size_t request);

    /+ empty,front,popFront,etc +/
}
----

On Mon, 04 May 2015 00:07:25 +0000, Freddy wrote:
> Would it be a bad idea to add a read primitive to ranges for streaming?
> ----
> struct ReadRange(T){
>     size_t read(T[] buffer);
>     //and | or
>     T[] read(size_t request);
>
>     /+ empty,front,popFront,etc +/
> }
> ----

if you want to add such things, i'd say you should model them on `std.stdio.File` (`rawRead`, `rawWrite` and other file functions). i've been using my `streams` module, which uses such interfaces, for a long time.

can't see why it should be a range, though. i introduced a "Stream" entity which, like a range, can be checked with various traits: isReadableStream, isWriteableStream, isSeekableStream and so on. note that a stream can be a range too; those are completely different interfaces.

what is good about taking `std.stdio.File` as a base: all my stream operations are immediately usable on standard file objects from Phobos.
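A minimal sketch of what such stream traits might look like, assuming they are modelled on the `rawRead` and `rawWrite` methods of `std.stdio.File`. The trait names come from the post above; their definitions, `MemoryStream`, and `readSome` are hypothetical and are not taken from the `streams` module mentioned there.

```d
import std.stdio : File;

// Hypothetical trait: S is readable if it has a File-like rawRead that
// fills a ubyte[] buffer and returns the slice that was actually read.
enum isReadableStream(S) = is(typeof((ref S s) {
    ubyte[] buffer;
    ubyte[] got = s.rawRead(buffer);
}));

// Hypothetical trait: S is writeable if it has a File-like rawWrite.
enum isWriteableStream(S) = is(typeof((ref S s) {
    const(ubyte)[] data;
    s.rawWrite(data);
}));

// Illustrative in-memory stream; not part of Phobos.
struct MemoryStream
{
    const(ubyte)[] data;

    ubyte[] rawRead(ubyte[] buffer)
    {
        immutable n = buffer.length < data.length ? buffer.length : data.length;
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return buffer[0 .. n];
    }
}

static assert(isReadableStream!File && isWriteableStream!File);
static assert(isReadableStream!MemoryStream && !isWriteableStream!MemoryStream);
static assert(!isReadableStream!(int[]));

// Works with anything that passes the trait, including std.stdio.File.
ubyte[] readSome(S)(ref S stream, ubyte[] buffer)
    if (isReadableStream!S)
{
    return stream.rawRead(buffer);
}

void main()
{
    import std.stdio : writeln;

    ubyte[] source = [10, 20, 30];
    auto stream = MemoryStream(source);
    auto buffer = new ubyte[2];
    writeln(readSome(stream, buffer)); // first two bytes
    writeln(readSome(stream, buffer)); // the remaining byte
}
```

Because the traits only check for a method with the right shape, a type can satisfy them and the range primitives at the same time, which matches the remark that the two interfaces are independent.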

On Monday, 4 May 2015 at 00:07:27 UTC, Freddy wrote:
> Would it be a bad idea to add a read primitive to ranges for streaming?
> ----
> struct ReadRange(T){
>     size_t read(T[] buffer);
>     //and | or
>     T[] read(size_t request);
>
>     /+ empty,front,popFront,etc +/
> }
> ----

It seems redundant to me. It's semantically no different from iterating through the range normally with front/popFront. For objects where reading large amounts of data is more efficient than reading one element at a time, you can implement a byChunk function like the one on std.stdio.File.

On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:
> It seems redundant to me. It's semantically no different from iterating through the range normally with front/popFront. For objects where reading large amounts of data is more efficient than reading one element at a time, you can implement a byChunk function like the one on std.stdio.File.

The problem is that all the functions in std.range, std.algorithm and many other wrappers would ignore byChunk and produce much slower code.

On Monday, 4 May 2015 at 19:23:08 UTC, Freddy wrote:
> On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:
>
> The problem is that all the functions in std.range, std.algorithm and many other wrappers would ignore byChunk and produce much slower code.

How so? `file.byChunk(4096).joiner` is a range that acts as if you read each byte out of the file one at a time, but actually reads them in 4096-byte buffers. It's still compatible with all of the range and algorithm functions.
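A small sketch of the chunked-then-flattened pattern described above, assuming a scratch file created with `File.tmpfile()`; the helper name `countNewlines` is made up for illustration.

```d
import std.algorithm : count, joiner;
import std.stdio : File, writeln;

// Counts newline bytes. The file is read in 4096-byte chunks, while
// `joiner` presents those chunks to `count` as one flat range of ubyte.
size_t countNewlines(File file)
{
    return file.byChunk(4096).joiner.count!(b => b == '\n');
}

void main()
{
    auto file = File.tmpfile(); // scratch file so the sketch is self-contained
    file.write("one\ntwo\nthree\n");
    file.rewind();
    writeln(countNewlines(file));
}
```

The algorithm only ever sees single bytes, yet the underlying reads happen one chunk at a time.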

May 05, 2015

Re: Adding a read primitive to ranges

Posted by Freddy
in reply to Alex Parrill


On Monday, 4 May 2015 at 23:20:57 UTC, Alex Parrill wrote:
> On Monday, 4 May 2015 at 19:23:08 UTC, Freddy wrote:
>> On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:
>>
>> The problem is that all the functions in std.range, std.algorithm and many other wrappers would ignore byChunk and produce much slower code.
>
> How so? `file.byChunk(4096).joiner` is a range that acts as if you read each byte out of the file one at a time, but actually reads them in 4096-byte buffers. It's still compatible with all of the range and algorithm functions.

Reading an arbitrary amount of data after the range has been wrapped.
For example:
----
void func(R)(R range){//expects range of strings
    string[] elms=range.read(5);
    string[] elms2=range.read(9);
    /++..++/
}


void caller(){
    auto file=...;//unbuffered file
    file.map!(a=>a.to!string).func();
}
----
Using byChunk would cause much more reallocation.

On Tuesday, 5 May 2015 at 00:50:44 UTC, Freddy wrote:
> ----
> void func(R)(R range){//expects range of strings
>     string[] elms=range.read(5);
>     string[] elms2=range.read(9);
>     /++..++/
> }
>
>
> void caller(){
>     auto file=...;//unbuffered file
>     file.map!(a=>a.to!string).func();
> }
> ----

Wait, bad example:
----
void func(R)(R range){//expects range of ubyte
    ubyte[] data=range.read(VERY_BIG_NUMBER);
    ubyte[] other_data=range.read(OTHER_VERY_BIG_NUMBER);
}
----
which would be more optimal for a file but still works for other ranges, compared to looping through the range and appending each element to data.

On Monday, 4 May 2015 at 00:07:27 UTC, Freddy wrote:
> Would it be a bad idea to add a read primitive to ranges for streaming?
> ----
> struct ReadRange(T){
>     size_t read(T[] buffer);
>     //and | or
>     T[] read(size_t request);
>
>     /+ empty,front,popFront,etc +/
> }
> ----

Also, if so, what about adding a default read for input ranges? Something like:
----
typeof(range.front)[] read(R)(ref R range,size_t amount){
    auto data=new typeof(range.front)[amount];
    /+... read into data ...+/
    return data[0..actual_amount];
}
----
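One way the default `read` sketched above could be filled in, as a hedged illustration rather than a proposed Phobos addition. It assumes `ElementType!R` for the element type (the parameter name is not visible in the return type, so `typeof(range.front)` cannot be used there) and copies one element at a time.

```d
import std.range : iota;
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

// Hypothetical default `read` for any input range: allocates `amount`
// elements, copies until the range runs dry, and returns the filled part.
ElementType!R[] read(R)(ref R range, size_t amount)
    if (isInputRange!R)
{
    alias E = ElementType!R;
    auto data = new E[amount];
    size_t filled = 0;
    for (; filled < amount && !range.empty; range.popFront())
        data[filled++] = range.front;
    return data[0 .. filled];
}

void main()
{
    import std.stdio : writeln;

    auto numbers = iota(10);
    writeln(numbers.read(3));   // consumes three elements
    writeln(numbers.read(100)); // returns whatever is left
}
```

Taking the range by `ref` means each call advances the caller's range, so consecutive calls continue where the previous one stopped.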

On Tuesday, 5 May 2015 at 01:28:03 UTC, Freddy wrote:
> Wait, bad example:
> ----
> void func(R)(R range){//expects range of ubyte
>     ubyte[] data=range.read(VERY_BIG_NUMBER);
>     ubyte[] other_data=range.read(OTHER_VERY_BIG_NUMBER);
> }
> ----
> which would be more optimal for a file but still works for other ranges, compared to looping through the range and appending each element to data.

How would it be more optimal? As I said, if you pass in `file.byChunk(some_amount).joiner`, this will still read the file in large chunks. It's less optimal now because `read` has to allocate an array on every call (easily avoidable by passing in a reusable buffer, but still).

Equivalent code with ranges:

    auto range = file.byChunk(4096).joiner;
    ubyte[] data = range.take(VERY_BIG_NUMBER).array;
    ubyte[] other_data = range.take(OTHER_VERY_BIG_NUMBER).array;

> How would it be more optimal? As I said, if you pass in `file.byChunk(some_amount).joiner`, this will still read the file in large chunks. It's less optimal now because `read` has to allocate an array on every call (easily avoidable by passing in a reusable buffer, but still).
>
> Equivalent code with ranges:
>
>     auto range = file.byChunk(4096).joiner;
>     ubyte[] data = range.take(VERY_BIG_NUMBER).array;
>     ubyte[] other_data = range.take(OTHER_VERY_BIG_NUMBER).array;

The range solution copies from a buffer into a newly allocated array many times and makes many system calls. The read(stream) solution allocates a new array and makes one system call. Sorry for the miscommunication.
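The difference being argued here can be sketched as an overload pair, assuming the buffer-filling form of the proposed primitive. The name `readInto` is hypothetical. The `File` overload hands the whole buffer to `rawRead` in a single call, while the fallback copies element by element from any other input range; how many system calls the C runtime issues underneath is not something this sketch measures.

```d
import std.range : iota;
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;
import std.stdio : File;

// Fast path (hypothetical name `readInto`): pass the whole buffer to
// File.rawRead, which returns the slice that was actually filled.
T[] readInto(T)(ref File file, T[] buffer)
{
    return file.rawRead(buffer);
}

// Fallback for any other input range: copy element by element.
E[] readInto(R, E)(ref R range, E[] buffer)
    if (isInputRange!R && is(ElementType!R : E))
{
    size_t filled = 0;
    for (; filled < buffer.length && !range.empty; range.popFront())
        buffer[filled++] = range.front;
    return buffer[0 .. filled];
}

void main()
{
    import std.stdio : writeln;

    auto file = File.tmpfile(); // scratch file so the sketch is self-contained
    ubyte[] payload = [1, 2, 3, 4, 5];
    file.rawWrite(payload);
    file.rewind();

    auto bytes = new ubyte[8];
    writeln(readInto(file, bytes));   // filled through rawRead

    auto numbers = iota(3);
    auto ints = new int[8];
    writeln(readInto(numbers, ints)); // filled by the generic loop
}
```

Passing the buffer in also addresses the allocation concern raised earlier in the thread, since the caller can reuse one array across calls.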
