Thread overview
Adding a read primitive to ranges
May 04, 2015
Freddy
May 04, 2015
ketmar
May 04, 2015
Alex Parrill
May 04, 2015
Freddy
May 04, 2015
Alex Parrill
May 05, 2015
Freddy
May 05, 2015
Freddy
May 05, 2015
Alex Parrill
May 05, 2015
Freddy
May 05, 2015
Freddy
May 04, 2015
Would it be a bad idea to add a read primitive to ranges for streaming?
----
struct ReadRange(T){
    size_t read(T[] buffer);
    //and | or
    T[] read(size_t request);

    /+ empty,front,popFront,etc +/
}
----
May 04, 2015
On Mon, 04 May 2015 00:07:25 +0000, Freddy wrote:

> Would it be a bad idea to add a read primitive to ranges for streaming?
> ----
> struct ReadRange(T){
>      size_t read(T[] buffer);
>      //and | or
>      T[] read(size_t request);
> 
>      /+ empty,front,popFront,etc +/
> }
> ----

if you want to add such things, i'd say you should model them on `std.stdio.File` (`rawRead`, `rawWrite` and the other file functions).

i've been using my own `streams` module, which is built on such interfaces, for a long time.

can't see why it should be a range, though. i introduced a "Stream" entity which, like a range, can be checked with various traits: isReadableStream, isWriteableStream, isSeekableStream and so on. note that a stream can be a range too; those are completely different interfaces.

the good thing about taking `std.stdio.File` as a base: all my stream operations are immediately usable on standard file objects from Phobos.
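
A minimal sketch of what such a trait could look like, assuming it only checks for a `rawRead` that takes a `ubyte[]` buffer and returns the filled slice, the way `std.stdio.File.rawRead` does. The `streams` module itself is not shown in this thread, so `isReadableStream` as written here and the `MemoryStream` type are hypothetical illustrations, not its actual code.

```d
import std.stdio : File;

/// Hypothetical trait: true if `S` has a `rawRead` that accepts a
/// `ubyte[]` buffer and returns the part of it that was filled.
enum bool isReadableStream(S) = is(typeof((ref S s, ubyte[] buf) {
    ubyte[] filled = s.rawRead(buf);
}));

/// Illustration only: an in-memory stream that satisfies the trait.
struct MemoryStream
{
    const(ubyte)[] data;

    ubyte[] rawRead(ubyte[] buf)
    {
        import std.algorithm.comparison : min;

        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n]; // copy what is available
        data = data[n .. $];        // consume it from the stream
        return buf[0 .. n];
    }
}

static assert(isReadableStream!File);
static assert(isReadableStream!MemoryStream);
static assert(!isReadableStream!int);
```

Because the check is purely structural, anything written against the trait accepts a Phobos `File` and a custom stream alike.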

May 04, 2015
On Monday, 4 May 2015 at 00:07:27 UTC, Freddy wrote:
> Would it be a bad idea to add a read primitive to ranges for streaming?
> ----
> struct ReadRange(T){
>     size_t read(T[] buffer);
>     //and | or
>     T[] read(size_t request);
>
>     /+ empty,front,popFront,etc +/
> }
> ----

It seems redundant to me. It's semantically no different from iterating through the range normally with front/popFront. For objects where reading large amounts of data is more efficient than reading one element at a time, you can implement a byChunk function like the one on stdio.File.
May 04, 2015
On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:
>
> It seems redundant to me. It's semantically no different from iterating through the range normally with front/popFront. For objects where reading large amounts of data is more efficient than reading one element at a time, you can implement a byChunk function like the one on stdio.File.

The problem is that all the functions in std.range, std.algorithm and many other wrappers would ignore byChunk and produce much slower code.
May 04, 2015
On Monday, 4 May 2015 at 19:23:08 UTC, Freddy wrote:
> On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:
>
> The problem is that all the functions in std.range, std.algorithm and many other wrappers would ignore byChunk and produce much slower code.

How so? `file.byChunk(4096).joiner` is a range that acts as if you read each byte out of the file one at a time, but actually reads them in 4096-byte buffers. It's still compatible with all of the range and algorithm functions.
May 05, 2015
On Monday, 4 May 2015 at 23:20:57 UTC, Alex Parrill wrote:
> On Monday, 4 May 2015 at 19:23:08 UTC, Freddy wrote:
>> On Monday, 4 May 2015 at 15:16:25 UTC, Alex Parrill wrote:
>>
>> The problem is that all the functions in std.range, std.algorithm and many other wrappers would ignore byChunk and produce much slower code.
>
> How so? `file.byChunk(4096).joiner` is a range that acts as if you read each byte out of the file one at a time, but actually reads them in 4096-byte buffers. It's still compatible with all of the range and algorithm functions.

When reading an arbitrary amount of data after the range has been wrapped.
For example:
----
void func(R)(R range){//expects range of strings
    string[] elms=range.read(5);
    string[] elms2=range.read(9);
    /++..++/
}


void caller(){
    auto file=...;//unbuffered file
    file.map!(a=>a.to!string).func();
}
----
Using byChunk would cause much more reallocation.
May 05, 2015
On Tuesday, 5 May 2015 at 00:50:44 UTC, Freddy wrote:
> ----
> void func(R)(R range){//expects range of strings
>     string[] elms=range.read(5);
>     string[] elms2=range.read(9);
>     /++..++/
> }
>
>
> void caller(){
>     auto file=...;//unbuffered file
>     file.map!(a=>a.to!string).func();
> }
> ----
Wait, bad example. Better:
----
void func(R)(R range){//expects range of ubyte
    ubyte[] data=range.read(VERY_BIG_NUMBER);
    ubyte[] other_data=range.read(OTHER_VERY_BIG_NUMBER);
}
----
which would be more efficient for a file but still work for other ranges, compared to looping through the range and appending each element to data.
May 05, 2015
On Monday, 4 May 2015 at 00:07:27 UTC, Freddy wrote:
> Would it be a bad idea to add a read primitive to ranges for streaming?
> ----
> struct ReadRange(T){
>     size_t read(T[] buffer);
>     //and | or
>     T[] read(size_t request);
>
>     /+ empty,front,popFront,etc +/
> }
> ----

Also, if so, what about adding a default read for input ranges?
Something like:
----
typeof(range.front)[] read(R)(ref R range,size_t amount){
    auto data=new typeof(range.front)[amount];
    /+... read into data ...+/
    return data[0..actual_amount];
}
----
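
Filling in the elided part, a minimal sketch of such a default could look like the following. It assumes the element type is mutable and assignable (so it fits `ubyte[]` or `int[]`, not a `string`, whose range element type is `dchar`), and it uses `ElementType` from `std.range.primitives` instead of `typeof(range.front)`. Treat it as an illustration of the idea, not a proposed Phobos signature.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

/// Hypothetical default `read`: copies up to `amount` elements from any
/// input range into a freshly allocated array and advances the range.
ElementType!R[] read(R)(ref R range, size_t amount)
if (isInputRange!R)
{
    alias E = ElementType!R;

    auto data = new E[](amount);
    size_t actual = 0;

    // Stop early if the range runs out before `amount` elements.
    while (actual < amount && !range.empty)
    {
        data[actual++] = range.front;
        range.popFront();
    }
    return data[0 .. actual];
}
```

A range with a faster bulk path (a file, for instance) could then supply its own `read` member, which would take precedence over this free function in a `range.read(n)` call.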
May 05, 2015
On Tuesday, 5 May 2015 at 01:28:03 UTC, Freddy wrote:
> Wait, bad example. Better:
> ----
> void func(R)(R range){//expects range of ubyte
>     ubyte[] data=range.read(VERY_BIG_NUMBER);
>     ubyte[] other_data=range.read(OTHER_VERY_BIG_NUMBER);
> }
> ----
> which would be more efficient for a file but still work for other ranges, compared to looping through the range and appending each element to data.

How would it be more efficient? As I said, if you pass in `file.byChunk(some_amount).joiner`, this will still read the file in large chunks. If anything it's less efficient now, because `read` has to allocate an array on every call (easily avoidable by passing in a reusable buffer, but still).

Equivalent code with ranges:

    auto range = file.byChunk(4096).joiner;
    ubyte[] data = range.take(VERY_BIG_NUMBER).array;
    ubyte[] other_data = range.take(OTHER_VERY_BIG_NUMBER).array;
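
One caveat with the snippet above: `take` receives a copy of the range, so whether `range` itself ends up advanced depends on the range type. A hedged sketch of one way to make the advance explicit is to wrap the range in `std.range.refRange`. The helper name `takeAndAdvance` is made up for this example, and the in-memory chunks in the usage note stand in for `file.byChunk(4096)`.

```d
import std.algorithm.iteration : joiner;
import std.array : array;
import std.range : refRange, take;

/// Hypothetical helper: copies up to `n` elements out of `range` into a
/// new array and advances `range` itself past them.
/// Note: for sliceable ranges such as plain arrays, `take` returns a
/// slice and does not advance the source, so this is meant for
/// non-sliceable ranges such as the result of `joiner`.
auto takeAndAdvance(R)(ref R range, size_t n)
{
    // refRange forwards popFront to the original through a pointer.
    return refRange(&range).take(n).array;
}
```

For example, with `ubyte[][] chunks = [[1, 2, 3], [4, 5, 6], [7, 8]]` and `auto range = chunks.joiner`, two successive calls with `n = 4` and `n = 10` should hand back the first four bytes and then the remaining four.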
May 05, 2015
> How would it be more efficient? As I said, if you pass in `file.byChunk(some_amount).joiner`, this will still read the file in large chunks. If anything it's less efficient now, because `read` has to allocate an array on every call (easily avoidable by passing in a reusable buffer, but still).
>
> Equivalent code with ranges:
>
>     auto range = file.byChunk(4096).joiner;
>     ubyte[] data = range.take(VERY_BIG_NUMBER).array;
>     ubyte[] other_data = range.take(OTHER_VERY_BIG_NUMBER).array;

The range solution copies from a buffer to a newly allocated array many times, doing many system calls.
The read(stream) solution allocates a new array and does one system call.
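
To make the second case concrete, here is a minimal sketch of a file-specific `read`, assuming it is built on `std.stdio.File.rawRead`. The function is hypothetical. `rawRead` issues a single `fread` for the whole buffer; whether that becomes exactly one system call depends on the C library's buffering.

```d
import std.stdio : File;

/// Hypothetical file overload of `read`: one allocation and one
/// `rawRead` call for the whole request.
ubyte[] read(ref File file, size_t amount)
{
    // rawRead rejects an empty buffer, so handle that case up front.
    if (amount == 0)
        return null;

    auto data = new ubyte[](amount);
    // rawRead returns the slice of `data` that was actually filled,
    // which is shorter than `amount` at end of file.
    return file.rawRead(data);
}
```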

Sorry for the miscommunication.