Thread overview
[phobos] Improvement of stream
Sep 08, 2010
SHOO
Sep 19, 2010
SHOO
Sep 21, 2010
Shin Fujishiro
Sep 23, 2010
SHOO
Sep 27, 2010
Shin Fujishiro
September 09, 2010
The Japanese D community recently discussed Stream and gathered up the past articles on the subject, so I am introducing that collection here.

This post is an English translation of a summary written by @s50.
( Thanks s50! ( https://twitter.com/s50 ),
See also the Japanese article at
http://dusers.dip.jp/modules/forum/index.php?topic_id=75#post_id291 .)

The past discussions about input/output are as follows:


* http://lists.puremagic.com/pipermail/digitalmars-d/2008-October/043320.html
This is the first post in which Andrei says he would abandon std.stream.
At this stage, the plan had not yet matured.

* http://lists.puremagic.com/pipermail/digitalmars-d/2009-January/048184.html
Here Andrei again expresses the idea of abolishing std.stream.

* http://lists.puremagic.com/pipermail/digitalmars-d/2009-February/049385.html
Andrei strongly criticized std.stream.
He seems to have been thinking about Range-based input/output at this time.


*** Process to reach getNext ***
The central figures are Andrei and Steven.
Andrei has his own view on using Ranges for I/O.
Steven is skeptical about Ranges, but agrees that a new input/output
design should be developed.

* http://lists.puremagic.com/pipermail/digitalmars-d/2009-May/056773.html
The first opinion about popNext appears here.

* http://lists.puremagic.com/pipermail/digitalmars-d/2009-May/056859.html
In this thread Steve strongly recommends popNext.

* http://lists.puremagic.com/pipermail/digitalmars-d/2009-July/060404.html
The discussion moves from popNext to getNext.
Andrei argued that "the simplest and most natural interface for a pure
input stream has only one function getNext which at the same time gets
the element and bumps the stream."

* http://lists.puremagic.com/pipermail/digitalmars-d/2010-March/073856.html
The topic comes up again.
Related: [Issue 4025] New: Making network with the std.stdio.File interface

* http://lists.puremagic.com/pipermail/phobos/2010-March/000213.html
Adam's approach to using std.stdio.File for sockets.
The focus of this thread is reading variable-length data packets.

* http://lists.puremagic.com/pipermail/digitalmars-d/2010-July/079011.html
Andrei's opinion about getNext is given here.
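
For readers who have not followed the linked threads, the one-function interface quoted in this section might look roughly like the sketch below. The names `ArrayStream` and `sumAll`, and the signature `bool getNext(ref T)`, are assumptions made for illustration; they are not taken from the linked posts.

```d
// Hypothetical one-function input stream: getNext both fetches the
// element and advances the stream, as in the quoted description.
struct ArrayStream(T)
{
    private const(T)[] data;

    // Stores the next element in `item` and advances.
    // Returns false once the stream is exhausted.
    bool getNext(ref T item)
    {
        if (data.length == 0)
            return false;
        item = data[0];
        data = data[1 .. $];
        return true;
    }
}

// Consumes a stream of ints using nothing but getNext.
int sumAll(S)(ref S stream)
{
    int total = 0;
    int x;
    while (stream.getNext(x))
        total += x;
    return total;
}

unittest
{
    auto s = ArrayStream!int([1, 2, 3]);
    assert(sumAll(s) == 6);
    int x;
    assert(!s.getNext(x)); // already exhausted
}
```

The sketch can be checked with `dmd -unittest -main`. Note that there is no separate `front`/`popFront` pair: a caller cannot look at an element without consuming it.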



*** Ranges and Handles ***
Recently, Andrei seems to think that input/output should be performed by
Ranges (high-level interface) and Handles (primitive interface).
(Just for the record, I agree with this opinion.)

* http://lists.puremagic.com/pipermail/digitalmars-d/2009-May/056942.html
At first, his opinion was that Ranges should handle I/O instead of Streams.

* http://lists.puremagic.com/pipermail/phobos/2010-March/000106.html
More recently, he seems to think that Ranges should be provided by Handles.

* http://lists.puremagic.com/pipermail/phobos/2010-April/000272.html Let's not forget that File isn't a range. Let's call it a "stream handle" (I push *Steam*. Speaking of Range, it is Microwave, Oven and Overheating-Steam! :-) )

* http://lists.puremagic.com/pipermail/digitalmars-d/2010-June/078675.html
"It's best to have a handle/ranges architecture in which the handle
(e.g. File) is responsible for opening, closing, and managing the
connection, and several ranges are responsible for fetching data in
various ways (by character, by chunk, by line etc.)"
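
The handle/ranges split in the last quote can already be seen in std.stdio.File, which hands out `byLine` and `byChunk` ranges. The sketch below only illustrates that split; the helper names `countLines` and `countBytes` and the scratch file name are assumptions for the example.

```d
import std.file : remove, write;
import std.stdio : File;

// Opens `path` once (the handle) and counts its lines through the
// byLine range that the handle hands out.
size_t countLines(string path)
{
    auto f = File(path, "r");   // handle: owns the connection
    size_t n = 0;
    foreach (line; f.byLine())  // range: fetches data line by line
        ++n;
    return n;
}

// Same handle type, different range: fetches fixed-size chunks.
size_t countBytes(string path)
{
    auto f = File(path, "rb");
    size_t n = 0;
    foreach (chunk; f.byChunk(4)) // ubyte[] slices of up to 4 bytes
        n += chunk.length;
    return n;
}

unittest
{
    enum path = "handle_ranges_demo.txt"; // hypothetical scratch file
    write(path, "one\ntwo\nthree\n");
    scope (exit) remove(path);
    assert(countLines(path) == 3);
    assert(countBytes(path) == 14);
}
```

Opening and closing stay with `File`; how the data is cut up is decided entirely by the range chosen.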

September 08, 2010
Thanks! I'm impressed by all the attention. In brief, I'd say - let's start with a simple streaming interface that accepts user-defined buffers and expand from that.

Andrei

September 19, 2010
I think there are two open problems regarding I/O operations:
- Location of buffering layers.
- Direction of seeking.

Apart from these two problems, the design follows naturally from the requirements.
The requirements are:
- Handle must be easy to extend.
   For File, Socket, Pipe, Memory, VirtualFile, USB, Bluetooth,
   SerialPort, and so on. Handle should be an interface that can
   support them with minimal work. This means that Handle cannot
   require optional functions such as byLine.
- Usage should be simple for the user.
   This amounts to supporting Ranges. General-purpose functions or
   classes that generate a Range are needed so that a Handle can be
   used easily.

To realize them, the concept must be divided into at least two parts
(Handles and Ranges), or more (+ Port or Stream).
The two difficult problems listed at the top arise when I think about this division.
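
As a rough illustration of the two requirements, here is a minimal sketch in which a Handle needs only one primitive and a general-purpose range is layered on top. `MemoryHandle`, `ByChunk`, `byChunk` and the `read(ubyte[])` signature are all hypothetical names chosen for this example, not a proposed Phobos API.

```d
import std.string : representation;

// Hypothetical minimal Handle: the only required primitive is read().
struct MemoryHandle
{
    private const(ubyte)[] src;

    // Copies up to buf.length bytes into buf; returns the count (0 at end).
    size_t read(ubyte[] buf)
    {
        const n = buf.length < src.length ? buf.length : src.length;
        buf[0 .. n] = src[0 .. n];
        src = src[n .. $];
        return n;
    }
}

// General-purpose input range that works with any type offering read().
struct ByChunk(Handle)
{
    private Handle* handle;
    private ubyte[] buf;
    private size_t filled;

    this(Handle* h, size_t chunkSize)
    {
        handle = h;
        buf = new ubyte[chunkSize];
        popFront(); // prime the first chunk
    }

    @property bool empty() const { return filled == 0; }
    @property const(ubyte)[] front() const { return buf[0 .. filled]; }
    void popFront() { filled = handle.read(buf); }
}

// Convenience function so callers need not spell out the type.
auto byChunk(Handle)(ref Handle h, size_t chunkSize)
{
    return ByChunk!Handle(&h, chunkSize);
}

unittest
{
    auto h = MemoryHandle("hello".representation);
    size_t total;
    foreach (chunk; byChunk(h, 2)) // chunks of 2, 2 and 1 bytes
        total += chunk.length;
    assert(total == 5);
}
```

A Socket or SerialPort handle would only have to supply its own `read`; `byChunk` would not change. Where buffering should live and how seeking fits in are exactly the parts this sketch leaves open.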

September 21, 2010
SHOO <zan77137 at nifty.com> wrote:
> I think there are two open problems regarding I/O operations:
> - Location of buffering layers.
> - Direction of seeking.
>
...snip...
>
> To realize them, the concept must be divided into at least two parts
> (Handles and Ranges), or more (+ Port or Stream).
> The two difficult problems listed at the top arise when I think about this division.

How about putting a buffering layer between the two you mentioned?  Not only does it solve the who-does-buffering problem, it also opens up a bit of freedom in the lowermost I/O device layer.


Shin
September 23, 2010
Hmm... Does that mean I/O processing has to be relayed through three classes?

auto handle = FileHandle("file");
scope (exit) handle.close();
auto buf = MemoryBuffer(handle); // buffering layer wraps the handle
auto range = byLine(buf);        // line range reads from the buffer

I think that is slightly complicated. Why does it have to look like this?

BTW, I don't know exactly what buffers must do. What are the requirements for buffers?

September 24, 2010
This is a good point. Allow me to expand it a bit and say that we need a collection of solid use cases that we want to support. Without them, we may come up with a design that does not work well with certain use patterns.

Example of a requirement: "Given a block-oriented stream, define a line-oriented range on top of it."

Do we need such a thing? Probably. Can it be implemented efficiently with the rawRead(ubyte[]) primitive? NO.

struct ByLine(BlockRange) {
     private BlockRange _input;
     private char[] store;
     ...
     void popFront() {
         // read one line from _input
     }
}

The problem is that the line reader must sometimes _append_ data to its buffer (in situations when it has read a long line that doesn't fit in one buffer). But the input stream does not support appending.
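
To make the appending problem concrete, here is a minimal sketch of a line reader that has nothing but a block-read primitive underneath. `BlockSource` and `LineReader` are hypothetical stand-ins, and this `rawRead` is only modelled loosely on a read-into-buffer primitive.

```d
import std.string : representation;

// Hypothetical block source standing in for a rawRead-style primitive.
struct BlockSource
{
    private const(ubyte)[] src;

    // Fills buf with up to buf.length bytes and returns the filled slice.
    ubyte[] rawRead(ubyte[] buf)
    {
        const n = buf.length < src.length ? buf.length : src.length;
        buf[0 .. n] = src[0 .. n];
        src = src[n .. $];
        return buf[0 .. n];
    }
}

// Reads one line (without the '\n') using only block reads. Bytes read
// past the newline are kept in `pending` for the next call.
struct LineReader
{
    private BlockSource input;
    private ubyte[] block;   // fixed-size transfer buffer
    private ubyte[] pending; // unread remainder of the last block

    this(BlockSource src, size_t blockSize)
    {
        input = src;
        block = new ubyte[blockSize];
    }

    // Returns false when no more data is available.
    bool readLine(ref char[] store)
    {
        store.length = 0;
        bool gotData = false;
        for (;;)
        {
            if (pending.length == 0)
            {
                pending = input.rawRead(block);
                if (pending.length == 0)
                    return gotData;
            }
            gotData = true;
            foreach (i, b; pending)
            {
                if (b == '\n')
                {
                    store ~= cast(char[]) pending[0 .. i];
                    pending = pending[i + 1 .. $];
                    return true;
                }
            }
            // No newline in this block: the line continues, so the
            // reader has to append and fetch another block.
            store ~= cast(char[]) pending;
            pending = null;
        }
    }
}

unittest
{
    auto r = LineReader(BlockSource("ab\nlonger line\n".representation), 4);
    char[] line;
    assert(r.readLine(line) && line == "ab");
    assert(r.readLine(line) && line == "longer line");
    assert(!r.readLine(line));
}
```

Every byte is copied twice here, once into `block` and once into `store`, because the source cannot append directly to the reader's buffer.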

BTW this has been a long source of irritation with C's stdio: the only ways to read a line have been reading one character at a time with fgetc() (pig slow), using gets() (unsafe), using fgets() (inefficient for long lines), or using getline() (nonportable, see http://www.gnu.org/s/libc/manual/html_node/Line-Input.html).

Moral of the story? We need to have a number of well-defined scenarios in mind when defining a streaming interface.


Andrei

September 27, 2010
Thus I think we need a buffering layer that exposes a randomly accessible array to upper layers.  ByLine() can be easily and efficiently implemented with the following primitives defined:

Buffer
{
    // The entire buffer.
    ubyte[] buffer();

    // Slice of buffer() where data is available.
    ubyte[] available();

    // Moves the beginning of available() by n in buffer().
    void bump(sizediff_t n);

    // Reads next blob from a source.
    bool fetch();
}

Yes, cstdio-esque rawRead() is no good for high-level ByLine.  What
high-level I/O entities want is:  A randomly accessible buffer. Device
handles may expose block-oriented streaming primitives, but they
must be made "partially random accessible" by the buffering layer.
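
A minimal sketch of how a line reader could sit on these four primitives follows. `MemoryBuffer`, `nextLine` and the blob size are assumptions for illustration, `ptrdiff_t` stands in for `sizediff_t`, and this `fetch()` simply reallocates on every call, which a real buffering layer would presumably avoid.

```d
import std.string : representation;

// Hypothetical in-memory implementation of the four primitives.
struct MemoryBuffer
{
    private const(ubyte)[] source; // stands in for a device handle
    private ubyte[] store;         // backing storage
    private size_t begin, end;     // available() == store[begin .. end]
    private size_t blobSize;

    this(const(ubyte)[] src, size_t blobSize)
    {
        source = src;
        this.blobSize = blobSize;
    }

    ubyte[] buffer() { return store; }
    ubyte[] available() { return store[begin .. end]; }

    // Moves the beginning of available() by n.
    void bump(ptrdiff_t n) { begin += n; }

    // Appends the next blob to available(); returns false at end of input.
    bool fetch()
    {
        if (source.length == 0)
            return false;
        const n = blobSize < source.length ? blobSize : source.length;
        const kept = end - begin;
        auto fresh = new ubyte[kept + n];
        fresh[0 .. kept] = store[begin .. end]; // keep unread data
        fresh[kept .. $] = source[0 .. n];      // append the new blob
        source = source[n .. $];
        store = fresh;
        begin = 0;
        end = fresh.length;
        return true;
    }
}

// Finds the next line (without '\n') as a slice of the buffer itself.
// Uses only available(), bump() and fetch(); returns false at end of input.
bool nextLine(ref MemoryBuffer buf, ref char[] line)
{
    size_t scanned = 0; // bytes already searched for '\n'
    for (;;)
    {
        auto avail = buf.available();
        foreach (i; scanned .. avail.length)
        {
            if (avail[i] == '\n')
            {
                line = cast(char[]) avail[0 .. i];
                buf.bump(i + 1);
                return true;
            }
        }
        scanned = avail.length;
        if (!buf.fetch())
        {
            if (avail.length == 0)
                return false;
            line = cast(char[]) avail; // last line without a newline
            buf.bump(avail.length);
            return true;
        }
    }
}

unittest
{
    auto buf = MemoryBuffer("ab\nlonger line\nend".representation, 4);
    char[] line;
    assert(nextLine(buf, line) && line == "ab");
    assert(nextLine(buf, line) && line == "longer line");
    assert(nextLine(buf, line) && line == "end");
    assert(!nextLine(buf, line));
}
```

Because a long line accumulates inside the buffering layer, the line reader needs no store of its own and can hand out a slice of `available()`.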


Shin

September 28, 2010
On 9/26/10 23:34 PDT, Shin Fujishiro wrote:
> Thus I think we need a buffering layer that exposes a randomly accessible array to upper layers.  ByLine() can be easily and efficiently implemented with the following primitives defined:
>
> Buffer
> {
>      // The entire buffer.
>      ubyte[] buffer();
>
>      // Slice of buffer() where data is available.
>      ubyte[] available();
>
>      // Moves the beginning of available() by n in buffer().
>      void bump(sizediff_t n);
>
>      // Reads next blob from a source.
>      bool fetch();
> }
>
> Yes, cstdio-esque rawRead() is no good for high-level ByLine.  What
> high-level I/O entities want is:  A randomly accessible buffer. Device
> handles may expose block-oriented streaming primitives, but they
> must be made "partially random accessible" by the buffering layer.

But that's too big an interface. When would one ever need buffer[], when the beginning and the end of the buffer may be used for different portions of the input?

A better stream interface, which actually extends the standard input range interface:

struct Stream(T)
{
     @property T[] front();
     void munchFront(size_t bytes) in { assert(bytes <= front.length); }
     bool empty();
     void popFront();
}

This still doesn't allow filling the buffer with a new line, but it does offer a client the ability to copy lines into its own buffer.
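
A minimal sketch of how a client might copy lines into its own buffer through this interface is given below. `MemoryStream` is a hypothetical ubyte-only stand-in for `Stream(T)`, and having `munchFront` load the next block once the current one is used up is an assumption of the sketch, not something the interface above specifies.

```d
import std.string : representation;

// Hypothetical in-memory stream with the four members listed above.
struct MemoryStream
{
    private const(ubyte)[] source;
    private const(ubyte)[] window; // current front()
    private size_t blockSize;

    this(const(ubyte)[] src, size_t blockSize)
    {
        source = src;
        this.blockSize = blockSize;
        popFront(); // load the first block
    }

    @property const(ubyte)[] front() { return window; }
    @property bool empty() { return window.length == 0; }

    // Consumes n elements from the start of front().
    void munchFront(size_t n)
    {
        assert(n <= window.length);
        window = window[n .. $];
        if (window.length == 0)
            popFront(); // assumption: refill once the block is used up
    }

    // Discards the rest of the current block and loads the next one.
    void popFront()
    {
        const n = blockSize < source.length ? blockSize : source.length;
        window = source[0 .. n];
        source = source[n .. $];
    }
}

// Copies the next line (without '\n') into the client's own buffer.
bool readLine(ref MemoryStream s, ref char[] line)
{
    line.length = 0;
    if (s.empty)
        return false;
    while (!s.empty)
    {
        auto chunk = s.front;
        foreach (i, b; chunk)
        {
            if (b == '\n')
            {
                line ~= cast(const(char)[]) chunk[0 .. i];
                s.munchFront(i + 1); // consume the line and its newline
                return true;
            }
        }
        line ~= cast(const(char)[]) chunk; // partial line: copy and go on
        s.popFront();
    }
    return true; // last line had no trailing newline
}

unittest
{
    auto s = MemoryStream("ab\nlonger line\nend".representation, 4);
    char[] line;
    assert(readLine(s, line) && line == "ab");
    assert(readLine(s, line) && line == "longer line");
    assert(readLine(s, line) && line == "end");
    assert(!readLine(s, line));
}
```

The client reads straight out of `front` without an intermediate transfer buffer, but a line that spans blocks still has to be assembled in the client's `line`.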

Andrei
January 02, 2011
Does the currently proposed streaming (on digitalmars.d) work for such needs?

Andrei
