[phobos] Improvement of stream
Jun 30, 2010
SHOO
Jul 05, 2010
Ellery Newcomer
Jul 05, 2010
Ellery Newcomer
Jul 05, 2010
Shin Fujishiro
Jul 06, 2010
Shin Fujishiro
Jul 05, 2010
Michel Fortin
Jul 05, 2010
Michel Fortin
Jul 05, 2010
Sean Kelly
Jul 05, 2010
Ellery Newcomer
Jul 05, 2010
Sean Kelly
Jul 06, 2010
SHOO
Aug 27, 2010
SHOO
July 01, 2010
In the Japanese community, improvements to the stream module are being discussed.

The new stream has the following characteristics.

- Supporting Range
- Duck typing
- Accepting interface optionally (main is struct)

A very simple draft is here:
http://ideone.com/BU3ev

satoru_h ( http://twitter.com/satoru_h ) is now working on it further.

What do you think?
July 04, 2010
Thanks! I'll send feedback on this tomorrow morning.

Andrei

On 06/30/2010 11:48 AM, SHOO wrote:
> In Japanese community, improvement of stream is discussed.
>
> New stream has following characteristics.
>
> - Supporting Range
> - Duck typing
> - Accepting interface optionally (main is struct)
>
> Very simple draft is here:
> http://ideone.com/BU3ev
>
> Now, satoru_h ( http://twitter.com/satoru_h ) works for more.
>
> What do you think?
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
July 05, 2010
Hello,


I've been looking over the streaming proposal. Allow me to make a few comments:

- The input ranges _are_ intended to be input streams, and the output ranges _are_ intended to be output streams. If they don't fulfill that purpose, they should be changed (instead of adding new categories).

- Input streams have the read primitive. What is wrong with an input range of ubyte[]? Then accessing front() gives you a buffer and popFront reads in a new buffer.

- What does flush() do for input streams?

- I don't think close() is a good primitive for an input stream. An input stream should originate in a connection handle, and it's the handle, not the stream, that should control the connection. For example:

auto s = Socket("123.456.455.1");
auto stream = s.byChunk(1024 * 16);
// ... stream is an input range of ubyte[] ...
s.close();

If the range defines a close() operation, then we need to start talking about it defining an open() operation, which complicates matters. Why not leave ranges for traversal and handles for connections?

- On to output streams. What's wrong with having an output range of ubyte[]? Its put() primitive would be the same as the proposed write() routine.

- flush() would be a good optional addition to an output stream.

- I have the same feeling about close() for output streams.

- The Seekable idea is good, I was thinking of it for a while. It expresses a range that is not as cheap for random access as a random-access range, but also that makes random seeking possible. What kind of algorithms could use Seekable?

- What's the purpose of StreamWrapper? And why is it reading in the write() primitive?

- ByLine is a bit awkward because it needs to read buffers of size 1. Clearly there is some problem there. The right way is to build ByLine!Char on top of a stream of Char, not a stream of Char[]. (Speaking of which, I just checked in BlockingInputReader. It does read one character at a time but it has an inefficiency caused by the FILE* interface.)

- What does FileStream do that File doesn't or can't do?

Let me know what you think.


Andrei

July 05, 2010
On 07/05/2010 12:37 AM, Andrei Alexandrescu wrote:
> - Input streams have the read primitive. What is wrong with an input range of ubyte[]? Then accessing front() gives you a buffer and popFront reads in a new buffer.
How do you handle heterogeneous reads? e.g.

ubyte[] x = new ubyte[](4);
inp.read(x);
ubyte[] y = new ubyte[](100);
inp.read(y);

> - What's the purpose of StreamWrapper?

Having input stream types which aren't runtime type compatible is kind of a sucky situation. That's what the interfaces are for, and that's what the wrapper provides for, e.g., input streams which are structs.

> And why is it reading in the write() primitive?
heh heh


July 05, 2010
Is the handle a similar concept as the container?

That is, I reckon a handle is a reference type object that has its own resource and provides some ranges on top of it for accessing and manipulating the resource:

class ConceptualHandle
{
    void open();    // ?
    void close();
    ByChunk byChunk(size_t n);
    ByChar byChar();
    Writer blockWriter();
}
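
To make the shape concrete, here is a minimal sketch, assuming an in-memory byte array stands in for the real resource (file, socket). MemoryHandle, isOpen and the nested ByChunk are hypothetical names for illustration only; nothing here is taken from Phobos or from the posted draft.

```d
import std.algorithm.comparison : min;

/// Hypothetical handle: owns the resource and hands out ranges over it.
final class MemoryHandle
{
    private const(ubyte)[] data; // stands in for the real connection
    private size_t pos;
    private bool opened;

    this(const(ubyte)[] data) { this.data = data; opened = true; }

    /// Only the handle controls the connection.
    void close() { opened = false; }
    @property bool isOpen() const { return opened; }

    /// Input range of ubyte[] chunks; it has no open() or close() of its own.
    ByChunk byChunk(size_t n) { return ByChunk(this, n); }

    static struct ByChunk
    {
        private MemoryHandle handle;
        private size_t n;
        private const(ubyte)[] chunk;

        this(MemoryHandle handle, size_t n)
        {
            assert(n > 0);
            this.handle = handle;
            this.n = n;
            popFront(); // prime the first chunk
        }

        @property bool empty() const { return chunk.length == 0; }
        @property const(ubyte)[] front() const { return chunk; }

        void popFront()
        {
            // A closed handle simply ends the traversal in this sketch.
            if (!handle.opened) { chunk = null; return; }
            immutable end = min(handle.pos + n, handle.data.length);
            chunk = handle.data[handle.pos .. end];
            handle.pos = end;
        }
    }
}

void main()
{
    ubyte[] bytes = [10, 20, 30, 40, 50];
    ubyte[] first = [10, 20];
    ubyte[] second = [30, 40];

    auto h = new MemoryHandle(bytes);
    auto stream = h.byChunk(2);
    assert(stream.front == first);
    stream.popFront();
    assert(stream.front == second);

    h.close();          // the handle, not the range, ends the connection
    stream.popFront();
    assert(stream.empty);
}
```

Because the handle is a class, every range it hands out shares the same read position, which is one possible reading of "reference type object that has its own resource".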


Shin

July 05, 2010
On 07/05/2010 09:59 AM, Shin Fujishiro wrote:
> Is the handle a similar concept as the container?

Well, not quite, because a container does not need to put an emphasis on "opening" and "closing".

> That is, I reckon a handle is a reference type object that has its own resource and provides some ranges on top of it for accessing and manipulating the resource:
>
> class ConceptualHandle
> {
>      void open();    // ?
>      void close();
>      ByChunk byChunk(size_t n);
>      ByChar byChar();
>      Writer blockWriter();
> }

Yes, that seems plausible.

Andrei
July 05, 2010
On 07/05/2010 09:10 AM, Ellery Newcomer wrote:
> On 07/05/2010 12:37 AM, Andrei Alexandrescu wrote:
>> - Input streams have the read primitive. What is wrong with an input range of ubyte[]? Then accessing front() gives you a buffer and popFront reads in a new buffer.
> How do you handle heterogeneous reads? e.g.
>
> ubyte[] x = new ubyte[](4);
> inp.read(x);
> ubyte[] y = new ubyte[](100);
> inp.read(y);

Good point. With ByChunk you only get to set the size once at construction time. A natural extension of that would be to define a property bufferSize that you can assign before calling popFront. That would allow code like:

inp.bufferSize = 4;
inp.popFront();
process(inp.front);
inp.bufferSize = 100;
inp.popFront();
process(inp.front);

I reckon that just calling read() with different lengths is a bit more appealing. Also if you need to save the data you need to call inp.front.dup which makes for an extra copy. The question is how often you need to do all that versus just getting whatever data is available.
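
A minimal sketch of the bufferSize idea, assuming an in-memory source instead of a real file. ResizableChunks is a hypothetical name, and unlike the real ByChunk it does not reuse an internal buffer, so the .dup call below only shows where the extra copy would occur.

```d
import std.algorithm.comparison : min;

/// Hypothetical chunked input range whose chunk size can change between reads.
struct ResizableChunks
{
    private const(ubyte)[] source; // bytes not yet consumed
    private const(ubyte)[] chunk;  // current front
    size_t bufferSize;             // assign before calling popFront()

    this(const(ubyte)[] source, size_t bufferSize)
    {
        this.source = source;
        this.bufferSize = bufferSize;
        popFront(); // prime the first chunk
    }

    @property bool empty() const { return chunk.length == 0; }
    @property const(ubyte)[] front() const { return chunk; }

    void popFront()
    {
        // Take at most bufferSize bytes, fewer if the source runs out.
        immutable n = min(bufferSize, source.length);
        chunk = source[0 .. n];
        source = source[n .. $];
    }
}

void main()
{
    auto bytes = new ubyte[](10);
    foreach (i, ref b; bytes)
        b = cast(ubyte) i;

    auto inp = ResizableChunks(bytes, 4);
    assert(inp.front.length == 4);   // first read: 4 bytes

    inp.bufferSize = 100;
    inp.popFront();
    assert(inp.front.length == 6);   // second read: whatever is left, up to 100

    auto saved = inp.front.dup;      // the extra copy needed to keep the data
    inp.popFront();
    assert(inp.empty);
    assert(saved.length == 6);
}
```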

>> - What's the purpose of StreamWrapper?
>
> Having input stream types which aren't runtime type compatible is kind of a sucky situation. That's what the interfaces are for, and that's what the wrapper provides for e.g. input streams which are structs.

Makes sense. Probably with time we'll need to add interfaces for all kinds of ranges.

For wrapping structs in general, I hope we could have a more general version similar to BlackHole and WhiteHole - automatic wrapping. Consider:

struct A {
     this(int);
     void foo();
     int bar(string);
}

alias Classify!A ClassA;

ClassA is as if somebody sat down and wrote the following class by hand:

class ClassA {
     private A payload;
     this(int x) { payload = A(x); }
     void foo() { return payload.foo(); }
     int bar(string x) { return payload.bar(x); }
}

Things get more complicated with functions that receive or return type A, properties, qualifiers etc. I think Classify would be an excellent test of D's introspection capabilities, in addition to being useful in practice.
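
As a rough illustration only, here is a much simpler stand-in that forwards calls through opDispatch instead of generating real members by introspection. It is not the design described above: forwarded methods are not actual class members, so this version cannot implement an interface, and it ignores properties, qualifiers and functions taking or returning A. The bodies given to A are made up so the example can run.

```d
/// Hypothetical, simplified Classify: wraps a struct S in a class and
/// forwards member function calls to the payload by name.
final class Classify(S) if (is(S == struct))
{
    private S payload;

    /// Forward constructor arguments to the struct's constructor.
    this(Args...)(Args args) { payload = S(args); }

    /// Forward any other call, e.g. obj.foo(1) becomes payload.foo(1).
    auto opDispatch(string name, Args...)(Args args)
    {
        return mixin("payload." ~ name ~ "(args)");
    }
}

struct A
{
    private int n;
    this(int n) { this.n = n; }
    void foo() { ++n; }
    int bar(string s) { return n + cast(int) s.length; }
}

alias ClassA = Classify!A;

void main()
{
    auto a = new ClassA(40);
    a.foo();                  // n becomes 41
    assert(a.bar("x") == 42); // 41 + length of "x"
}
```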


Andrei
July 05, 2010
On 2010-07-05 at 1:37, Andrei Alexandrescu wrote:

> - The Seekable idea is good, I was thinking of it for a while. It expresses a range that is not as cheap for random access as a random-access range, but also that makes random seeking possible. What kind of algorithms could use Seekable?

Well, you might want to skip over certain parts of a stream. Say you know there are 32K bytes of text you don't need; you could skip them in one go. Or it could allow you to rewind the stream to a previous location (such as when parsing a zip file, where the table of contents is at the end of the file and tells you where in the archive each compressed file is located).

Perhaps you could implement a buffered stream as seekable too, with a seek range limited to the buffer's size. It could act as a form of lookahead, or lookbehind.

But I have to admit the ability of a range to seek outside of the range's range (like rewinding) seems a little odd. It does fit better with a 'stream' concept.
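
A minimal sketch of both uses, assuming an in-memory byte array and a toy file layout in which the last byte stores the offset of the payload (a stand-in for a zip-style table of contents). SeekableBytes, seek and position are hypothetical names, not taken from the draft.

```d
/// Hypothetical seekable input range over bytes held in memory.
struct SeekableBytes
{
    private const(ubyte)[] data;
    private size_t pos;

    // Input range primitives.
    @property bool empty() const { return pos >= data.length; }
    @property ubyte front() const { return data[pos]; }
    void popFront() { ++pos; }

    // Assumed Seekable primitives.
    @property size_t position() const { return pos; }
    @property size_t length() const { return data.length; }
    void seek(size_t newPos)
    {
        assert(newPos <= data.length);
        pos = newPos;
    }
}

void main()
{
    // Toy layout: two header bytes, three payload bytes, one trailer byte
    // holding the payload's offset.
    ubyte[] file = [0xAA, 0xBB, 7, 8, 9, 2];
    auto s = SeekableBytes(file);

    s.seek(s.length - 1);        // jump to the trailer at the end
    immutable offset = s.front;  // offset of the payload
    s.seek(offset);              // rewind to the payload
    assert(s.front == 7);

    s.seek(s.position + 2);      // skip ahead in one go
    assert(s.front == 9);
}
```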

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



July 05, 2010
On 07/05/2010 11:14 AM, Michel Fortin wrote:
> On 2010-07-05 at 1:37, Andrei Alexandrescu wrote:
>
>> - The Seekable idea is good, I was thinking of it for a while. It expresses a range that is not as cheap for random access as a random-access range, but also that makes random seeking possible. What kind of algorithms could use Seekable?
>
> Well, you might want to skip over certain parts of a stream. Say you know there are 32K bytes of text you don't need; you could skip them in one go. Or it could allow you to rewind the stream to a previous location (such as when parsing a zip file, where the table of contents is at the end of the file and tells you where in the archive each compressed file is located).
>
> Perhaps you could implement a buffered stream as seekable too, with a seek range limited to the buffer's size. It could act as a form of lookahead, or lookbehind.
>
> But I have to admit the ability of a range to seek outside of the range's range (like rewinding) seems a little odd. It does fit better with a 'stream' concept.

Ranges are streams!

Andrei
July 05, 2010
On 07/05/2010 10:49 AM, Andrei Alexandrescu wrote:
> Good point. With ByChunk you only get to set the size once at construction time. A natural extension of that would be to define a property bufferSize that you can assign before calling popFront. That would allow code like:
>
> inp.bufferSize = 4;
> inp.popFront();
> process(inp.front);
> inp.bufferSize = 100;
> inp.popFront();
> process(inp.front);
>
> I reckon that just calling read() with different lengths is a bit more appealing. Also if you need to save the data you need to call inp.front.dup which makes for an extra copy. The question is how often you need to do all that versus just getting whatever data is available.

Well, most of my D IO experience has been with binary files where you're trying to get out fields of specific byte length and put them in dedicated structures. Currently, I read the entire file (if phobos had a standard InputStream interface thing I wouldn't be limited to files) into an array and work with that. But that's kind of a horrid situation. Note that std.zip is in the same boat.

The other thing is that extra copy. Personally, I'd rather the user provide the buffer.

Also, I currently use an ad-hoc InputStream interface whose read signature is

ubyte[] read(ubyte[] buffer);

which returns the slice of the buffer which was read in.

Do you have any opinion of this signature vs. the traditional signature?
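
For reference, a minimal sketch of that signature, assuming an in-memory source; MemoryInput is a hypothetical name. The caller owns the buffer, so no extra copy is made, and the length of the returned slice carries the information that a count-returning read would give.

```d
import std.algorithm.comparison : min;

/// Hypothetical in-memory input stream whose read() returns the filled slice.
struct MemoryInput
{
    private const(ubyte)[] source; // bytes not yet consumed

    /// Fills as much of buffer as possible and returns the part that was filled.
    ubyte[] read(ubyte[] buffer)
    {
        immutable n = min(buffer.length, source.length);
        buffer[0 .. n] = source[0 .. n];
        source = source[n .. $];
        return buffer[0 .. n];
    }
}

void main()
{
    ubyte[] bytes = [1, 2, 3, 4, 5, 6];
    auto inp = MemoryInput(bytes);

    // Heterogeneous reads: a 4-byte field, then up to 100 bytes.
    auto x = new ubyte[](4);
    auto got = inp.read(x);
    assert(got == bytes[0 .. 4]);

    auto y = new ubyte[](100);
    got = inp.read(y);
    assert(got == bytes[4 .. $]);      // only 2 bytes were left

    assert(inp.read(y).length == 0);   // end of input: empty slice
}
```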

> For wrapping structs in general, I hope we could have a more general version similar to BlackHole and WhiteHole - automatic wrapping. Consider:
>
> struct A {
>     this(int);
>     void foo();
>     int bar(string);
> }
>
> alias Classify!A ClassA;
>
> ClassA is as if somebody sat down and wrote the following class by hand:
>
> class ClassA {
>     private A payload;
>     this(int x) { payload = A(x); }
>     void foo() { return payload.foo(); }
>     int bar(string x) { return payload.bar(x); }
> }
>
> Things get more complicated with functions that receive or return type A, properties, qualifiers etc. I think Classify would be an excellent test of D's introspection capabilities, in addition to being useful in practice.

That would be sweet

