July 05, 2010
On 07/05/2010 11:14 AM, Michel Fortin wrote:
> On 2010-07-05 at 1:37, Andrei Alexandrescu wrote:
>
> 
>> - The Seekable idea is good, I was thinking of it for a while. It expresses a range that is not as cheap for random access as a random-access range, but also that makes random seeking possible. What kind of algorithms could use Seekable?
>> 
> Well, you might want to skip over certain parts of a stream.
> 

like 'drop'

July 05, 2010
Sounds like an algorithm, not a range method. Seeking backwards with ranges seems weird though. Isn't one intrinsic property of ranges that they don't expand?

On Jul 5, 2010, at 9:44 AM, Ellery Newcomer <ellery-newcomer at utulsa.edu> wrote:

> On 07/05/2010 11:14 AM, Michel Fortin wrote:
>> On 2010-07-05 at 1:37, Andrei Alexandrescu wrote:
>> 
>> 
>>> - The Seekable idea is good, I was thinking of it for a while. It expresses a range that is not as cheap for random access as a random-access range, but also that makes random seeking possible. What kind of algorithms could use Seekable?
>>> 
>> Well, you might want to skip over certain parts of a stream.
>> 
> 
> like 'drop'
> 
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
July 05, 2010
On 2010-07-05 at 12:23, Andrei Alexandrescu wrote:

> On 07/05/2010 11:14 AM, Michel Fortin wrote:
>> But I have to admit the ability of a range to seek outside of the range's range (like rewinding) seems a little odd. It does fit better with a 'stream' concept.
> 
> Ranges are streams!

I'm not contesting that (at this time). What I meant is that allowing a range to seek beyond its bounds looks unnatural to me. If it was called a stream it wouldn't be so bad. It's mostly a naming thing.

For instance, let's say I have a file with this content:

	[0,1,2,3,4,5,6,7,8,9];

I call popFront 5 times, so the content of the range conceptually becomes this:

	[5,6,7,8,9];

Then I call a function, say seek(2), to return to the third element. Does this add back elements in my range?

	[2,3,4,5,6,7,8,9]

Adding back elements to a range seems quite un-range-like to me. Seeking to a position relative to the first element in the "container" (the file) would be even more so.

Now call it a stream and it looks more natural to do these operations. But perhaps it's only me.


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



July 05, 2010
On 07/05/2010 11:44 AM, Ellery Newcomer wrote:
> On 07/05/2010 11:14 AM, Michel Fortin wrote:
>> On 2010-07-05 at 1:37, Andrei Alexandrescu wrote:
>>
>>> - The Seekable idea is good, I was thinking of it for a while. It expresses a range that is not as cheap for random access as a random-access range, but also that makes random seeking possible. What kind of algorithms could use Seekable?
>> Well, you might want to skip over certain parts of a stream.
>
> like 'drop'

We have popFrontN in std.range that could detect and take advantage of the existence of a seek() function.
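
A minimal sketch of that detection, assuming a Seekable range exposes a member `size_t seek(size_t n)` that skips forward and returns the number of elements skipped. That member, the helper name `skipN`, and the toy `Counter` range are all hypothetical; none of them is an existing Phobos primitive.

```d
import std.range;

// Hypothetical helper: skip up to n elements, return how many were skipped.
// Uses seek() when the range has one, otherwise falls back to popFront.
size_t skipN(R)(ref R r, size_t n)
if (isInputRange!R)
{
    static if (is(typeof(r.seek(n)) : size_t))
    {
        return r.seek(n);          // one jump
    }
    else
    {
        size_t skipped = 0;
        for (; skipped < n && !r.empty; ++skipped)
            r.popFront();          // generic fallback, one element at a time
        return skipped;
    }
}

// Toy seekable range over the numbers [pos, end), for illustration only.
struct Counter
{
    size_t pos, end;
    size_t seekCalls;              // counts how often seek() was used

    @property bool empty() const { return pos >= end; }
    @property size_t front() const { return pos; }
    void popFront() { ++pos; }

    size_t seek(size_t n)
    {
        ++seekCalls;
        immutable step = n < end - pos ? n : end - pos;
        pos += step;
        return step;
    }
}

void main()
{
    auto c = Counter(0, 10);
    assert(skipN(c, 5) == 5 && c.front == 5 && c.seekCalls == 1);

    int[] a = [0, 1, 2, 3];
    assert(skipN(a, 10) == 4 && a.empty);   // arrays take the fallback path
}
```

The check is done at compile time with `static if`, so ranges without `seek` pay nothing for the detection.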

Andrei
July 05, 2010
I haven't decided if it's desirable, but this could be done by copying the range, just like with any other range. Some logic like this has to be in place anyway, since one could ask for two ranges from the same file object. Each range would either need its own handle or do some detection and possibly a seek on every operation. Alternatively, an attribute of file objects could be that all ranges they provide share the same state.
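
A small sketch of the copying approach, using an array as a stand-in for the file so that `save` is trivially available. This is only an illustration under that assumption; a file-backed range would need its own handle or position tracking to offer the same behavior.

```d
import std.range;

void main()
{
    int[] file = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

    auto start = file.save;   // keep a copy positioned at the beginning
    auto r = file;
    r.popFrontN(5);
    assert(r == [5, 6, 7, 8, 9]);

    // "Seeking back" to the third element: r itself never grows; a new
    // range is derived from the saved copy instead.
    auto rewound = start.save;
    rewound.popFrontN(2);
    assert(rewound == [2, 3, 4, 5, 6, 7, 8, 9]);
    assert(r == [5, 6, 7, 8, 9]);   // the advanced range is unaffected
}
```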

On Jul 5, 2010, at 9:57 AM, Michel Fortin <michel.fortin at michelf.com> wrote:

> On 2010-07-05 at 12:23, Andrei Alexandrescu wrote:
> 
>> On 07/05/2010 11:14 AM, Michel Fortin wrote:
>>> But I have to admit the ability of a range to seek outside of the range's range (like rewinding) seems a little odd. It does fit better with a 'stream' concept.
>> 
>> Ranges are streams!
> 
> I'm not contesting that (at this time). What I meant is that allowing a range to seek beyond its bounds looks unnatural to me. If it was called a stream it wouldn't be so bad. It's mostly a naming thing.
> 
> For instance, let's say I have a file with this content:
> 
>    [0,1,2,3,4,5,6,7,8,9];
> 
> I call popFront 5 times, so the content of the range conceptually becomes this:
> 
>    [5,6,7,8,9];
> 
> Then I call a function, say seek(2), to return to the third element. Does this add back elements in my range?
> 
>    [2,3,4,5,6,7,8,9]
> 
> Adding back elements to a range seems quite un-range-like to me. Seeking to a position relative to the first element in the "container" (the file) would be even more so.
> 
> Now call it a stream and it looks more natural to do these operations. But perhaps it's only me.
> 
> 
> -- 
> Michel Fortin
> michel.fortin at michelf.com
> http://michelf.com/
> 
> 
> 
July 06, 2010
Andrei Alexandrescu <andrei at erdani.com> wrote:
> On 07/05/2010 09:59 AM, Shin Fujishiro wrote:
> > Is the handle a similar concept as the container?
> 
> Well, not quite, because a container does not need to put an emphasis on "opening" and "closing".
> 
> > That is, I reckon a handle is a reference type object that has its own resource and provides some ranges on top of it for accessing and manipulating the resource:
> >
> > class ConceptualHandle
> > {
> >      void open();    // ?
> >      void close();
> >      ByChunk byChunk(size_t n);
> >      ByChar byChar();
> >      Writer blockWriter();
> > }
> 
> Yes, that seems plausible.
> 
> Andrei

Can handles (optionally or necessarily) have low-level I/O primitives
like rawRead() and rawWrite()?

class SomeHandle
{
    void open();
    void close();
    ByChunk byChunk(size_t n);
    ByChar byChar();

    size_t rawRead(ubyte[] buffer);    // low-level input primitive
}


byChunk etc. would suffice for usual purposes.  But sometimes we want to read some user-defined structure from a handle.  With the rawRead primitive, I can create my own input range for reading a sequence of variable-length packets from a handle:

ByPacket byPacket(SomeHandle handle);

// This input range uses rawRead() for reading Packets.
struct ByPacket
{
    @property bool empty();
    @property Packet front();
    void popFront();
}
struct Packet
{
    uint signature;
    ubyte[] content;
    uint checksum;
}
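
One way such a range could be built is sketched below. The wire format (little-endian signature, then a 32-bit content length, then the content, then the checksum) is an assumption made for the example, and `MemoryHandle` is a hypothetical in-memory stand-in for SomeHandle that offers only rawRead().

```d
// Hypothetical stand-in for SomeHandle: serves bytes from memory.
struct MemoryHandle
{
    const(ubyte)[] data;

    // Copies up to buffer.length bytes; returns the number copied.
    size_t rawRead(ubyte[] buffer)
    {
        immutable n = buffer.length < data.length ? buffer.length : data.length;
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

struct Packet
{
    uint signature;
    ubyte[] content;
    uint checksum;
}

// Input range of Packets layered on rawRead().
struct ByPacket
{
    private MemoryHandle* handle;
    private Packet current;
    private bool done;

    this(MemoryHandle* h) { handle = h; popFront(); }

    @property bool empty() { return done; }
    @property Packet front() { return current; }

    // Reads the next packet; a missing or truncated packet ends the range.
    void popFront()
    {
        uint sig, len, sum;
        if (!readUint(sig) || !readUint(len)) { done = true; return; }
        auto content = new ubyte[len];
        if (handle.rawRead(content) != len || !readUint(sum)) { done = true; return; }
        current = Packet(sig, content, sum);
    }

    // Decodes one little-endian 32-bit value.
    private bool readUint(out uint value)
    {
        ubyte[4] raw;
        if (handle.rawRead(raw[]) != 4) return false;
        value = raw[0] | (raw[1] << 8) | (raw[2] << 16) | (cast(uint) raw[3] << 24);
        return true;
    }
}

void main()
{
    // One packet: signature 1, two content bytes, checksum 9.
    ubyte[] bytes = [1, 0, 0, 0,  2, 0, 0, 0,  0xAA, 0xBB,  9, 0, 0, 0];
    auto h = MemoryHandle(bytes);
    auto packets = ByPacket(&h);

    assert(!packets.empty);
    assert(packets.front.signature == 1);
    assert(packets.front.content == [0xAA, 0xBB]);
    assert(packets.front.checksum == 9);
    packets.popFront();
    assert(packets.empty);
}
```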


Shin
July 06, 2010
Thanks for your reply.

Perhaps the Stream you have in mind and the Stream I have in mind are
different. The Stream I have in mind corresponds roughly to File,
Socket, and so on.
The name may be a poor choice, and I think it is close to what you call a Handle.
I would accept Handle, Resource, Device, or any similar name.
(For convenience, I will call it Stream throughout this message.)
Adding a template for input devices would make Ranges for streaming
more generally applicable.

Please take it into consideration.


(2010/07/05 14:37), Andrei Alexandrescu wrote:
> Hello,
>
>
> I've been looking over the streaming proposal. Allow me to make a few comments:
>
> - The input ranges _are_ intended to be input streams, and the output ranges _are_ intended to be output streams. If they don't fulfill that purpose, they should be changed (instead of adding new categories).
>

As I said at the start, input ranges are not meant to be the kind of input stream I have in mind. Input ranges are a higher-level interface that _uses_ streams.

> - Input streams have the read primitive. What is wrong with an input range of ubyte[]? Then accessing front() gives you a buffer and popFront reads in a new buffer.
>

For example, reading binary data from a Stream to initialize the
following struct is a nightmare:
struct S{
     int a;
     ubyte[] b;
     double c;
}

To initialize it, you have to read data of different sizes at least
4 times.
You might try the following code:

File f;
... = f.byChunk(4);
... = f.byChunk(size_t.sizeof);
... = f.byChunk(s.b.length);
... = f.byChunk(8);

I can hardly call that effective. I also think we should be careful about making Range more complicated than it is now.
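
For comparison, here is a sketch of the same initialization written against a read(ubyte[]) primitive rather than byChunk. Everything in it is an assumption for illustration: `MemoryStream` is a hypothetical in-memory stream, the helper names are invented, values are stored in native byte order, and the length of b is taken to be stored as a uint in front of its bytes.

```d
import std.exception : enforce;

// Hypothetical in-memory stream; only the read(ubyte[]) shape matters.
struct MemoryStream
{
    const(ubyte)[] data;

    // Fills buf with as many bytes as are available; returns the count.
    size_t read(ubyte[] buf)
    {
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

struct S
{
    int a;
    ubyte[] b;
    double c;
}

// Reads one fixed-size value in native byte order.
T readValue(T)(ref MemoryStream s)
{
    T value;
    auto raw = (cast(ubyte*) &value)[0 .. T.sizeof];
    enforce(s.read(raw) == T.sizeof, "unexpected end of stream");
    return value;
}

// Assumed layout: a, then the length of b as a uint, then b, then c.
S readS(ref MemoryStream s)
{
    S result;
    result.a = readValue!int(s);
    result.b = new ubyte[readValue!uint(s)];
    enforce(s.read(result.b) == result.b.length, "unexpected end of stream");
    result.c = readValue!double(s);
    return result;
}

// Appends the raw bytes of value, mirroring readValue (used for the demo).
void putValue(T)(ref ubyte[] sink, T value)
{
    sink ~= (cast(ubyte*) &value)[0 .. T.sizeof];
}

void main()
{
    ubyte[] payload = [10, 20, 30];
    ubyte[] bytes;
    putValue(bytes, 42);
    putValue(bytes, 3u);
    bytes ~= payload;
    putValue(bytes, 1.5);

    auto stream = MemoryStream(bytes);
    auto s = readS(stream);
    assert(s.a == 42 && s.b == payload && s.c == 1.5);
}
```

Each field is read directly into its destination, so no intermediate chunk has to be copied or resized.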

> - What does flush() do for input streams?
>

I have not thought about it deeply. File.rewind() may correspond to this.

> - I don't think close() is a good primitive for an input stream. An input stream should originate in a connection handle, and it's the handle, not the stream, that should control the connection. For example:
>
> auto s = Socket("123.456.455.1");
> auto stream = s.byChunk(1024 * 16);
> ... stream is an input range of ubyte[] ...
> s.close();
>
> If the range defines a close() operation, then we need to start talking
> about it defining an open() operation, which complicates matters. Why
> not leave ranges for traversal and handles for connections?
>

Maybe your Handle is similar to my Stream.
The open() operation can be replaced by a constructor, but the close()
operation cannot be replaced by a destructor, because the destructor
may not be called when the struct is managed by the GC.
The story would be different if D offered a way to perform better RTTI.

> - On to output streams. What's wrong with having an output range of
> ubyte[]? Its put() primitive would be the same as the proposed write()
> routine.
>

I think using Ranges for OutputStream is a good idea. LockingTextWriter already works like this.

> - flush() would be a good optional addition to an output stream.
>

Do you mean BufferedRange? It's interesting.

> - I have the same feeling about close() for output streams.
>

Me too.

> - The Seekable idea is good, I was thinking of it for a while. It expresses a range that is not as cheap for random access as a random-access range, but also that makes random seeking possible. What kind of algorithms could use Seekable?
>

I think Seekable is unnecessary for Range.
Seekable should belong to Stream. For a Container, slicing plays this role.

> - What's the purpose of StreamWrapper? And why is it reading in the
> write() primitive?
>

It is for overriding:

interface X {
     InputStream getInputStream();
}
class Y: X {
     InputStream getInputStream(){ return wrap(FileStream("/path/to/file")); }
}
class Z: X {
     InputStream getInputStream(){ return wrap(SocketStream(socket)); }
}

Or, for arrays.

InputStream[] ins;
ins ~= wrap(FileStream("/path/to/file"));
ins ~= wrap(SocketStream(socket));

wrap is a workaround for cases where someone needs an interface no matter what. If template functions could be overridden, the story might be different.
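
A minimal sketch of how such a wrap could work, assuming InputStream is an interface with a single read(ubyte[]) method. The `StreamWrapper` class and the two toy structs are hypothetical stand-ins; the draft code may differ.

```d
interface InputStream
{
    size_t read(ubyte[] buf);
}

// Adapts any struct with a compatible read() to the InputStream interface.
class StreamWrapper(S) : InputStream
{
    private S impl;
    this(S impl) { this.impl = impl; }
    size_t read(ubyte[] buf) { return impl.read(buf); }
}

InputStream wrap(S)(S stream)
{
    return new StreamWrapper!S(stream);
}

// Two unrelated toy streams (stand-ins for FileStream and SocketStream).
struct Zeros
{
    size_t read(ubyte[] buf) { buf[] = 0; return buf.length; }
}

struct Countdown
{
    ubyte next = 3;

    // Produces 3, 2, 1 and then nothing more.
    size_t read(ubyte[] buf)
    {
        size_t n = 0;
        while (n < buf.length && next > 0)
            buf[n++] = next--;
        return n;
    }
}

void main()
{
    InputStream[] ins;
    ins ~= wrap(Zeros());
    ins ~= wrap(Countdown());

    ubyte[4] buf = 9;
    assert(ins[0].read(buf[]) == 4 && buf == [0, 0, 0, 0]);
    assert(ins[1].read(buf[]) == 3 && buf == [3, 2, 1, 0]);
}
```

The structs stay duck-typed for template code, and the class is only paid for when a common runtime type is actually needed.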

And sorry, reading in the write() primitive is my mistake.

> - ByLine is a bit awkward because it needs to read buffers of size 1. Clearly there is some problem there. The right way is to build ByLine!Char on top of a stream of Char, not a stream of Char[]. (Speaking of which, I just checked in BlockingInputReader. It does read one character at a time but it has an inefficiency caused by the FILE* interface.)
>

I think so, too. It is only a demonstration, so the implementation is sloppy.

> - What does FileStream do that File doesn't or can't do?
>

It is just an Adapter.

> Let me know of what you think.
>
>
> Andrei
August 26, 2010
After giving it some more thought, I'd like to reopen this discussion. The link is stale though. Do you have a fresh link to the code and documentation?

Thanks,

Andrei

On 6/30/10 9:48 PDT, SHOO wrote:
> In the Japanese community, improvements to streams are being discussed.
>
> The new stream has the following characteristics.
>
> - Supports Range
> - Duck typing
> - Optionally accepts an interface (the main form is a struct)
>
> A very simple draft is here:
> http://ideone.com/BU3ev
>
> Now, satoru_h ( http://twitter.com/satoru_h ) is working on more.
>
> What do you think?
August 28, 2010
(2010/08/27 15:26), Andrei Alexandrescu wrote:
> After giving it some more thought, I'd like to reopen this discussion. The link is stale though. Do you have a fresh link to the code and documentation?
>
> Thanks,
>
> Andrei
>

Fresh code is here:
     http://ideone.com/YuJ99

Please forgive the somewhat sparse documentation :)

I am not very confident about this code.
Please consider it, and feel free to improve any part of it.
September 05, 2010
(I'm continuing to catch up with old messages. We need to get a streaming interface done.)

1. I agree that the classic range interface offering byte[] from front() is awkward to use when you want to read data in different fixed sizes. I don't think that's a very frequent use case, but that doesn't really matter.

2. I agree that the extra copy could become an efficiency problem.

3. I think this:

size_t read(ubyte[] buf);

is a bit better than this:

ubyte[] read(ubyte[] buf);

because the latter favors people writing things like:

buf = stream.read(buf);

which is often undesirable (forces people to later allocate other buffers of the appropriate size etc.)
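
The difference can be seen in a small sketch. `MemoryStream` is a hypothetical in-memory stream, and the slice-returning variant is named `readSlice` only because D cannot overload on return type.

```d
// Hypothetical in-memory stream offering both signatures under discussion.
struct MemoryStream
{
    const(ubyte)[] data;

    // Returns the number of bytes stored in buf.
    size_t read(ubyte[] buf)
    {
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }

    // Returns the filled slice of buf instead of a count.
    ubyte[] readSlice(ubyte[] buf)
    {
        return buf[0 .. read(buf)];
    }
}

void main()
{
    ubyte[] src = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // Count-returning form: buf keeps its full length between calls.
    auto a = MemoryStream(src);
    auto buf = new ubyte[8];
    auto n = a.read(buf);              // fills all 8 bytes
    n = a.read(buf);                   // short final read of 2 bytes
    assert(n == 2 && buf.length == 8);

    // Slice-returning form: reassigning shrinks the caller's buffer.
    auto b = MemoryStream(src);
    auto buf2 = new ubyte[8];
    buf2 = b.readSlice(buf2);          // 8 bytes
    buf2 = b.readSlice(buf2);          // 2 bytes
    assert(buf2.length == 2);          // later reads can use only 2 bytes
}
```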


Andrei


On 07/05/2010 11:37 AM, Ellery Newcomer wrote:
> On 07/05/2010 10:49 AM, Andrei Alexandrescu wrote:
>> Good point. With ByChunk you only get to set the size once at construction time. A natural extension of that would be to define a property bufferSize that you can assign before calling popFront. That would allow code like:
>>
>> inp.bufferSize = 4;
>> inp.popFront();
>> process(inp.front);
>> inp.bufferSize = 100;
>> inp.popFront();
>> process(inp.front);
>>
>> I reckon that just calling read() with different lengths is a bit more appealing. Also if you need to save the data you need to call inp.front.dup which makes for an extra copy. The question is how often you need to do all that versus just getting whatever data is available.
>
> Well, most of my D IO experience has been with binary files where you're trying to get out fields of specific byte length and put them in dedicated structures. Currently, I read the entire file (if phobos had a standard InputStream interface thing I wouldn't be limited to files) into an array and work with that. But that's kind of a horrid situation. Note that std.zip is in the same boat.
>
> The other thing is that extra copy. Personally, I'd rather the user provide the buffer.
>
> Also, I currently use an ad-hoc InputStream interface whose read signature is
>
> ubyte[] read(ubyte[] buffer);
>
> which returns the slice of the buffer which was read in.
>
> Do you have any opinion of this signature vs. the traditional signature?
>
>> For wrapping structs in general, I hope we could have a more general version similar to BlackHole and WhiteHole - automatic wrapping. Consider:
>>
>> struct A {
>> this(int);
>> void foo();
>> int bar(string);
>> }
>>
>> alias Classify!A ClassA;
>>
>> ClassA is as if somebody sat down and wrote the following class by hand:
>>
>> class ClassA {
>> private A payload;
>> this(int x) { payload = A(x); }
>> void foo() { return payload.foo(); }
>> int bar(string x) { return payload.bar(x); }
>> }
>>
>> Things get more complicated with functions that receive or return type A, properties, qualifiers etc. I think Classify would be an excellent test of D's introspection capabilities, in addition to being useful in practice.
>
> That would be sweet
>
>>
>>
>> Andrei
>