May 18, 2012
2012/5/19 Steven Schveighoffer <schveiguy@yahoo.com>:
> On Fri, 18 May 2012 10:39:55 -0400, kenji hara <k.hara.pg@gmail.com> wrote:
[snip]
>
> On non-blocking i/o, why not just not support range interface at all?  I don't have any problem with that.  In other words, if your input source is non-blocking, and you try to use range primitives, it simply won't work.
>
> I admit, all of my code so far is focused on blocking i/o.  I have some experience with non-blocking i/o, but it was to make a blocking interface that supported waiting for data with a timeout.  Making a cross-platform (i.e. both windows and Posix) non-blocking interface is difficult because you use very different mechanisms on both OSes.
>
> And a lot of times, you don't want non-blocking i/o, but rather parallel i/o.

[snip]
>> No, we cannot map output range concept to non-blocking output. 'put' operation always requires blocking.
>
> Yes, but again, put can use whatever stream primitives we have.
>
> In other words, it's quite possible to write range primitives which utilize stream primitives.  It's impossible to write good stream primitives which utilize range primitives.

[snip]
>> My policy is very similar. But, as described above, I think range
>> cannot cover non-blocking IO.
>> And I think non-blocking IO interface is important for library
>> implementations.
>
>
> I think you misunderstand, I'm not trying to make ranges be the base of i/o, I'm trying to expose a range interface *based on* stream i/o interface.

Here are my reasons for not using range primitives directly for stream I/O.

1. To let the upper layer specify a buffer for storing the bytes that are read.

An input range has no way to pass a buffer for storing read bytes
down to the lower layer, because an input range is designed as a
view of an underlying container.

Compare the number of primitives:
Four or more primitives: empty + front + popFront +
specify-buffer-for-storing-read-bytes + ...
vs.
My single 'pull' primitive

Which is better?

2. To avoid confusing I/O operations/interfaces with range ones.

Yes, if you only need blocking I/O, you can use the range interface
instead of I/O-specific primitives, but that is very confusing.
I think that requiring IO objects (like File) to be wrapped in a thin
range wrapper is better for orthogonality.

  foreach (ubyte b; RawFile(fname).ranged) { ... }
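
A minimal sketch of that thin-wrapper idea, assuming a 'pull' primitive that fills the caller's buffer and returns the number of bytes stored (0 at end of input). 'MemorySource', 'Ranged' and the free function 'ranged' are hypothetical names made up for illustration; they are not dio's actual API.

```d
import std.algorithm.comparison : min;

// Hypothetical in-memory source. pull() fills the caller's buffer and
// returns how many bytes were stored (0 at end of input).
struct MemorySource
{
    const(ubyte)[] data;

    size_t pull(ubyte[] buf)
    {
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Thin input-range adapter: the range primitives are written in terms of pull.
struct Ranged(Source)
{
    private Source src;
    private ubyte[4] store;   // deliberately tiny, to force several refills
    private size_t pos, len;

    this(Source s) { src = s; refill(); }

    private void refill()
    {
        len = src.pull(store[]);
        pos = 0;
    }

    @property bool empty() { return pos == len; }
    @property ubyte front() { return store[pos]; }
    void popFront()
    {
        if (++pos == len) refill();
    }
}

auto ranged(Source)(Source s) { return Ranged!Source(s); }

void main()
{
    ubyte[] input = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    uint sum;
    foreach (ubyte b; MemorySource(input).ranged)
        sum += b;
    assert(sum == 55);
}
```

The source itself only knows about 'pull'; everything range-shaped lives in the wrapper, which is the orthogonality argued for above.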

Kenji Hara
May 18, 2012
On Fri, 18 May 2012 13:27:22 -0400, kenji hara <k.hara.pg@gmail.com> wrote:

> 2012/5/19 Steven Schveighoffer <schveiguy@yahoo.com>:
>> On Fri, 18 May 2012 10:39:55 -0400, kenji hara <k.hara.pg@gmail.com> wrote:
>>>>> I'm designing experimental IO primitives:
>>>>> https://github.com/9rnsr/dio
>>
>> I'm having trouble following the code, is there a place with the generated
>> docs?   I'm looking for an overview to understand where to look.
>
> I have created gh-pages:
> http://9rnsr.github.com/dio/d/io_core.html

OK, *now* I understand what you mean by non-blocking.  There are some I/O packages that use asynchronous i/o which return even before any data is given to the buffer.  I thought this is what you were talking about.

I'm fully on board with synchronous but non-blocking.  That's what I assumed we would be doing, and it's well supported by low-level OS routines on all OSes.

In my implementation for a buffer, I have two calls:

read(buf[]) -> read until buf.length bytes are read or EOF
readPartial(buf[]) -> read from 1 to buf.length bytes, but performs at most 1 low-level read.  Returns 0 bytes on EOF.

readPartial will block if no data is yet available, but obviously can be made to not block if the underlying OS handle is marked as non-blocking (I need to add some extra structure to account for this).

Typically, this is the normal mechanism that I use for reading data that is not always available.  First, I select on a socket until data is available, then use synchronous read to get whatever data exists.
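
To make the difference between the two calls concrete, here is a rough sketch. 'ChunkySource' and 'rawRead' are hypothetical stand-ins for an OS handle that delivers data piecemeal; this is not my actual implementation, only the semantics described above.

```d
import std.algorithm.comparison : min;

// Hypothetical source that hands out at most `chunk` bytes per low-level
// read, imitating a socket or pipe that delivers data piecemeal.
struct ChunkySource
{
    const(ubyte)[] data;
    size_t chunk;
    size_t lowLevelReads;

    // Stand-in for the OS read call: returns 0 at EOF.
    size_t rawRead(ubyte[] buf)
    {
        ++lowLevelReads;
        immutable n = min(buf.length, chunk, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// At most one low-level read: returns 1 to buf.length bytes, or 0 at EOF.
size_t readPartial(ref ChunkySource src, ubyte[] buf)
{
    return src.rawRead(buf);
}

// Keeps reading until buf is full or EOF is reached.
size_t read(ref ChunkySource src, ubyte[] buf)
{
    size_t total;
    while (total < buf.length)
    {
        immutable n = src.readPartial(buf[total .. $]);
        if (n == 0) break;   // EOF
        total += n;
    }
    return total;
}

void main()
{
    ubyte[] payload = [10, 20, 30, 40, 50, 60, 70];
    auto src = ChunkySource(payload, 3);
    ubyte[5] buf;

    assert(src.readPartial(buf[]) == 3);  // one low-level read, 3 bytes
    assert(src.read(buf[0 .. 4]) == 4);   // loops: 3 bytes, then 1
    assert(src.read(buf[]) == 0);         // EOF
    assert(src.lowLevelReads == 4);
}
```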

continuing reading...

-Steve
May 18, 2012
On 05/18/12 20:18, Artur Skawina wrote:
> On 05/18/12 17:43, kenji hara wrote:
>>>>>> I'm designing experimental IO primitives: https://github.com/9rnsr/dio
> 
>> It has a sample benchmark to compare performance with std.stdio for line iteration.
> 
> It's not exactly what I had in mind, but I tried to build it; it wants an 'io/wrapper.d' module which cannot be found.

And it is apparently Windows-only: missing HANDLE type, nonexistent TextOutputRange. I gave up after running into:

io/file.d:263: Error: static assert  (isSource!(File)) is false

artur
May 18, 2012
On 05/18/12 17:43, kenji hara wrote:
>>>>> I'm designing experimental IO primitives: https://github.com/9rnsr/dio

> It has a sample benchmark to compare performance with std.stdio for line iteration.

It's not exactly what I had in mind, but I tried to build it; it wants an 'io/wrapper.d' module which cannot be found.

artur
May 19, 2012
Sorry, I have updated it.
Run 'make runbench' or 'make runbench_opt'.

Kenji Hara

2012/5/19 Artur Skawina <art.08.09@gmail.com>:
> On 05/18/12 17:43, kenji hara wrote:
>>>>>> I'm designing experimental IO primitives: https://github.com/9rnsr/dio
>
>> It has a sample benchmark to compare performance with std.stdio for line iteration.
>
> It's not exactly what I had in mind, but I tried to build it; it wants an 'io/wrapper.d' module which cannot be found.
>
> artur
May 19, 2012
On Friday, 18 May 2012 at 19:18:21 UTC, Artur Skawina wrote:
> On 05/18/12 20:18, Artur Skawina wrote:
>> On 05/18/12 17:43, kenji hara wrote:
>>>>>>> I'm designing experimental IO primitives:
>>>>>>> https://github.com/9rnsr/dio
>> 
>>> It has a sample benchmark to compare performance with std.stdio for
>>> line iteration.
>> 
>> It's not exactly what I had in mind, but I tried to build it;
>> it wants an 'io/wrapper.d' module which cannot be found.
>
> And it is apparently Windows-only: missing HANDLE type,
> nonexistent TextOutputRange. I gave up after running into:
>
> io/file.d:263: Error: static assert  (isSource!(File)) is false
>

The current dio is a PoC for the new IO design.
If we go with such a design, I will add Linux/Mac support to dio.


Masahiro
May 19, 2012
Please add a README to the top directory.
(Contents: the benchmark commands, supported environments, etc.)

Then we can see such information in a web browser ;)

P.S.
I want to send a pull request to support other environments.
But I'm busy right now...


Masahiro

On Saturday, 19 May 2012 at 15:22:37 UTC, kenji hara wrote:
> Sorry, I have updated it.
> Run 'make runbench' or 'make runbench_opt'.
>
> Kenji Hara
>
> 2012/5/19 Artur Skawina <art.08.09@gmail.com>:
>> On 05/18/12 17:43, kenji hara wrote:
>>>>>>> I'm designing experimental IO primitives:
>>>>>>> https://github.com/9rnsr/dio
>>
>>> It has a sample benchmark to compare performance with std.stdio for
>>> line iteration.
>>
>> It's not exactly what I had in mind, but I tried to build it;
>> it wants an 'io/wrapper.d' module which cannot be found.
>>
>> artur


May 21, 2012
I don't have time to read the whole discussion right now, but since our exchange here I've been thinking about buffered streams. I've imagined something close to, but somewhat different from, your buffered stream, where the length of the buffer chunk can be adapted, and the buffer can be popped by an arbitrary number of bytes:

I reuse the names front, popFront and empty, but that may not be such a good idea.

struct BufferedStream(T)
{
  T[] buf;
  size_t cursor;
  size_t decoded;
  InputStream input;

  // returns a slice to the n next elements of the input stream.
  // this slice is valid until next call to front only.
  T[] front(size_t n)
  {
    if (n <= decoded - cursor) return buf[cursor..cursor+n];
    if (n <= buf.length)
      {
       ... // move data to the front of the buffer and read new data to
           // fill the buffer.
        return buf[0..n];
      }
    // otherwise n > buf.length
    ... // resize buffer and read new data to fill the buffer
    return buf[0..n];
  }
  // pop the next n elements from the buffer.
  void popFront(size_t n) { cursor += n; }
  // empty once the input is exhausted and all buffered data is consumed.
  bool empty() { return input.eof && cursor == decoded; }
}

This kind of buffered stream enables you to read data in varying chunk sizes, while always reading from the input stream in amounts that are convenient for it. (And front could be made to return a buffer of the size most suitable for the stream when called with size_t.max as n.)

More importantly, it allows you to peek at an arbitrary amount of data, use it, and then decide how many items you want to consume. For example, it allows you to write things like "ReadAWord" without double buffering: you request more characters from the buffer until you find a space, and then you consume only the characters up to that space.
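
A rough sketch of such a "ReadAWord", assuming the front(n)/popFront(n) pair described above. To keep it self-contained it runs on a hypothetical array-backed 'ArrayStream' rather than a real input stream, and 'readWord' is an illustrative name; here the delimiter is consumed along with the word.

```d
import std.algorithm.comparison : min;

// Hypothetical array-backed stream exposing the front(n)/popFront(n) pair.
struct ArrayStream
{
    const(char)[] data;

    // Peek at up to n elements without consuming them.
    const(char)[] front(size_t n) { return data[0 .. min(n, data.length)]; }
    // Consume n elements.
    void popFront(size_t n) { data = data[n .. $]; }
    bool empty() { return data.length == 0; }
}

// Reads one space-delimited word; consumes the word and the space after it.
const(char)[] readWord(ref ArrayStream s)
{
    size_t want = 4;                  // initial peek size, grows on demand
    for (;;)
    {
        auto window = s.front(want);
        size_t idx = 0;
        while (idx < window.length && window[idx] != ' ')
            ++idx;
        if (idx < window.length)      // found a space inside the window
        {
            s.popFront(idx + 1);      // word plus delimiter
            return window[0 .. idx];
        }
        if (window.length < want)     // stream exhausted: this is the last word
        {
            s.popFront(window.length);
            return window;
        }
        want *= 2;                    // peek further; nothing consumed yet
    }
}

void main()
{
    auto s = ArrayStream("hello wide world");
    assert(s.readWord() == "hello");
    assert(s.readWord() == "wide");
    assert(s.readWord() == "world");
    assert(s.empty());
}
```

Nothing is consumed until the word boundary is known, so no second buffer is needed to hold a partially read word.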

"Steven Schveighoffer", in message (digitalmars.D:167733), wrote:
> OK, so I had a couple partially written replies on the 'deprecating std.stream etc' thread, then I had to go home.
> 
> But I thought about this a lot last night, and some of the things Andrei and others are saying are starting to make sense (I know!).  Now I've scrapped those replies and am thinking about redesigning my i/o package (most of the code can stay intact).
> 
> I'm a little undecided on some of the details, but here is what I think makes sense:
> 
> 1. We need a buffering input stream type.  This must have additional
> methods besides the range primitives, because doing one-at-a-time byte
> reads is not going to cut it.
> 2. I realized, buffering input stream of type T is actually an input range
> of type T[].  Observe:
> 
> struct /*or class*/ buffer(T)
> {
>       T[] buf;
>       InputStream input;
>       ...
>       @property T[] front() { return buf; }
>       void popFront() {input.read(buf);} // flush existing buffer, read next.
>       @property bool empty() { return buf.length == 0;}
> }
> 
> Roughly speaking, not all the details are handled, but this makes a feasible input range that will perform quite nicely for things like std.algorithm.copy.  I haven't checked, but copy should be able to handle transferring a range of type T[] to an output range with element type T, if it's not able to, it should be made to work.

Or with joiner(buffer);
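
For instance, a small sketch with std.algorithm's joiner, which flattens a range of T[] into a range of T. Here 'chunks' from std.range stands in for the buffer (an assumption for illustration only, since the buffer type above is just an outline):

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : joiner;
import std.range : chunks;

void main()
{
    ubyte[] data = [1, 2, 3, 4, 5, 6, 7];

    // Stand-in for the buffer: an input range whose elements are ubyte[].
    auto blocks = data.chunks(3);
    assert(equal(blocks.front, [1, 2, 3]));

    // joiner presents the same data as a flat range of ubyte.
    assert(blocks.joiner.equal(data));
}
```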

> I know at least, an
> output stream with element type T supports putting T or T[].  What I think
> really makes sense is to support:
> 
> buffer!ubyte b;
> outputStream o;
> 
> o.put(b); // uses range primitives to put all the data to o, one element (i.e. ubyte[]) of b at a time

Of course, output stream should not have a consistent interface with input stream.

> 3. An ultimate goal of the i/o streaming package should be to be able to do this:
> 
> auto x = new XmlParser("<rootElement></rootElement>");
> 
> or at least
> 
> auto x = new XmlParser(buffered("<rootElement></rootElement>"));
> 
> So I think arrays need to be able to be treated as buffering streams.  I tried really hard to think of some way to make this work with my existing system, but I don't think it will without unnecessary baggage, and losing interoperability with existing range functions.

A simple string stream can be built on top of a string, with no
member other than the string itself, can't it?
With my definition of buffered stream, at least, it can, and any array
could support:
T[] front(size_t i) { return this[0..min(i, $)]; }
void popFront(size_t i) { this = this[i..$]; }
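
Since arrays cannot have members added to them, the two lines above are pseudo-code. A sketch of the same idea as free functions called through UFCS, using the hypothetical names 'peek' and 'consume' to avoid clashing with the existing front/popFront for arrays:

```d
import std.algorithm.comparison : min;

// Peek at up to n elements of the array without consuming them.
T[] peek(T)(T[] a, size_t n) { return a[0 .. min(n, a.length)]; }

// Consume up to n elements by shrinking the slice from the front.
void consume(T)(ref T[] a, size_t n) { a = a[min(n, a.length) .. $]; }

void main()
{
    string xml = "<rootElement></rootElement>";
    assert(xml.peek(5) == "<root");
    xml.consume(5);
    assert(xml.peek(100) == "Element></rootElement>");
}
```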

-- 
Christophe
May 21, 2012
> Well, because that's what i/o buffers are :)  There isn't an OS primitive that reads a file descriptor into an e.g. linked list.  Anything other than a slice would go through a translation.
>
It's a pity that iovec and T[] have their length/ptr fields in swapped order.
Otherwise one could directly map read(ubyte[] bufs...) to libc's readv.

I did write a buffered range that uses a linked list to promote an
input range to a forward range. This is somewhat similar to lazy
ByteStrings in Haskell.
There were some issues with reference counting and the implicit copy
in foreach loops, but other than that it's fairly useful.

https://gist.github.com/1257196

The trouble with block-wise primitives (T[] input ranges) like byChunk is
that they make common things like parsing very difficult, because the client
has to account for buffer wraps. Things like double buffering or a ring buffer
would help with this.

martin