February 18, 2016
On 2/17/16 5:54 AM, John Colvin wrote:
> On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven Schveighoffer wrote:
>> On 2/17/16 1:58 AM, Rikki Cattermole wrote:
>>
>>> A few things:
>>> https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126
>>>
>>> why isn't that used more especially with e.g. window?
>>> After all, window seems like a very well used word...
>>
>> Not sure what you mean.
>>
>>> I don't like that a stream isn't inherently an input range.
>>> This seems to me like a good place to use this abstraction by default.
>>
>> What is front for an input stream? A byte? A character? A word? A line?
>
> Why not just say it's a ubyte and then compose with ranges from there?

If I provide a range by element (it may not be ubyte), then that's likely not the most useful range to have.

For example, the byLine iopipe gives you one more line of data each time you call extend. But the data in the window is not necessarily one line, and the element type is char, wchar, or dchar. None of those, I would think, is what someone would expect or want.

This is why I think it's better to have the user specifically tell me "this is how I want to range-ify this stream" rather than assume.

-Steve
February 18, 2016
On 2/17/16 9:52 AM, Adam D. Ruppe wrote:
> On Wednesday, 17 February 2016 at 10:54:56 UTC, John Colvin wrote:
>> Why not just say it's a ubyte and then compose with ranges from there?
>
> You could put a range interface on it... but I think it would be of very
> limited value. For one, what about fseek? How does that interact with
> the range interface?

Seeking a stream is not a focus of my library. I'm focusing on raw data throughput for an established pipeline that you expect not to move around.

A seek would require resetting the pipeline (something that is possible, but I haven't planned for it).

> Or, what about reading a network interface where you get variable-sized
> packets?

This I HAVE planned for, and it should work quite nicely. I agree that providing a by-default range interface may not be the most useful thing.

> Copying it into a buffer is probably the most sane... but it is a
> wasteful copy if your existing buffer has enough space. But how do you
> say that to a range? popFront takes no arguments.

The asInputRange adapter in iopipe/bufpipe.d provides the following crude interface:

1. front is the current window
2. empty returns true if the window is empty.
3. popFront discards the window, and extends in the next window.

With this, any ioPipe can be turned into a crude range. It should be good enough for things like std.algorithm.copy. And in the case of byLine, it allows one to create an iopipe that caters to creating a range, while also giving useful functionality as a pipe.
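
The three rules above can be sketched as follows. This is a minimal illustration, not the code in iopipe/bufpipe.d: `ToyLinePipe`, `WindowRange`, and `asWindowRange` are hypothetical names, and the pipe is assumed to expose `window`, `extend`, and `release`.

```d
import std.string : indexOf;

// Hypothetical toy pipe, not iopipe's byLine: each extend() grows the
// window by at most one more line of the underlying text.
struct ToyLinePipe
{
    string data;   // stands in for the buffered stream
    size_t start;  // the window is data[start .. end]
    size_t end;

    string window() { return data[start .. end]; }

    // Add one more line to the window; returns the number of chars added.
    size_t extend()
    {
        auto rest = data[end .. $];
        auto nl = rest.indexOf('\n');
        auto added = nl < 0 ? rest.length : cast(size_t) nl + 1;
        end += added;
        return added;
    }

    // Drop n elements from the front of the window.
    void release(size_t n) { start += n; }
}

// Window-at-a-time input range over anything with window/extend/release.
struct WindowRange(Pipe)
{
    Pipe pipe;

    @property bool empty() { return pipe.window.length == 0; }
    @property auto front() { return pipe.window; }

    void popFront()
    {
        pipe.release(pipe.window.length); // discard the current window
        pipe.extend();                    // pull in the next one
    }
}

auto asWindowRange(Pipe)(Pipe pipe)
{
    if (pipe.window.length == 0)
        pipe.extend(); // prime the first window
    return WindowRange!Pipe(pipe);
}

unittest
{
    string[] lines;
    foreach (line; ToyLinePipe("one\ntwo\nthree").asWindowRange)
        lines ~= line;
    // The element type is char, but each item the range yields is a line.
    assert(lines == ["one\n", "two\n", "three"]);
}
```

Compile with `-unittest -main` to run the check.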

I'm on the fence as to whether all ioPipes should be ranges. Yes, it's easy to do (though a lot of boilerplate, you can't UFCS this), but I just can't see the use case being worth it.

> Ranges are great for a sequence of data that is the same type on each
> call. Files, however, tend to have variable length (which you might want
> to skip large sections of) and different types of data as you iterate
> through them.

Very much agree.

> I find std.stdio's byChunk and byLine to be almost completely useless in
> my cases.

byLine I find useful (think of grep), byChunk I've never found a reason to use.

-Steve
February 18, 2016
On Wednesday, 17 February 2016 at 06:45:41 UTC, Steven Schveighoffer wrote:
>
> foreach(line; (new IODevice(0)).bufferedInput
>     .asText!(UTFType.UTF8)
>     .byLine
>     .asInputRange)
>    // handle line
>
This looks pretty all-right so far.  Would something like this work?

foreach(pollItem; zmqSocket.bufferedInput
    .as!(zmqPollItem)
    .asInputRange)

> 3. The focus of this library is NOT replacement of std.stream, or even low-level i/o in general.
>
Oh.  Well maybe that's not the case, but it may have potential anyway.  If nothing else, for testing API concepts.

> 6. There is a concept in here I called "valves". It's very weird, but it allows unifying input and output into one seamless chain. In fact, I can't think of how I could have done output in this regime without them. See the convert example application for details on how it is used.
>
This... might be cool?  It bears some similarity to my own ideas.  I'd like to see more examples, though.

-Wyatt
February 18, 2016
On 2/17/16 5:47 PM, deadalnix wrote:
> First, I'm very happy to see that. Sounds like a good project. Some
> remarks:
>   - You seem to be using classes. These are good to compose at runtime,

I have one class, the IODevice. As I said in the announcement, this isn't a focus of the library, just a way to play with the other pieces :) Its utility isn't very important. One thing it does do (a relic from when I was thinking of trying to replace stdio.File innards) is take over a FILE *, and close the FILE * on destruction.

But I'm steadfastly against using classes for the meat of the library (i.e. the range-like pipeline types). I do happen to think classes work well for raw i/o, since the OS treats i/o items that way (e.g. a network socket is a file descriptor, not some other type), but it would be nice if you could have class features for non-GC lifetimes. Classes are bad for correct deallocation of i/o resources.

>   - Being able to read/write from an io device in a generator like
> manner is I think important if we are rolling out something new.

I'm not quite sure what this means.

> Literally the only thing that can explain the success of Node.js is this
> (everything else is crap). See async/await in C#

async I/O I was hoping could be handled like vibe does (i.e. under the hood with fibers).

>   - Please explain valves more.

Valves allow all the types that process buffered input to process buffered output without changing pretty much anything. It allows me to have a "push" mechanism by pulling from the other end automatically.

In essence, the problem of buffered input is very different from the problem of buffered output. One is pulling data chunks at a time and processing in finer detail; the other is processing data in finer detail and then pushing out chunks that are ready.

The big difference is the end of the pipe that needs user intervention. For input, the user is the consumer of data. With output, the user is the provider of data.

The problem is, how do you construct such a pipeline? The iopipe convention is to wrap the upstream data. For output, the upstream data is what you need access to. A std.algorithm.map doesn't give you access to the underlying range, right? So if you need access to the earlier part of the pipeline, how do you get to it? And how do you know how FAR to get to it (i.e. pipeline.subpipe.subpipe.subpipe....)

This is what the valve is for. The valve has 3 parts, the inlet, the processed data, and the outlet. The inlet works like a normal iopipe, but instead of releasing data upstream, it pushes the data to the processed data area. The outlet can only pull data from the processed data. So this really provides a way for the user to control the flow of data. (note, a lot of this is documented in the concepts.txt document)

The reason it's special is because every iopipe is required to provide access to an upstream valve inlet if it exists. This makes the API of accessing the upstream data MUCH easier to deal with. (i.e. pipeline.valve)

Then I have this wrapper called autoValve, which automatically flushes the downstream data when more space is needed, and makes it look like you are just dealing with the upstream end. This is exactly the model we need for buffered output.

This way, I can have a push mechanism for output, and all the processing pieces (for instance, byte swapping, converting to a different array type, etc.) don't even need to care about providing a push mechanism.
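
A toy model of that flow may help. It is a sketch under loud assumptions: `ToyOutputValve`, `writeAll`, and the single-buffer layout are invented for illustration and are not how iopipe implements valves or autoValve. It only shows the shape of the idea: the user works on the inlet window, release hands data to a processed area, and the outlet drains that area whenever more space is needed.

```d
import std.algorithm.comparison : min;

// Toy model only. One fixed buffer holds two adjacent regions:
//   buf[0 .. processed]       released by the inlet, waiting for the outlet
//   buf[processed .. filled]  the inlet's current window
struct ToyOutputValve
{
    ubyte[] buf;
    size_t processed;
    size_t filled;
    ubyte[] sink; // stands in for the output device

    ubyte[] window() { return buf[processed .. filled]; }

    // Inlet release: hand n elements to the processed area instead of
    // dropping them.
    void release(size_t n)
    {
        assert(n <= filled - processed);
        processed += n;
    }

    // Outlet: drain the processed area, then slide the live window down.
    void flush()
    {
        sink ~= buf[0 .. processed];
        const live = filled - processed;
        foreach (i; 0 .. live)
            buf[i] = buf[processed + i];
        processed = 0;
        filled = live;
    }

    // autoValve-like behaviour: when space runs out, flush first, then
    // grow the window. Returns the number of elements actually added.
    size_t extend(size_t n)
    {
        if (filled + n > buf.length)
            flush();
        const added = min(n, buf.length - filled);
        filled += added;
        return added;
    }
}

// The user only ever touches the upstream end: extend, fill, release.
void writeAll(ref ToyOutputValve v, const(ubyte)[] data)
{
    while (data.length > 0)
    {
        const got = v.extend(data.length);
        assert(got > 0, "buffer has no capacity");
        auto w = v.window;
        w[$ - got .. $] = data[0 .. got];
        v.release(w.length);
        data = data[got .. $];
    }
}

unittest
{
    import std.string : representation;

    auto v = ToyOutputValve(new ubyte[](4)); // deliberately tiny buffer
    v.writeAll("hello world".representation);
    v.flush(); // push out whatever is still held
    assert(v.sink == "hello world".representation);
}
```

Note that `writeAll` never mentions the sink; the push happens as a side effect of asking for more room.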

>   - Profit ?

Yes, absolutely :)

-Steve
February 18, 2016
On 2/18/16 11:07 AM, Wyatt wrote:
> On Wednesday, 17 February 2016 at 06:45:41 UTC, Steven Schveighoffer wrote:
>>
>> foreach(line; (new IODevice(0)).bufferedInput
>>     .asText!(UTFType.UTF8)
>>     .byLine
>>     .asInputRange)
>>    // handle line
>>
> This looks pretty all-right so far.  Would something like this work?
>
> foreach(pollItem; zmqSocket.bufferedInput
>      .as!(zmqPollItem)
>      .asInputRange)

Yes, that is the intent. All without copying.

Note, asInputRange may not do what you want here. If multiple zmqPollItems come in at once (I'm not sure how your socket works), the input range's front will provide the entire window of data, and flush it on popFront.

I'll also point at arrayCastPipe (https://github.com/schveiguy/iopipe/blob/master/source/iopipe/bufpipe.d#L399), which simply casts the input array window to a new type of array window (if the items are coming in binary form).
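
The core of such a cast can be sketched in a few lines. `castWindow` is a hypothetical helper written for this illustration, not the arrayCastPipe source; it assumes the window is a plain `ubyte[]` and that the buffer is suitably aligned for `T`.

```d
// Hypothetical helper: reinterpret a byte window as a window of T
// without copying.
T[] castWindow(T)(ubyte[] window)
{
    // D's array cast requires the byte length to be a multiple of
    // T.sizeof, so any trailing partial element is left out.
    const usable = window.length - window.length % T.sizeof;
    return cast(T[]) window[0 .. usable];
}

unittest
{
    auto bytes = new ubyte[](9);
    bytes[] = 1;                                // every byte is 0x01
    auto words = castWindow!uint(bytes);
    assert(words.length == 2);                  // the ninth byte is left over
    assert(words[0] == 0x01010101);             // same on either endianness
    assert(words.ptr is cast(uint*) bytes.ptr); // same memory, no copy
}
```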

I'm thinking I'll change the name asInputRange to byWindow, and add a byElement for an element-wise input range.

>
>> 6. There is a concept in here I called "valves". It's very weird, but
>> it allows unifying input and output into one seamless chain. In fact,
>> I can't think of how I could have done output in this regime without
>> them. See the convert example application for details on how it is used.
>>
> This... might be cool?  It bears some similarity to my own ideas.  I'd
> like to see more examples, though.

I'm hoping people can come up with ideas for other uses for them. I really like the concept, but the only use case I have right now is output streams.

It would be cool to see if there's a use case for multiple valves.

-Steve
February 18, 2016
On Thursday, 18 February 2016 at 15:44:00 UTC, Steven Schveighoffer wrote:
> On 2/17/16 5:54 AM, John Colvin wrote:
>> On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven Schveighoffer wrote:
>>> On 2/17/16 1:58 AM, Rikki Cattermole wrote:
>>>
>>>> A few things:
>>>> https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126
>>>>
>>>> why isn't that used more especially with e.g. window?
>>>> After all, window seems like a very well used word...
>>>
>>> Not sure what you mean.
>>>
>>>> I don't like that a stream isn't inherently an input range.
>>>> This seems to me like a good place to use this abstraction by default.
>>>
>>> What is front for an input stream? A byte? A character? A word? A line?
>>
>> Why not just say it's a ubyte and then compose with ranges from there?
>
> If I provide a range by element (it may not be ubyte), then that's likely not the most useful range to have.
>
I hadn't thought of this before, but if we accept that a stream is raw, untyped data, it may be best _not_ to provide a range interface directly.  It's easy enough to

alias source = sourceStream.as!ubyte;

anyway, right?

> This is why I think it's better to have the user specifically tell me "this is how I want to range-ify this stream" rather than assume.
>
I think this makes more sense with TLV encodings, too.  Thinking of things like:

switch(source.as!(BERType).popFront){
    case(UNIVERSAL|PRIMITIVE|UTF8STRING){
        int len;
        if(source.as!(BERLength).front & 0b10_00_00_00) {
            // X.690? Never heard of 'em!
        } else {
            len = source.as!(BERLength).popFront;
        }
        return source.buffered(len).as!(string).popFront;
    }
    ...etc.
}

Musing: I'd probably want a helper like popAs!() so I don't forget popFront()...

-Wyatt
February 18, 2016
On Thursday, 18 February 2016 at 16:36:37 UTC, Steven Schveighoffer wrote:
> On 2/18/16 11:07 AM, Wyatt wrote:
>> This looks pretty all-right so far.  Would something like this work?
>>
>> foreach(pollItem; zmqSocket.bufferedInput
>>      .as!(zmqPollItem)
>>      .asInputRange)
>
> Yes, that is the intent. All without copying.
>
Great!

> Note, asInputRange may not do what you want here. If multiple zmqPollItems come in at once (I'm not sure how your socket works), the input range's front will provide the entire window of data, and flush it on popFront.
>
Not so great!  That's really not what I'd expect at all. :(  (This isn't to say it doesn't make sense semantically, but I don't like how it feels.)

> I'm thinking I'll change the name asInputRange to byWindow, and add a byElement for an element-wise input range.
>
Oh, I see.  Naming.  Naming is hard.

-Wyatt
February 18, 2016
On 2/18/16 12:16 PM, Wyatt wrote:
> On Thursday, 18 February 2016 at 16:36:37 UTC, Steven Schveighoffer wrote:
>> Note, asInputRange may not do what you want here. If multiple
>> zmqPollItems come in at once (I'm not sure how your socket works), the
>> input range's front will provide the entire window of data, and flush
>> it on popFront.
>>
> Not so great!  That's really not what I'd expect at all. :( (This isn't
> to say it doesn't make sense semantically, but I don't like how it feels.)

The philosophy that I settled on is to create an iopipe that extends one "item" at a time, even if more are available. Then, apply the range interface on that.

When I first started to write byLine, I made it a range. Then I thought, "what if you wanted to iterate by 2 lines at a time, or iterate by one line at a time, but see the last 2 for context?", well, then that would be another type, and I'd have to abstract out the functionality of line searching.

So I decided to just make an abstract "asInputRange" and just wrap the functionality of extending data one line at a time. The idea is to make building blocks as simple and useful as possible.

So what I think may be a good fit for your application (without knowing all the details) is to create an iopipe that delineates each message and extends exactly one message per call to extend. Then, you can wrap that in asInputRange, or create your own range which translates the actual binary data to a nicer object for each call to front.

So something like:

foreach(pollItem; zmqSocket.bufferedInput
    .byZmqPacket
    .asInputRange)

I'm still not 100% sure that this is the right way to do it...

Hm... if asInputRange took a template parameter of what type it should return, then asInputRange!zmqPacket could return zmqPacket(pipe.window) for front. That's kind of nice.
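
Roughly, that could look like the sketch below. Everything in it is assumed for illustration: `Packet`, `ToyPacketPipe`, and `asTypedRange` are hypothetical names, and the one-byte length prefix is an invented wire format, not anything zmq-specific.

```d
// Hypothetical message type, constructed directly from a window.
struct Packet
{
    const(ubyte)[] payload;

    this(const(ubyte)[] window) { payload = window[1 .. $]; } // skip length byte
}

// Toy pipe over length-prefixed messages: exactly one message per extend().
struct ToyPacketPipe
{
    const(ubyte)[] data; // stands in for the buffered socket data
    size_t start, end;   // the window is data[start .. end]

    const(ubyte)[] window() { return data[start .. end]; }

    size_t extend()
    {
        if (end >= data.length)
            return 0;
        const added = 1 + cast(size_t) data[end]; // length byte plus payload
        if (end + added > data.length)
            return 0; // incomplete message: nothing to hand out yet
        end += added;
        return added;
    }

    void release(size_t n) { start += n; }
}

// Like a window range, but front wraps the window in T.
struct TypedRange(T, Pipe)
{
    Pipe pipe;

    @property bool empty() { return pipe.window.length == 0; }
    @property T front() { return T(pipe.window); }

    void popFront()
    {
        pipe.release(pipe.window.length);
        pipe.extend();
    }
}

auto asTypedRange(T, Pipe)(Pipe pipe)
{
    if (pipe.window.length == 0)
        pipe.extend(); // prime the first message
    return TypedRange!(T, Pipe)(pipe);
}

unittest
{
    ubyte[] wire = [2, 10, 20, 1, 30]; // two messages: [10, 20] and [30]
    size_t[] sizes;
    foreach (pkt; asTypedRange!Packet(ToyPacketPipe(wire)))
        sizes ~= pkt.payload.length;
    assert(sizes.length == 2 && sizes[0] == 2 && sizes[1] == 1);
}
```

The payload slices point into the original buffer, so nothing is copied per message.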

>> I'm thinking I'll change the name asInputRange to byWindow, and add a
>> byElement for an element-wise input range.
>>
> Oh, I see.  Naming.  Naming is hard.

Yes. It's especially hard when you haven't seen how others react to it :)

-Steve
February 18, 2016
On 2/18/16 12:08 PM, Wyatt wrote:
> On Thursday, 18 February 2016 at 15:44:00 UTC, Steven Schveighoffer wrote:
>> On 2/17/16 5:54 AM, John Colvin wrote:
>>> On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven Schveighoffer
>>> wrote:
>>>> On 2/17/16 1:58 AM, Rikki Cattermole wrote:
>>>>
>>>>> A few things:
>>>>> https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126
>>>>>
>>>>>
>>>>> why isn't that used more especially with e.g. window?
>>>>> After all, window seems like a very well used word...
>>>>
>>>> Not sure what you mean.
>>>>
>>>>> I don't like that a stream isn't inherently an input range.
>>>>> This seems to me like a good place to use this abstraction by default.
>>>>
>>>> What is front for an input stream? A byte? A character? A word? A line?
>>>
>>> Why not just say it's a ubyte and then compose with ranges from there?
>>
>> If I provide a range by element (it may not be ubyte), then that's
>> likely not the most useful range to have.
>>
> I hadn't thought of this before, but if we accept that a stream is raw,
> untyped data, it may be best _not_ to provide a range interface
> directly.  It's easy enough to
>
> alias source = sourceStream.as!ubyte;
>
> anyway, right?

An iopipe is typed however you want it to be.

bufferedInput by default uses an ArrayBuffer!ubyte. You can have it use any type of buffer you want, it doesn't discriminate. The only requirement is that the buffer's window is a random-access range (although I'm having thoughts that I should just require it to be an array).
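
One practical wrinkle with a random-access requirement, as far as the Phobos traits go: narrow strings do not count as random-access ranges because Phobos auto-decodes them, while they are still plain arrays. The checks below use only `std.range.primitives` and `std.traits`; how iopipe itself constrains its buffers is not shown here.

```d
import std.range.primitives : isRandomAccessRange;
import std.traits : isDynamicArray;

// Byte and word windows satisfy the random-access trait.
static assert( isRandomAccessRange!(ubyte[]));
static assert( isRandomAccessRange!(uint[]));

// Narrow strings do not, because front/popFront decode code points.
static assert(!isRandomAccessRange!(char[]));
static assert(!isRandomAccessRange!(wchar[]));
static assert( isRandomAccessRange!(dchar[]));

// An "is it an array" check accepts all of them.
static assert(isDynamicArray!(ubyte[]));
static assert(isDynamicArray!(char[]));
static assert(isDynamicArray!(wchar[]));
```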

But the concept of what constitutes an "item" in a stream may not be the "element type". That's what I'm getting at.

>
>> This is why I think it's better to have the user specifically tell me
>> "this is how I want to range-ify this stream" rather than assume.
>>
> I think this makes more sense with TLV encodings, too.  Thinking of
> things like:
>
> switch(source.as!(BERType).popFront){
>      case(UNIVERSAL|PRIMITIVE|UTF8STRING){
>          int len;
>          if(source.as!(BERLength).front & 0b10_00_00_00) {
>              // X.690? Never heard of 'em!
>          } else {
>              len = source.as!(BERLength).popFront;
>          }
>          return source.buffered(len).as!(string).popFront;
>      }
>      ...etc.
> }

Very cool looking!

However, you have some issues there :) popFront doesn't return anything. And I think parsing/processing stream data works better by examining the buffer than shoehorning range functions in there.
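
For reference, a pop-and-return helper along the lines of the popAs idea has to read front before advancing. A minimal sketch, with `popNext` as a hypothetical name (it is not a Phobos function):

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

// Hypothetical helper: return the front element, then advance the range.
ElementType!R popNext(R)(ref R r)
if (isInputRange!R)
{
    assert(!r.empty, "popNext on an empty range");
    auto value = r.front;
    r.popFront();
    return value;
}

unittest
{
    int[] source = [3, 1, 4];
    assert(source.popNext == 3);
    assert(source == [1, 4]);

    // popFront itself yields nothing.
    static assert(is(typeof(source.popFront()) == void));
}
```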

-Steve
February 18, 2016
On Thursday, 18 February 2016 at 18:35:40 UTC, Steven Schveighoffer wrote:
> On 2/18/16 12:08 PM, Wyatt wrote:
>>
>> I hadn't thought of this before, but if we accept that a stream is raw,
>> untyped data, it may be best _not_ to provide a range interface
>> directly.  It's easy enough to
>>
>> alias source = sourceStream.as!ubyte;
>>
>> anyway, right?
>
> An iopipe is typed however you want it to be.
>
Sorry, sorry, just thinking (too much?) in terms of the conceptual underpinnings.

But I don't think we really disagree, either: if you don't give a stream a type it doesn't have one "naturally", so it's best to be explicit even if you're just asking for raw bytes.  That's all I'm really saying there.

> But the concept of what constitutes an "item" in a stream may not be the "element type". That's what I'm getting at.
>
Hmm, I guess I'm not seeing it.  Like, what even is an "item" in a stream?  It sort of precludes that by definition, which is why we have to give it a type manually.  What benefit is there to giving the buffer type separately from the window that gives you a typed slice into it? (I like that, btw.)

> However, you have some issues there :) popFront doesn't return anything.

Clearly, as!() returns the data! ;)

But criminy, I do actually forget that ALL the damn time!  (I blame Broadcom.)  The worst part is I think I've even read the rationale for why it's like that and agreed with it with much nodding of the head and all that. :(

> And I think parsing/processing stream data works better by examining the buffer than shoehorning range functions in there.
>
I think it's debatable.  But part of stream semantics is being able to use it like a stream, and my BER toy was in that vein.  Sorry again, this is probably not the place for it unless you try to replace the std.stream for real.

-Wyatt