[RFC] I/O and Buffer Range (page 4) - D Programming Language Discussion Forum

January 16, 2014

Re: [RFC] I/O and Buffer Range

Posted by Dmitry Olshansky
in reply to Walter Bright

17-Jan-2014 00:18, Walter Bright wrote:
> On 12/29/2013 2:02 PM, Dmitry Olshansky wrote:
>> The BufferRange concept itself (for now called simply Buffer) is
>> defined here:
>> http://blackwhale.github.io/datapicked/dpick.buffer.traits.html
>
> I am confused because there are 4 terms conflated here:
>
> BufferRange
> Buffer
One and the same for the moment.

You even quoted it:
>BufferRange .. (for now called simply Buffer)...

> InputStream
> Stream

One and the same as I dealt with the input side of the problem only.

-- 
Dmitry Olshansky

January 16, 2014

Re: [RFC] I/O and Buffer Range

Posted by Steven Schveighoffer
in reply to Dmitry Olshansky

On Thu, 16 Jan 2014 17:28:31 -0500, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

>
> The other way around :) 4 code units - 1 code point.

I knew that was going to happen :)

>> This would be a key addition for ANY type in order to properly work with
>> shared. BUT, I don't see how it works safely generically because you
>> necessarily have to cast away shared in order to call the methods. You
>> would have to limit this to only working on types it was intended for.
>
> The requirement may be that it's pure or should I say "well-contained". In other words as long as it doesn't smuggle references somewhere else it should be fine.
> That is to say it's 100% fool-proof, nor do I think that essentially simulating a synchronized class is always a good thing to do...

I think you meant *not* 100% :) But yeah, regardless of how generalized it is, this is likely the best path. I think this is the tack that std.stdio.File takes anyway, except it's using FILE *'s locking mechanism.

>
> Convenient to work with does ring good to me. I simply see no need to reinvent std.algorithm on buffers especially the ones that just scan left-to-right.
> An example would be calculating a checksum of a stream (say the data comes from a pipe or socket, i.e. buffered). It's a trivial application of std.algorithm.reduce and there is no need to reinvent that wheel IMHO.
>

I again think I am being misunderstood. Example might be appropriate:

struct Buffer {...} // implements BOTH forward range and Buffer primitives

struct OtherBuffer {...} // only implements Buffer primitives.

If isBuffer is modified to not require isForwardRange, then both Buffer and OtherBuffer will work as buffers, but only Buffer works as a range.

But as you have it now, isBuffer!OtherBuffer is false. Is this necessary?

So we can implement buffers that are both ranges and buffers, and will work with std.algorithm without modification (and that's fine and expected by me), but do we need to *require* that? Are we over-specifying? Is there a possibility that someone can invent a buffer that satisfies the primitives of say a line-by-line reader, but cannot possibly be a forward range?
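To make the distinction concrete, here is a minimal sketch of the two traits I have in mind. The primitive names lookahead and seek are borrowed from elsewhere in this thread, and their signatures here are assumptions for illustration, not the actual dpick.buffer.traits definitions.

```d
import std.range : isForwardRange;

// Hypothetical trait: buffer primitives only, no range requirement.
enum isBuffer(T) = is(typeof((T b) {
    auto window = b.lookahead(1); // peek without consuming
    bool ok = b.seek(1);          // advance the position
}));

// Stricter concept: buffer primitives plus forward-range primitives.
enum isBufferRange(T) = isBuffer!T && isForwardRange!T;

// Implements only the (assumed) Buffer primitives.
struct OtherBuffer
{
    const(ubyte)[] data;

    const(ubyte)[] lookahead(size_t n)
    {
        return data[0 .. n < data.length ? n : data.length];
    }

    bool seek(size_t n)
    {
        if (n > data.length) return false;
        data = data[n .. $];
        return true;
    }
}

// Implements both the Buffer primitives and the forward-range primitives.
struct Buffer
{
    OtherBuffer impl;

    auto lookahead(size_t n) { return impl.lookahead(n); }
    bool seek(size_t n) { return impl.seek(n); }

    @property bool empty() { return impl.data.length == 0; }
    @property ubyte front() { return impl.data[0]; }
    void popFront() { impl.data = impl.data[1 .. $]; }
    @property Buffer save() { return this; }
}

static assert(isBuffer!OtherBuffer && !isBufferRange!OtherBuffer);
static assert(isBuffer!Buffer && isBufferRange!Buffer);
```

With the traits split this way, code that only needs the buffer primitives can constrain on isBuffer, while anything handed to std.algorithm constrains on isBufferRange.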

-Steve

January 16, 2014

Re: [RFC] I/O and Buffer Range

Posted by Walter Bright
in reply to Dmitry Olshansky

On 1/16/2014 2:30 PM, Dmitry Olshansky wrote:
> 17-Jan-2014 00:18, Walter Bright wrote:
>> On 12/29/2013 2:02 PM, Dmitry Olshansky wrote:
>>> The BufferRange concept itself (for now called simply Buffer) is
>>> defined here:
>>> http://blackwhale.github.io/datapicked/dpick.buffer.traits.html
>>
>> I am confused because there are 4 terms conflated here:
>>
>> BufferRange
>> Buffer
> One and the same for the moment.
>
> You even quoted it:
>  >BufferRange .. (for now called simply Buffer)...

I know. But using two names for the same thing makes for confusing documentation. There also appears to be no rationale for "for the moment".


>> InputStream
>> Stream
>
> One and the same as I dealt with the input side of the problem only.

Same problem with multiple names for the same thing. Also, there's no definition of either name.

January 17, 2014

Re: [RFC] I/O and Buffer Range

Posted by Kagamin
in reply to Steven Schveighoffer

On Thursday, 16 January 2014 at 15:55:07 UTC, Steven Schveighoffer wrote:
> I am thinking of this layout for streams/buffers:
>
> 1. Unbuffered stream used for raw i/o, based on a class hierarchy (which I have pretty much written)
> 2. Buffer like you have, based on a struct, with specific primitives. Its job is to collect data from the underlying stream, and present it to consumers as a random-access buffer.

If you have a struct-based buffer, how would you enlarge the buffer? Won't it suffer from AA syndrome?

January 17, 2014

Re: [RFC] I/O and Buffer Range

Posted by Dmitry Olshansky
in reply to Kagamin

17-Jan-2014 13:19, Kagamin wrote:
> On Thursday, 16 January 2014 at 15:55:07 UTC, Steven Schveighoffer wrote:
>> I am thinking of this layout for streams/buffers:
>>
>> 1. Unbuffered stream used for raw i/o, based on a class hierarchy
>> (which I have pretty much written)
>> 2. Buffer like you have, based on a struct, with specific primitives.
>> Its job is to collect data from the underlying stream, and present it
>> to consumers as a random-access buffer.
>
> If you have a struct-based buffer, how would you enlarge the buffer?

What's the problem? I don't see how struct vs. class changes anything there: it's a member field that is an array, which we can surely expand.

> Won't it suffer from AA syndrome?

A Buffer is created with factory functions only. It's not like an AA that grows from an empty/null state. An empty buffer (.init) doesn't grow; it's simply empty.

-- 
Dmitry Olshansky

January 17, 2014

Re: [RFC] I/O and Buffer Range

Posted by Dmitry Olshansky
in reply to Steven Schveighoffer

17-Jan-2014 02:41, Steven Schveighoffer wrote:
> On Thu, 16 Jan 2014 17:28:31 -0500, Dmitry Olshansky
> <dmitry.olsh@gmail.com> wrote:
>
>>
>> The other way around :) 4 code units - 1 code point.
>
> I knew that was going to happen :)

Aye. BTW I haven't thought of writing into the buffer, but it works exactly the same. It could even be read/write: "discarded" data is written to the underlying stream, and freshly loaded data is read from the stream. For an input-only stream writes are no-ops; for an output-only stream reads are no-ops.

With pinning it makes for cool multi-pass algorithms that actually output stuff into the file.

>>> This would be a key addition for ANY type in order to properly work with
>>> shared. BUT, I don't see how it works safely generically because you
>>> necessarily have to cast away shared in order to call the methods. You
>>> would have to limit this to only working on types it was intended for.
>>
>> The requirement may be that it's pure or should I say
>> "well-contained". In other words as long as it doesn't smuggle
>> references somewhere else it should be fine.
>> That is to say it's 100% fool-proof, nor do I think that essentially
>> simulating a synchronized class is always a good thing to do...
>
> I think you meant *not* 100% :) But yeah, regardless of how generalized
> it is, this is likely the best path. I think this is the tack that
> std.stdio.File takes anyway, except it's using FILE *'s locking mechanism.
>

Aye.

>> Convenient to work with does ring good to me. I simply see no need to
>> reinvent std.algorithm on buffers especially the ones that just scan
>> left-to-right.
>> Example would be calculating a checksum of a stream (say data comes
>> from a pipe or socket i.e. buffered). It's a trivial application of
>> std.algorithm.reduce and there is no need to reinvent that wheel IMHO.
>>
>
> I again think I am being misunderstood. Example might be appropriate:
>

Looking at the post by Walter I see I need to clarify things.
And if you browse the thread you'd see my understanding also changed with time - I started with no stinkin' forward ranges with buffered I/O only to later eat my words and make them happen.

First, let's call a buffer a thing that packs an array and a few extras to support pinning, refilling of data from the underlying stream, and extending.
It exposes the current "window" of the underlying stream.

Now we have pins in that buffer that move along. A pin not only enforces that all data "to the left" stays accessible but is also a "view" of the buffer. It's even conceptually a range with extra primitives outlined here:
http://blackwhale.github.io/datapicked/dpick.buffer.traits.html
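A rough sketch of the pinning idea, to show what "stays accessible" means when the window is compacted. All names here (PinnedWindow, pin, unpin, compact) are made up for illustration and are not the dpick API; the sketch also assumes pins and the current position never fall to the left of the window start.

```d
// Hypothetical window over a stream that refuses to discard pinned data.
struct PinnedWindow
{
    ubyte[] window; // bytes currently held in memory
    size_t base;    // stream offset of window[0]
    size_t pos;     // current absolute position in the stream
    size_t[] pins;  // absolute offsets that must stay loaded

    // Pin the current position; returns a handle for unpin.
    size_t pin()
    {
        pins ~= pos;
        return pins.length - 1;
    }

    // Release a pin by marking it as "infinitely far right".
    void unpin(size_t id)
    {
        pins[id] = size_t.max;
    }

    // Discard everything left of the lowest live pin or the current
    // position, whichever comes first.
    void compact()
    {
        size_t keep = pos;
        foreach (p; pins)
            if (p < keep) keep = p;
        window = window[keep - base .. $];
        base = keep;
    }
}

unittest
{
    auto w = PinnedWindow(new ubyte[10]);
    w.pos = 2;
    auto id = w.pin();  // data from offset 2 onward must stay
    w.pos = 6;
    w.compact();
    assert(w.base == 2 && w.window.length == 8);

    w.unpin(id);        // nothing holds offsets 2..5 any more
    w.compact();
    assert(w.base == 6 && w.window.length == 4);
}
```

A range-like view would then carry its own position plus a pin handle, which is what lets several views share one underlying window.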

I should stick to naming it BufferRange. The "real buffer" is an internal construct and BufferRanges share ownership of it.

This is exactly what I ended up with in my second branch.
"real" buffer:
https://github.com/blackwhale/datapicked/blob/fwd-buffer-range/dpick/buffer/buffer.d#L417
vs buffer range over it:
https://github.com/blackwhale/datapicked/blob/fwd-buffer-range/dpick/buffer/buffer.d#L152

Naming issues are apparently even worse than Walter implies.

> struct Buffer {...} // implements BOTH forward range and Buffer primitives
>
> struct OtherBuffer {...} // only implements Buffer primitives.
>
> If isBuffer is modified to not require isForwardRange, then both Buffer
> and OtherBuffer will work as buffers, but only Buffer works as a range.
>
> But as you have it now, isBuffer!OtherBuffer is false. Is this necessary?
>

I think I should call it BufferRange from now on, and bring my docs in line. It may make sense to provide the naked Buffer itself as a construct (it has a simpler interface) and have a generic BufferRange wrapper. I'm just not seeing user code using the former: too error-prone and unwieldy.

> So we can implement buffers that are both ranges and buffers, and will
> work with std.algorithm without modification (and that's fine and
> expected by me), but do we need to *require* that? Are we
> over-specifying?

Random-access ranges had to require .front/.back; maybe that was over-specification too? Some things are easy to index but not to "drop off an item at either end". But now it really doesn't matter: if there are such things, they are not random-access ranges.

> Is there a possibility that someone can invent a buffer
> that satisfies the primitives of say a line-by-line reader, but cannot
> possibly be a forward range?

I can hardly see that:
front --> lookahead(1)[0]
empty --> lookahead(1).length == 0
popFront --> seek(1) or enforce(seek(1))

save --> well, there has got to be a way to pin the data in the buffer?

And they surely could be better implemented inside of a specific buffer range.
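For illustration, the mapping above can be written as a generic adapter. This is only a sketch: ArrayBuffer is a hypothetical in-memory stand-in for a real stream window, the lookahead/seek signatures are assumed, and save here just copies the position, whereas a real buffer range would also have to pin the data.

```d
import std.algorithm : reduce;
import std.range : isForwardRange;

// Hypothetical buffer exposing only lookahead and seek.
struct ArrayBuffer
{
    const(ubyte)[] data; // stands in for the window over a stream
    size_t pos;

    // Up to n bytes at the current position, without consuming them.
    const(ubyte)[] lookahead(size_t n)
    {
        immutable end = pos + n > data.length ? data.length : pos + n;
        return data[pos .. end];
    }

    // Advance by n bytes; returns false if fewer than n remain.
    bool seek(size_t n)
    {
        if (pos + n > data.length) return false;
        pos += n;
        return true;
    }
}

// Builds the forward-range primitives from lookahead/seek.
struct AsRange(B)
{
    B buf;

    @property bool empty() { return buf.lookahead(1).length == 0; }
    @property ubyte front() { return buf.lookahead(1)[0]; }
    void popFront() { buf.seek(1); }
    @property AsRange save() { return this; } // copy of position only
}

static assert(isForwardRange!(AsRange!ArrayBuffer));

// The checksum example from earlier in the thread: a plain byte sum
// computed with std.algorithm.reduce over the adapted buffer.
uint checksum(const(ubyte)[] bytes)
{
    auto r = AsRange!ArrayBuffer(ArrayBuffer(bytes));
    return reduce!((a, b) => a + b)(0u, r);
}

unittest
{
    ubyte[] d = [1, 2, 3];
    assert(checksum(d) == 6);
}
```

The adapter needs nothing beyond lookahead and seek, which is why I hardly see a buffer that supports those two but cannot be at least an input range; save is the only part that needs real support from the buffer.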

-- 
Dmitry Olshansky

January 17, 2014

Re: [RFC] I/O and Buffer Range

Posted by Jakob Ovrum
in reply to Kagamin

On Friday, 17 January 2014 at 09:19:12 UTC, Kagamin wrote:
> On Thursday, 16 January 2014 at 15:55:07 UTC, Steven Schveighoffer wrote:
>> I am thinking of this layout for streams/buffers:
>>
>> 1. Unbuffered stream used for raw i/o, based on a class hierarchy (which I have pretty much written)
>> 2. Buffer like you have, based on a struct, with specific primitives. Its job is to collect data from the underlying stream, and present it to consumers as a random-access buffer.
>
> If you have a struct-based buffer, how would you enlarge the buffer? Won't it suffer from AA syndrome?

It is the lazily initialized nature of AAs that causes the problem. Structs can, but don't have to, replicate the behaviour.
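A small sketch of the difference, assuming a hypothetical struct Buf that holds its storage behind a pointer and is only ever built through a factory function (the names Buf, create and extend are invented for this example):

```d
// An AA passed by value: insertion into a null AA allocates storage
// that only the callee's copy refers to.
void addKey(int[string] aa) { aa["x"] = 1; }

// Hypothetical struct buffer with an eagerly allocated, shared payload.
struct Buf
{
    private static struct Payload { ubyte[] data; }
    private Payload* p;

    // Factory function: the payload exists before any copy is made.
    static Buf create(size_t n)
    {
        Buf b;
        b.p = new Payload;
        b.p.data = new ubyte[n];
        return b;
    }

    // Growing goes through the shared payload, so all copies see it.
    // A .init buffer has no payload and simply stays empty.
    void extend(size_t n)
    {
        if (p !is null)
            p.data.length = p.data.length + n;
    }

    @property size_t length() const
    {
        return p is null ? 0 : p.data.length;
    }
}

// Receives its argument by value, like addKey above.
void grow(Buf b) { b.extend(16); }

unittest
{
    int[string] aa;           // null until the first insertion
    addKey(aa);
    assert(aa.length == 0);   // the caller never sees the new key

    auto b = Buf.create(16);
    grow(b);                  // the copy shares the payload
    assert(b.length == 32);

    Buf e;                    // .init: no payload, never grows
    grow(e);
    assert(e.length == 0);
}
```

The struct avoids the surprise because it never allocates lazily on first use: either the payload already exists and is shared by every copy, or the buffer is .init and stays empty.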

January 17, 2014

Re: [RFC] I/O and Buffer Range

Posted by Kagamin
in reply to Dmitry Olshansky

On Friday, 17 January 2014 at 09:33:41 UTC, Dmitry Olshansky wrote:
> What's the problem? I don't see how struct vs. class changes anything there: it's a member field that is an array, which we can surely expand.

Ah, I thought one can copy the buffer, now I see you pass it by ref.

January 17, 2014

Re: [RFC] I/O and Buffer Range

Posted by Steven Schveighoffer
in reply to Dmitry Olshansky

On Fri, 17 Jan 2014 05:01:35 -0500, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

> BTW I haven't thought of writing into the buffer, but it works exactly the same. It could even be read/write: "discarded" data is written to the underlying stream, and freshly loaded data is read from the stream. For an input-only stream writes are no-ops; for an output-only stream reads are no-ops.
>
> With pinning it makes for cool multi-pass algorithms that actually output stuff into the file.

In my experience, the code/process for a write buffer is significantly different than the code for a read buffer. A read/write buffer is very difficult to make, because you either need 2 file pointers, or need to constantly seek the single file pointer to overwrite the existing data.

>> But as you have it now, isBuffer!OtherBuffer is false. Is this necessary?
>>
>
> I think I should call it BufferRange from now on, and bring my docs in line. It may make sense to provide the naked Buffer itself as a construct (it has a simpler interface) and have a generic BufferRange wrapper. I'm just not seeing user code using the former: too error-prone and unwieldy.

In a sense, I agree. There aren't many things to do directly with a buffer (i.e. there will likely be few filters), and a buffer range provides enough primitives to make front/popFront/empty trivial.

>> So we can implement buffers that are both ranges and buffers, and will
>> work with std.algorithm without modification (and that's fine and
>> expected by me), but do we need to *require* that? Are we
>> over-specifying?
>
> Random-access ranges had to require .front/.back; maybe that was over-specification too? Some things are easy to index but not to "drop off an item at either end". But now it really doesn't matter: if there are such things, they are not random-access ranges.
>
>> Is there a possibility that someone can invent a buffer
>> that satisfies the primitives of say a line-by-line reader, but cannot
>> possibly be a forward range?
>
> I can hardly see that:
> front --> lookahead(1)[0]
> empty --> lookahead(1).length == 0
> popFront --> seek(1) or enforce(seek(1))
>
> save --> well, there has got to be a way to pin the data in the buffer?
>
> And they surely could be better implemented inside of a specific buffer range.

I'll have to take a closer look at your code to have a reasonable response. But I think this looks fine so far.

-Steve

January 17, 2014

Re: [RFC] I/O and Buffer Range

Posted by Dmitry Olshansky
in reply to Steven Schveighoffer

17-Jan-2014 18:03, Steven Schveighoffer wrote:
> On Fri, 17 Jan 2014 05:01:35 -0500, Dmitry Olshansky
> <dmitry.olsh@gmail.com> wrote:
>
>> BTW I haven't thought of writing into the buffer, but it works exactly
>> the same. It could even be read/write: "discarded" data is written to
>> the underlying stream, and freshly loaded data is read from the stream.
>> For an input-only stream writes are no-ops; for an output-only stream
>> reads are no-ops.
>>
>> With pinning it makes for cool multi-pass algorithms that actually
>> output stuff into the file.
>
> In my experience, the code/process for a write buffer is significantly
> different than the code for a read buffer. A read/write buffer is very
> difficult to make, because you either need 2 file pointers, or need to
> constantly seek the single file pointer to overwrite the existing data.
>

Agreed, a read/write buffer is a bad idea. As for a write-only buffer implemented in the same style as the read-only one, well, I need to try it first.


-- 
Dmitry Olshansky
