October 16, 2017
On Monday, 16 October 2017 at 14:45:21 UTC, Steven Schveighoffer wrote:
> On 10/12/17 8:41 AM, Steven Schveighoffer wrote:
>> On 10/12/17 1:48 AM, Dmitry Olshansky wrote:
>>> On Thursday, 12 October 2017 at 04:22:01 UTC, Steven Schveighoffer wrote:
>>>> [...]
>>>
>>> Might be able to help you on that using WinAPI for I/O. (I assume bypassing libc is one of goals).
>> 
>> That would be awesome! Yes, the idea is to avoid any "extra" buffering. So using CreateFile, ReadFile, etc.
>
> Dmitry hold off on this if you were going to do it. I have been looking at Jason White's io library, and think I'm going to just extract all the low-level types he has there as a basic io library, as they are fairly complete, and start from there. His library includes the ability to use Windows.
>
Meh, not that I had much spare time to actually do anything ;)

Might help by reviewing what you have there.
> -Steve

October 16, 2017
On Monday, 16 October 2017 at 19:36:20 UTC, Dmitry Olshansky wrote:
>> Dmitry hold off on this if you were going to do it. I have been looking at Jason White's io library, and think I'm going to just extract all the low-level types he has there as a basic io library, as they are fairly complete, and start from there. His library includes the ability to use Windows.
>>
> Meh, not that I had much spare time to actually do anything ;)
>
> Might help by reviewing what you have there.

Started to work on unbuffered I/O, had it in mind for quite a while already.
http://github.com/MartinNowak/io



October 16, 2017
On Friday, 13 October 2017 at 17:08:18 UTC, Steven Schveighoffer wrote:
>> I also keep https://github.com/MartinNowak/bloom as an example/scaffold repo; it uses an automated docs setup with gh-pages.
>> 
>> Just create a doc deployment token (https://github.com/settings/tokens) with public_repo access and store that encrypted in your .travis.yml.
>
> Martin, I would appreciate (and I think many people would) a blog/tutorial on how to do this.

Indeed, that already crossed my mind a couple of times ;).
October 17, 2017
On 10/16/17 4:56 PM, Martin Nowak wrote:
> On Monday, 16 October 2017 at 19:36:20 UTC, Dmitry Olshansky wrote:
>>> Dmitry hold off on this if you were going to do it. I have been looking at Jason White's io library, and think I'm going to just extract all the low-level types he has there as a basic io library, as they are fairly complete, and start from there. His library includes the ability to use Windows.
>>>
>> Meh, not that I had much spare time to actually do anything ;)
>>
>> Might help by reviewing what you have there.
> 
> Started to work on unbuffered I/O, had it in mind for quite a while already.
> http://github.com/MartinNowak/io

Awesome!

Is the plan to put this into Phobos? If so, I would put it under std/experimental/io. However, if not, it should not be std/io.

Looks like it has all the stuff I had for my basic io type (and I see you have scatter read/write, that will help), so I will migrate iopipe to depend on it. I was thinking about using Jason White's io library, but I haven't seen him around in a while. Plus if this is going into Phobos, it would be the best thing for me to use.

Will pitch in when I can.

Thanks!

-Steve

October 17, 2017
>I was thinking about using Jason White's io library, but I haven't seen him around in a while
Yes, it would be interesting if you could take some things from his lib. He has a very good API.
October 18, 2017
On Tuesday, 17 October 2017 at 13:45:02 UTC, Suliman wrote:
>>I was thinking about using Jason White's io library, but I haven't seen him around in a while
> Yes, it would be interesting if you could take some things from his lib. He has a very good API.

I previously collaborated a bit on that library, as it was very close to the design I have had in mind for years. Unfortunately the lib seems unmaintained now and also went somewhat off-track with a std.socket wrapper (https://github.com/jasonwhite/io/commit/3bbe43954d9c11cc892da10f656f31fff863875a#diff-a5ef7b1ce67d62b95f9bdf019adc4784). But indeed I'll try to contact him.

Furthermore I want this to be very focused (no stat/fs functionality beyond what's necessary), but also add hooks for Fiber based async event loops.

It's really not that much work to write an unbuffered I/O library, so let's see where that goes.
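
To give a rough idea of the scope being discussed, here is a minimal sketch of an unbuffered file type. It is not code from any of the libraries mentioned in this thread: the name `RawFile` and its API are made up for illustration, it is POSIX-only, and a Windows version would presumably sit on CreateFile/ReadFile instead. The point is only that each `read` call maps to exactly one syscall, with no extra buffering in between.

```d
import core.sys.posix.fcntl : open, O_RDONLY;
import core.sys.posix.unistd : close, posixRead = read;
import std.string : toStringz;

/// Hypothetical unbuffered, read-only file handle (illustration only).
struct RawFile
{
    private int fd = -1;

    // The descriptor has a single owner, so copying is disabled.
    @disable this(this);

    this(string path)
    {
        fd = open(path.toStringz, O_RDONLY);
        if (fd == -1)
            throw new Exception("open failed: " ~ path);
    }

    ~this()
    {
        if (fd != -1)
            close(fd);
    }

    /// One read(2) call per invocation; returns the number of bytes
    /// actually read, which is 0 at end of file.
    size_t read(scope ubyte[] buf)
    {
        immutable n = posixRead(fd, buf.ptr, buf.length);
        if (n < 0)
            throw new Exception("read failed");
        return cast(size_t) n;
    }
}
```

A buffering layer such as iopipe would then sit on top of a type like this and decide how much to request per call.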
October 18, 2017
On Tuesday, 17 October 2017 at 12:28:28 UTC, Steven Schveighoffer wrote:
> Is the plan to put this into Phobos? If so, I would put it under std/experimental/io. However, if not, it should not be std/io.

I don't know yet how it will turn out, but phobos is very much in need of better Files and Sockets. Certainly the ambition is to write a standard-worthy library.

Honestly it seems to me that the std.experimental experiment didn't succeed. It's still too much overhead to develop in phobos (and get it reviewed/merged), there is no clear path from std.experimental -> std, and if something is well proven outside of phobos there is no point in putting it into std.experimental in the first place.

Developing std.io-v0.1.0 on dub until it reaches v1.0.0, seems like a straightforward and obvious approach. Also at our current community size, I'm hardly worried about namespace clashes.
Plus I'm already using std.internal.cstring as a workhorse to support any string-like ranges (including @nogc std.path ranges) and core.internal.string : unsignedToTempString to avoid the fat and exception-throwing formattedWrite (even the templated variant isn't nothrow).
October 19, 2017
On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
> What would be nice is a mechanism to detect this situation, since the above is both un-@safe and incorrect code.
> 
> Possibly you could instrument a window with a mechanism to check to see if it's still correct on every access, to be used when compiled in non-release mode for checking program correctness.
> 
> But in terms of @safe code in release mode, I think the only option is really to rely on the GC or reference counting to allow the window to still exist.

We should definitely find a @nogc solution to this, but it's a good
litmus test for the RC compiler support I'll work on.
Why do IOPipes have to hand over the window to the caller?
They could just implement the RandomAccessRange interface themselves.

Instead of
```d
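auto w = f.window();  // caller keeps its own view of the buffer
f.extend(random());   // may move or reallocate the buffer
w[0];                 // w may now be invalid: bug and possible memory corruption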
```
you could only do
```d
f[0];
f.extend(random());
f[0]; // bug, but no memory corruption
```

This problem seems to be very similar to the Range vs. Iterators difference, the former can perform bounds checks on indexing, the latter are inherently unsafe (with expensive runtime debug checks e.g. in VC++). Similarly always accessing the buffer through IOPipe would allow cheap bounds checking, and sure you could still offer IOPipe.ptr for unsafe code.
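
A small sketch may make the difference concrete. `ToyPipe` below is a made-up stand-in, not the real iopipe API: it fakes a data source and always reallocates on `extend` to mimic a buffer that moves. Because the buffer here is GC-backed, a stale window is merely outdated; with a manually managed buffer the same stale slice would dangle.

```d
/// Toy stand-in for an iopipe (hypothetical, for illustration only).
struct ToyPipe
{
    private ubyte[] buf;  // current buffer, GC-allocated in this sketch
    private ubyte next;   // fake data source producing 0, 1, 2, ...

    /// Hands the raw window to the caller, who may hold it across extend.
    ubyte[] window() { return buf; }

    /// Pulls in n more elements, always moving to a fresh allocation.
    size_t extend(size_t n)
    {
        auto fresh = new ubyte[](buf.length + n);
        fresh[0 .. buf.length] = buf[];
        foreach (ref b; fresh[buf.length .. $])
            b = next++;
        buf = fresh;
        return n;
    }

    /// Indexing through the pipe always sees the current buffer
    /// and is bounds-checked against its current length.
    ubyte opIndex(size_t i) { return buf[i]; }

    @property size_t length() const { return buf.length; }
}
```

After `auto w = f.window(); f.extend(4);` the slice `w` still has the old length and points at the old allocation, while `f[i]` and `f.length` reflect the extended buffer.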

-Martin
October 19, 2017
On 10/19/17 7:13 AM, Martin Nowak wrote:
> On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
>> What would be nice is a mechanism to detect this situation, since the
>> above is both un-@safe and incorrect code.
>>
>> Possibly you could instrument a window with a mechanism to check to see
>> if it's still correct on every access, to be used when compiled in
>> non-release mode for checking program correctness.
>>
>> But in terms of @safe code in release mode, I think the only option is
>> really to rely on the GC or reference counting to allow the window to
>> still exist.
> 
> We should definitely find a @nogc solution to this, but it's a good
> litmus test for the RC compiler support I'll work on.
> Why do IOPipes have to hand over the window to the caller?
> They could just implement the RandomAccessRange interface themselves.
> 
> Instead of
> ```d
> auto w = f.window();
> f.extend(random());
> w[0];
> ```
> you could only do
> ```d
> f[0];
> f.extend(random());
> f[0]; // bug, but no memory corruption
> ```

So the idea here (if I understand correctly) is to encapsulate the window into the pipe, such that you don't need to access the buffer separately? I'm not quite sure because of that last comment. If f[0] is equivalent to the previous code's f.window[0], then the second f[0] is not a bug; it's valid, and accesses the first element of the window (which may have moved).

But let me assume that was just a misunderstanding...

> 
> This problem seems to be very similar to the Range vs. Iterators
> difference, the former can perform bounds checks on indexing, the latter
> are inherently unsafe (with expensive runtime debug checks e.g. in VC++).

But ranges have this same problem.

For instance:
const(char[])[] lines = stdin.byLine.array;

Here, since byLine uses GC buffering, it's @safe (but wrong). If non-GC buffers are used, then it's not @safe.

I think as long as the windows are backed by GC data, it should be @safe. In this sense, your choice of buffering scheme can make something @safe or not @safe. I'm OK with that, as long as iopipes can be @safe in some way (and that happens to be the default).
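
For readers unfamiliar with the byLine pitfall: byLine reuses a single internal buffer, so the slices collected by `.array` all refer to storage that later lines overwrite. A minimal sketch of the usual fix follows; the helper name `readLines` is made up for this example, and `byLineCopy` from std.stdio is an equivalent ready-made alternative.

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.stdio : File;

/// Collects all lines of a file, copying each line out of byLine's
/// reused buffer before the range advances.
string[] readLines(string path)
{
    // idup makes an immutable copy, so later reads cannot overwrite it.
    return File(path).byLine.map!(l => l.idup).array;
}
```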

> Similarly always accessing the buffer through IOPipe would allow cheap
> bounds checking, and sure you could still offer IOPipe.ptr for unsafe code.

It's an interesting idea to simply make the iopipe the window, not just for @safety reasons:

1. this means the iopipe itself *is* a random access range, allowing it to automatically fit into existing algorithms.
2. Existing random-access ranges can be easily shoehorned into being iopipes (I already did it with arrays, and it's not much harder with popFrontN). Alternatively, code that uses iopipes can simply check for the existence of iopipe-like methods, and use them if they are present.
3. Less verbose usage, and more uniform access. For instance if an iopipe defines opIndex, then iopipe.window[0] and iopipe[0] are possibly different things, which would be confusing.

Some downsides however:

1. iopipes can be complex and windows are not. They were a fixed view of the current buffer. The idea that I can fetch a window of data from an iopipe and then deal simply with that part of the data was attractive.

2. The iopipe is generally not copyable once usage begins. In other words, the feature of ranges that you can copy them and they just work, would be difficult to replicate in iopipe.

A possible way forward could be:

* iopipe is a random-access range (not necessarily a forward range).
* iopipe.window returns a non-extendable window of the buffer itself, which is a forward/random-access range. If backed by the GC or some form of RC, everything is @safe.
* Functions which now take iopipes could be adjusted to take random-access ranges, and if they are also iopipes, could use the extend features to get more data.
* iopipe.release(size_t) could be hooked by popFrontN. I don't like the idea of supporting slicing on iopipes, for the non-forward aspect of iopipe. Much better to have an internal hook that modifies the range in-place.

This would make iopipes fit right into the range hierarchy, and therefore could be integrated easily into Phobos.

In fact, I can accomplish most of this by simply adding the appropriate range operations to iopipes. I have resisted this in the past but I can't see how it hurts.
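
A rough sketch of what "adding the appropriate range operations" could look like, under loud assumptions: `RangePipe` is a toy type invented here, not iopipe's actual API, and the data source is faked. It has no `save` and no slicing, so it is an input range with indexing and length rather than a Phobos random-access range. As far as I know, std.range's popFrontN has no customization hook today, so for a type like this it falls back to calling popFront repeatedly, which forwards to `release(1)`.

```d
import std.range.primitives : hasLength, isInputRange, popFrontN;

/// Toy iopipe with range primitives bolted on (hypothetical).
struct RangePipe
{
    private ubyte[] buf;  // current window
    private ubyte next;   // fake data source producing 0, 1, 2, ...

    /// iopipe-style: pull n more elements into the window.
    size_t extend(size_t n)
    {
        immutable old = buf.length;
        buf.length = old + n;
        foreach (ref b; buf[old .. $])
            b = next++;
        return n;
    }

    /// iopipe-style: drop n elements from the front of the window.
    void release(size_t n) { buf = buf[n .. $]; }

    // Range primitives, all forwarding to the current window.
    @property bool empty() const { return buf.length == 0; }
    @property ubyte front() const { return buf[0]; }
    void popFront() { release(1); }
    ubyte opIndex(size_t i) const { return buf[i]; }
    @property size_t length() const { return buf.length; }
}

static assert(isInputRange!RangePipe);
static assert(hasLength!RangePipe);
```

Generic code can then treat the pipe as a range and, when it detects `extend`, ask for more data once the window runs out.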

For Phobos inclusion, however, I don't know how to reconcile auto-decoding. I absolutely need to treat buffers of char, wchar, and dchar data as normal buffers, and not something else. This one thing may keep it from getting accepted.
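
To illustrate the auto-decoding concern: Phobos range primitives treat `char[]` as a range of `dchar`, so generic algorithms see code points rather than the buffer's elements. The two helper functions below are made up for this example; `walkLength` and `byCodeUnit` are existing Phobos functions.

```d
import std.range : walkLength;
import std.utf : byCodeUnit;

/// Counts code points: the range primitives auto-decode narrow strings.
size_t countCodePoints(const(char)[] s)
{
    return s.walkLength;
}

/// Counts UTF-8 code units: byCodeUnit exposes the raw char buffer.
size_t countCodeUnits(const(char)[] s)
{
    return s.byCodeUnit.length;
}
```

For `"h\u00E9llo"` the first function returns 5 and the second returns 6, because the accented letter occupies two UTF-8 code units. A buffer-oriented library needs the second view.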

-Steve
October 21, 2017
On 10/19/2017 03:12 PM, Steven Schveighoffer wrote:
> On 10/19/17 7:13 AM, Martin Nowak wrote:
>> On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
>>> What would be nice is a mechanism to detect this situation, since the above is both un-@safe and incorrect code.
>>>
>>> Possibly you could instrument a window with a mechanism to check to see if it's still correct on every access, to be used when compiled in non-release mode for checking program correctness.
>>>
>>> But in terms of @safe code in release mode, I think the only option is really to rely on the GC or reference counting to allow the window to still exist.
>>
>> We should definitely find a @nogc solution to this, but it's a good
>> litmus test for the RC compiler support I'll work on.
>> Why do IOPipes have to hand over the window to the caller?
>> They could just implement the RandomAccessRange interface themselves.
>>
>> Instead of
>> ```d
>> auto w = f.window();
>> f.extend(random());
>> w[0];
>> ```
>> you could only do
>> ```d
>> f[0];
>> f.extend(random());
>> f[0]; // bug, but no memory corruption
>> ```
> 
> So the idea here (If I understand correctly) is to encapsulate the window into the pipe, such that you don't need to access the buffer separately? I'm not quite sure because of that last comment. If f[0] is equivalent to previous code f.window[0], then the second f[0] is not a bug, it's valid, and accessing the first element of the window (which may have moved).

The above sample with the window is a bug and memory corruption because
of iterator/window invalidation by extend.
If you didn't think of the invalidation, then the latter example would
still be a bug to you, but not a memory corruption.

>> This problem seems to be very similar to the Range vs. Iterators difference, the former can perform bounds checks on indexing, the latter are inherently unsafe (with expensive runtime debug checks e.g. in VC++).
> 
> But ranges have this same problem.
> 
> For instance:
> const(char[])[] lines = stdin.byLine.array;
> 
> Here, since byLine uses GC buffering, it's @safe (but wrong). If non-GC buffers are used, then it's not @safe.
> 
> I think as long as the windows are backed by GC data, it should be @safe. In this sense, your choice of buffering scheme can make something @safe or not @safe. I'm OK with that, as long as iopipes can be @safe in some way (and that happens to be the default).
> 
>> Similarly always accessing the buffer through IOPipe would allow cheap bounds checking, and sure you could still offer IOPipe.ptr for unsafe code.
> 
> It's an interesting idea to simply make the iopipe the window, not just for @safety reasons:
> 
> 1. this means the iopipe itself *is* a random access range, allowing it
> to automatically fit into existing algorithms.
> 2. Existing random-access ranges can be easily shoehorned into being
> iopipes (I already did it with arrays, and it's not much harder with
> popFrontN). Alternatively, code that uses iopipes can simply check for
> the existence of iopipe-like methods, and use them if they are present.
> 3. Less verbose usage, and more uniform access. For instance if an
> iopipe defines opIndex, then iopipe.window[0] and iopipe[0] are possibly
> different things, which would be confusing.
> 
> Some downsides however:
> 
> 1. iopipes can be complex and windows are not. They were a fixed view of the current buffer. The idea that I can fetch a window of data from an iopipe and then deal simply with that part of the data was attractive.

You could still have a window internally and just forward to that.

> 2. The iopipe is generally not copyable once usage begins. In other words, the feature of ranges that you can copy them and they just work, would be difficult to replicate in iopipe.

That's a general problem. Unique ownership is really useful, but most phobos range methods don't care, and assume copying is implicit saving. Not too nice and I guess this will bite us again with RC/Unique/Weak.

The current workaround for this is `refRange`.
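
A minimal sketch of that workaround, closely following the example in the std.range documentation: `find` takes its range by value, so without the wrapper it advances a copy. The helper name `skipTo` is made up for this example.

```d
import std.algorithm.searching : find;
import std.range : refRange;

/// Advances `buffer` itself to the first occurrence of `needle`.
void skipTo(ref ubyte[] buffer, ubyte needle)
{
    // refRange wraps a pointer to the range, so the popFront calls
    // made inside find act on the original rather than on a copy.
    find(refRange(&buffer), needle);
}
```

With `ubyte[] b = [1, 9, 45, 12, 22];`, calling `skipTo(b, 45)` leaves `b` as `[45, 12, 22]`.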

> A possible way forward could be:
> 
> * iopipe is a random-access range (not necessarily a forward range).
> * iopipe.window returns a non-extendable window of the buffer itself,
> which is a forward/random-access range. If backed by the GC or some form
> of RC, everything is @safe.
> * Functions which now take iopipes could be adjusted to take
> random-access ranges, and if they are also iopipes, could use the extend
> features to get more data.
> * iopipe.release(size_t) could be hooked by popFrontN. I don't like the
> idea of supporting slicing on iopipes, for the non-forward aspect of
> iopipe. Much better to have an internal hook that modifies the range
> in-place.
> 
> This would make iopipes fit right into the range hierarchy, and therefore could be integrated easily into Phobos.

I made an interesting experiment with buffered input ranges quite a
while ago.
https://gist.github.com/MartinNowak/1257196

This uses popFront to fetch new data and ref-counts a list of buffers, keeping earlier buffers alive while older saved ranges still use them. With a bit of creative use, the existing Range primitives could be used to implement infinite look-ahead.

auto beg = rng.save;
auto end = rng.find("bla");
auto window = beg[0 .. end]; // get a random access window

The main problem with this has been that the many implicit copies (e.g.
in foreach) bump the reference-count, so the RC buffer release would
often not work.
Could be avoided by making them non-copyable, but again phobos and
foreach currently don't support this hybrid of input (consuming) and
forward (saveable) range.

-Martin