September 04, 2011
== Quote from Jonathan M Davis (jmdavisProg@gmx.com)'s article
> Any overhaul of existing functionality needs to improve on existing
> functionality. Changes just to change aren't valuable. So, changes should
> generally avoid breaking backwards compatibility unless we gain something
> from it. So, as long as these changes are an overall improvement, then we'll
> just have to deal with the code breakage. However, if the code breakage
> doesn't actually gain us anything, then we should avoid it. So, complaints
> about code breakage are valid, but they aren't deal breaking.
> - Jonathan M Davis

I mostly agree with what you said, except that this proposal breaks a frequently used standard library module severely and without a clear gradual migration path.
September 04, 2011
== Quote from Andrej Mitrovic (andrej.mitrovich@gmail.com)'s article
> Seems to me like virtually every module in Phobos gets a complete rewrite sooner or later. Yikes! Afaik the upcoming ones are also std.xml, std.variant, maybe std.json too? (can't recall). Was there really so much bad code written in Phobos all along that they all require a rewrite?

It's really amazing how much cruft 2-3 year old D code tends to have: workarounds for compiler bugs, workarounds for previously missing features, a generally lower standard for quality before we implemented a proper review process, etc. Heck, I've got a pull request on GitHub that rewrites a substantial portion of std.parallelism to take advantage of better implementations I've found for parallel foreach and amap, fix a couple of bugs and get rid of tons of cruft, and this module has only been in Phobos a few months. These changes are purely under the hood, though, and there should be zero code breakage.
September 04, 2011
On 9/3/11 9:53 PM, Walter Bright wrote:
> On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
>> However, if the code breakage
>> doesn't actually gain us anything, then we should avoid it. So,
>> complaints
>> about code breakage are valid, but they aren't deal breaking.
>
> The larger the amount of code that is broken, the more gain there must
> be to justify it.
>
> Breaking std.stdio, which is used everywhere, this thoroughly needs a
> very high bar of justification.

I agree. I'm hoping the new stuff could build on top of std.stdio.

Andrei
September 04, 2011
On 9/3/11 10:11 PM, Steven Schveighoffer wrote:
> On Sat, 03 Sep 2011 17:20:53 -0400, dsimcha <dsimcha@yahoo.com> wrote:
>
>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
>> article
>>> Hello,
>>> There are a number of issues related to D's current handling of streams,
>>> including the existence of the imperfect etc.stream and the
>>> over-specialization of std.stdio.
>>> Steve has worked on an extensive overhaul of std.stdio which would
>>> obviate the need for etc.stream and would improve both the generality
>>> and efficiency of std.stdio.
>>> Please chime in with feedback; he's away from the Usenet but allowed me
>>> to post this on his behalf. I uploaded the docs to
>>> http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html
>>> Thanks,
>>> Andrei
>>
>> After a quick look, I have two concerns:
>>
>> 1. File is a class, not a struct. This precludes using reference
>> counting as the
>> current std.stdio.File does, meaning you have to close all your Files
>> manually. I
>> loved the reference counting semantics, especially the last few
>> releases since
>> most of the relevant compiler bugs have been fixed.
>
> As long as a class can contain a File as a member, this argument makes
> no sense to me. In other words, it's impossible to remove the GC from
> the File destructor/refcounting system.

The meaning of the argument is that just because there is the possibility of a File leaking, we shouldn't increase the likelihood of such a leak.
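To make the reference-counting point concrete, here is a minimal sketch against the current struct-based std.stdio.File, not the proposed class. The file name and the function name are placeholders made up for illustration.

```d
import std.file : remove;
import std.stdio : File;

// Hypothetical helper: reports whether the handle survives the inner scope.
bool stillOpenAfterInnerScope()
{
    enum name = "refcount_demo.txt"; // placeholder scratch file
    bool open;
    {
        File outer;
        {
            auto f = File(name, "w");
            f.writeln("hello");
            outer = f;          // second reference to the same handle
        }                       // f is destroyed, but outer keeps the handle alive
        open = outer.isOpen;
    }                           // last reference gone: the file is closed here
    remove(name);               // clean up the scratch file
    return open;
}

void main()
{
    assert(stillOpenAfterInnerScope());
}
```

No explicit close() appears anywhere. With a class-based File, the same code would have to close the file by hand or wait for the GC to finalize it.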


Andrei
September 04, 2011
On Sat, 03 Sep 2011 21:23:26 -0400, Walter Bright <newshound2@digitalmars.com> wrote:

> On 9/3/2011 3:53 PM, dsimcha wrote:
>> Agreed, but in the big picture this overhaul still breaks way too much code
>> without either a clear migration path or a clear argument about why such extensive
>> breakage is necessary.  The part about File(someFileName, someMode) is just the
>> first thing I noticed.
>
> [rant]
>
> I agree. I agree that std.stream should be replaced, but I have a lot of misgivings about replacing std.stdio. I do not want to rewrite every darn D program I've ever written. I think it is a bad idea to break everyone else's D program.
>
> Everything in dsource will break in non-trivial ways. I don't think we can afford this. I do not know of any successful system or language that breaks user code with such aplomb as D does. Not even C++ dares to break that Piece Of S*** that everyone knows iostreams is. I can compile and run unix C code from 30 years ago on Linux with no changes at all. Same with DOS code.
>
> There needs to be huge improvement to justify such breakage.
>
> [I also don't like it that all my code that uses std.path is now broken.]
>
> I would prefer to see all the energy that is going into refactoring existing, working modules go into designing new, not existing, modules that there's a crying need for.
>
> [/rant]

Please, leave all pitchforks and torches at rest for the moment :)  I want to stress, this is *NOT* a proposal for inclusion or generating a pull request tomorrow.  It's a very very early version, almost a proof of concept, to show *why* we need to change things.  Most of the library is up for debate.  I agree it needs to be more compatible with current code.

In hindsight, I probably should have said no when Andrei asked to post this on the NG, and posted it myself when I could stress the state it is in. The two most important things are:

1. the interface additions, in particular the readUntil portion (which I think provides a very powerful interface for parsing systems).
2. the performance. It's much better than the current std.stdio. Aren't people continuously complaining about how slow I/O is in Phobos compared to other libraries?

> Enough ranting for now, as for the proposed std.stdio,
>
> 1. It does look fairly straightforward, but:
>
> 2. There is only one example. Have any commonly done programming tasks been tried out with it to see how they work?

My main testing has been for:

1. UTF input/output correctness of all formats
2. implementing readf/writef
3. testing performance.

I have not written any "real world" tests.  Probably the most interesting tests I've written are reading a UTF-X file and writing the data to a UTF-Y file (where X and Y are one of UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE).
>
> 3. There is no indication of how it interacts with C stdio. A primary goal of std.stdio was interoperability with C stdio.

useCStdio();

>
> 4. There are no benchmarks. The current std.stdio was designed/written in parallel with some benchmarks Andrei and others cooked up, as a primary goal was performance.

I can include these.

>
> 5. flushCheck - flushing should be done based on the file type. tty's should be \n flushed, files when the buffer is full. I question the performance of using a delegate to check for flushing. How often will it be called?

Once per write to the buffer.  Data is only checked once (the delegate is never given the same data to check again).  If you want, I can look at adding a means to avoid using a delegate when the trigger is a single character.
TextInput/TextOutput also auto-detect whether a device is a tty, and install the right flushCheck function if necessary.
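As a rough illustration of "the delegate sees each chunk of data exactly once", here is a minimal sketch. It is not the code from the proposal: the struct, its fields and the helper function are all hypothetical names invented for this example.

```d
// Hypothetical buffered writer; not the proposal's actual types.
struct SketchBuffer
{
    char[] pending;                                 // data written but not yet flushed
    bool delegate(const(char)[] fresh) flushCheck;  // called once per write, on new data only
    void delegate(const(char)[] data) sink;         // receives flushed data

    void put(const(char)[] data)
    {
        pending ~= data;
        // Only the newly written chunk is checked, never older data.
        if (flushCheck !is null && flushCheck(data))
            flush();
    }

    void flush()
    {
        if (pending.length)
            sink(pending);
        pending.length = 0;
    }
}

// Runs a tiny scenario with a newline-triggered flush (the tty-style policy).
string[] lineFlushDemo()
{
    string[] flushed;
    SketchBuffer buf;
    buf.sink = delegate void(const(char)[] d) { flushed ~= d.idup; };
    buf.flushCheck = delegate bool(const(char)[] fresh) {
        foreach (c; fresh)
            if (c == '\n')
                return true;
        return false;
    };
    buf.put("hello ");   // no newline yet, nothing is flushed
    buf.put("world\n");  // newline seen, the whole buffer is flushed
    return flushed;
}

void main()
{
    assert(lineFlushDemo() == ["hello world\n"]);
}
```

A file-backed stream would simply install no check (or one that always returns false) and flush only when the buffer fills up.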

> 6. There is no provision for multithreaded writing, i.e. what happens when two threads write to stdout. Ideally, there should be a way to 'lock' the stream to oneself, in order to appropriately interleave the output.

Again, I wish I had not told Andrei to post :(  Multithreading is not supported yet, but it will be. When that is ready, a locking mechanism (and hopefully an auto-unlock mechanism) will be provided.
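For comparison, the current std.stdio already offers a scoped lock of this kind through stdout.lockingTextWriter(). The sketch below uses only that existing API, not anything from the new proposal. The report function and the thread labels are made up for the example.

```d
import core.thread : Thread;
import std.stdio : stdout;

// Writes two lines that stay together even if another thread is printing.
void report(string who)
{
    auto w = stdout.lockingTextWriter(); // takes the stream lock
    w.put(who);
    w.put(": line one\n");
    w.put(who);
    w.put(": line two\n");
}                                        // lock released when w goes out of scope

void main()
{
    auto t = new Thread({ report("worker"); });
    t.start();
    report("main");
    t.join();
}
```

The order of the two blocks depends on scheduling, but the two lines from each caller are not interleaved with the other's.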

> 7. I see nothing for 'raw' character by character input.

The interface is geared to read by processing the buffer, not one character at a time.  Given access to the buffer, you can process one character at a time if you want.

See InputRange in TextInput to see how raw character-by-character input can be done.

That being said, I think I need to add a peek function.

>
> 8. I see nothing for determining if a char is available on the input. How would one implement "press any key to continue"?

I need more information. I would probably implement this as a read(ubyte[1]), so I don't see why it couldn't be done that way.

-Steve
September 04, 2011
On Sat, 03 Sep 2011 18:55:08 -0400, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:

> I dislike naming things with a leading "D" like "DInput". Shouldn't we
> keep code that relies on C to be put in etc.c or somewhere?

I think the names are not great.  The names are somewhat based on the metamorphosis of the entire interface structure.

What about BufferedInput and BufferedOutput?  Michel Fortin suggested those.

-Steve
September 04, 2011
On Sat, 03 Sep 2011 20:47:05 -0400, Walter Bright <newshound2@digitalmars.com> wrote:

> What happens if I write:
>
>     printf("hello ");
>     writeln("world");

useCStdio();

This makes all the standard handles C-based.

And crap, I see I did not document it.... grr....

See here:

https://github.com/schveiguy/phobos/blob/new-io/std/stdio.d#L3332

Sorry....
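For reference, here is Walter's snippet as a complete program against the current std.stdio, where writeln goes through the same C FILE* as printf. The greet wrapper is just a name picked for this example. Under the proposed module, the post above says a useCStdio() call would be needed first; that call is shown only as a comment because it does not exist in the current library.

```d
import core.stdc.stdio : printf;
import std.stdio : writeln;

void greet()
{
    // Proposed module only, per the post above: useCStdio();
    printf("hello ");   // goes into C stdio's stdout buffer
    writeln("world");   // current std.stdio writes through the same FILE*
}

void main()
{
    greet();
}
```

With the current std.stdio the two calls share one buffer, so the output should come out in order on a single line.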

-Steve
September 04, 2011
On Sat, 03 Sep 2011 21:58:09 -0400, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

Please read the previous comment; it includes a link to the source as well as further explanations...

Boy, I could have planned this better...

-Steve
September 04, 2011
On 9/3/11 10:02 PM, Andrej Mitrovic wrote:
> Seems to me like virtually every module in Phobos gets a complete
> rewrite sooner or later. Yikes! Afaik the upcoming ones are also
> std.xml, std.variant, maybe std.json too? (can't recall). Was there
> really so much bad code written in Phobos all along that they all
> require a rewrite?

It's not that bad. First, it's understandable that now there are considerably more contributors and it's a bit easier tinkering with existing stuff than coming up with all new stuff.

Second, historically we're at an all-time high of talent involved in D. I'm sure it will go up much more, but previously we've had a more accepting attitude to new functionality at the cost of scrutiny (e.g. std.xml and std.json, both written by episodic contributors). (I really regret having had that attitude, it hurt us.) So now that there are so many eyeballs focused on the code, and not just any eyeballs but eyeballs connected to good brains, there is pressure building up.

There are quite a few pieces in Phobos that are withstanding scrutiny quite well: getopt, algorithm, variant (which can be, I think, safely extended with great new functionality), range, conv, random, and more. There are, unfortunately, others that didn't start off on the right foot and right now are somewhat of an eyesore. I trust we will figure out what to do about each on a case-by-case basis, though I agree with Walter that we should balance the breakage cost with correspondingly high rewards in terms of functionality improvements.


Andrei
September 04, 2011
Ah, reading your post I see this is just a start of the overhaul. I assumed this was already getting ready for a review. Names can be fixed eventually. :)