June 24, 2004
On Thu, 24 Jun 2004 20:32:52 +0000 (UTC), Sean Kelly <sean@f4.ca> wrote:
> In article <cbf8jg$221b$1@digitaldaemon.com>, Arcane Jill says...
>>
>> I'd be quite happy if std.stream were to be improved. Here are some suggestions.
>> You'll probably think that many of them are trivial, but each, in their own way,
>> contributes a small amount of annoyance, and I'm sure these things could be
>> easily got rid of.
>>
>> 1) Since it is more normal to want buffered file access than non-buffered file
>> access (in C, fopen() is called more often than open()), it makes sense that
>> File should be buffered by default, and there should be a separate class, maybe
>> called RawFileStream or something, for the unbuffered case.
>
> I was actually going to take a different approach and modify BufferedStream like
> so:
>
> class BufferedStream( BaseStream ) : BaseStream {
> // override the low-level i/o methods to do buffering
> }
>
> So a buffered file stream would be:
>
> alias BufferedStream!(FileStream) BufferedFileStream;

I agree. This seems most logical to me.

> But as you say, file i/o is almost always buffered, so it may make sense to
> change the name of "FileStream" to "UnbufferedFileStream" and thus make buffered
> file i/o the default.

Perhaps. Files may be an exception to the rule, but if you can handle that exception as you have shown above, at no cost, then why not?

>> 2) File should in any case be renamed FileStream
>>
>> 3) FileMode.In and FileMode.Out should be renamed FileMode.IN and FileMode.OUT
>> respectively.
>
> Both already done :)  Well, it was going to be Stream.IN and Stream.OUT, but
> same thing.

Excellent. Generic names are good, i.e. you could have a template that took a stream type (File, Socket, etc.) and pass it the same IN, OUT, etc. constants.

>> 5) In fact, all possible combinations of file opening supported by fopen()
>> should be supported by File. It should be possible to assert that the file does
>> or does not exist before opening it (atomically), to truncate or not truncate,
>> to position the file pointer at the start or end of the file, to allow
>> append-only access, etc.
>
> Right now the file stuff uses CreateFile in Windows.  Would it be better to use
> fopen and the other ANSI calls instead?

No. Well, it depends on whether you want to do the buffering yourself (i.e. using D arrays etc.) or make use of the existing fopen buffering.

If CreateFile on Windows and open on Unix with your own buffering is more efficient, then go that way.

>> 7) EITHER Stream classes should be auto (likely to be an unpopular suggestion, I
>> know), OR there should be an auto wrapper class that you can construct from a
>> Stream, in order to guarantee that the file will be closed in the event of an
>> exception (which could of course be thrown by ANY piece of code). Currently we
>> have to either roll our own auto wrapper, or use a try/catch block.
>
> Interesting idea.  This may be another good template wrapper:
>
> auto class MakeAuto( BaseClass ) : BaseClass {}
>
>> 9) There should be a FilterStream class, from which BufferedStream inherits, so
>> that we can write our own stream filters. (Java does this. It's neat).
>>
>> 10) Streams don't necessarily have to do transcoding (see - I learnt a new
>> word), but nonetheless it should be POSSIBLE to construct them from a
>> Reader/Writer in order to make such extensions possible in the future.
>
> This kind of stuff should be saved for a later discussion.  All great ideas but
> they're the tip of a rather large iceberg.
>
>> 11) I want the function available(), as Java has. A buffered stream always knows
>> how much it's got left in its buffer, and I have no problem with an unbuffered
>> stream returning zero.
>
> Easy enough.  I was going to add this to the BufferedStream class, though
> perhaps it would be useful everywhere?
>
>> 12) stdin, stdout and stderr should be globally available D streams. (Maybe they
>> are already, but point (8) means there's a lot I don't know about existing
>> capabilities)
>
> I think they are.  I kind of consider Phobos to still be in the state where
> looking at the source files is the best way to find out what's available.  Doxygen
> or other documentation is crucial, but I have a feeling that the stream API will
> be in flux for quite some time yet.
>
>> 13) Streams should overload the << and >> operators. (Someone suggested using ~.
>> That would be fine too).
>
> Overloading ~ wouldn't work I'm afraid, unless there's something I'm missing.
> Say I have a FileStream and I want to both read and write from it.  How do I
> know which I want to do if I'm using the same operator for both?
>
>> None of these is particularly difficult in and of itself, but together they add
>> up to a frustrating gripe list. But I'm fairly confident that if these flaws are
>> fixed (along with any other gripes which others may mention in the course of
>> this thread) then I imagine that most people will be pretty happy with new
>> improved std.stream.
>
> I agree.  But as D is still pretty early in its development I don't really want
> people to be happy with anything if it means losing constructive dialog.  I'm
> willing to sacrifice productivity in the short term if it means a better library
> in the long term.
>
>> But there is logic behind it. Currently, D does no transcoding - that is,
>> writeLine() will spit out raw UTF-8. Now that's fine if your output is going to
>> a text file, but if it's going to a console, you're screwed. Now you COULD
>> simplify this a bit by "automatically" encoding the output in the operating
>> system default encoding - but that would just reverse the problem. Now, output
>> to the console would be fine, but output destined to leave your machine and end
>> up on someone else's machine (e.g. text file, socket, etc.) would also be
>> similarly munged. UTF-8 is pretty much the best portable format, so ideally you
>> only want to encode at the last minute, just before the stream hits the user.
>
> This is the start of a rather long discussion.  I've had enough issues come up
> with formatted i/o that I'm going to leave that aspect rather bare and see what
> develops.  For the moment, I'm starting to think that handling plain ol' ASCII
> is probably enough until the rest can be worked out.

What do you think of my filters idea? As long as you can snap any number of filters onto streams and onto each other, your data will be transcoded (etc.) from one end to the other, and back again in the other direction.

Regan.

June 25, 2004
"Arcane Jill" <Arcane_member@pathlink.com> wrote in message
news:cbf8jg$221b$1@digitaldaemon.com
|
| ...
|
| 7) EITHER Stream classes should be auto (likely to be an unpopular suggestion, I
| know), OR there should be an auto wrapper class that you can construct from a
| Stream, in order to guarantee that the file will be closed in the event of an
| exception (which could of course be thrown by ANY piece of code). Currently we
| have to either roll our own auto wrapper, or use a try/catch block.
|
| ...
|
| Arcane Jill

You can do:

auto File myFile = new File (...);

And it'll work just as if File was auto.

-----------------------
Carlos Santander Bernal


June 25, 2004
"Carlos Santander B."  wrote
> You can do:
>
> auto File myFile = new File (...);
>
> And it'll work just as if File was auto.

Yeah, isn't that cool? I think it's the biz.


June 25, 2004
-1

I love mango.io, truly I do.  I've been using it for pretty much everything since it was still called DSC.  /But/ there are still times when I just want to pump something small out, and get it done fast, and at times std.stream is quite useful for that.  I'd just as soon continue to have std.stream get improved (and split into multiple modules, frankly, i.e. std.io.(stream, file, mem, ...)) and keep mango as a "non-standard" independent open-source library that just happens to kick some arse.  :)

-Chris S.
-Invironz
June 25, 2004
In article <cbfdpk$2aku$1@digitaldaemon.com>, Sean Kelly says...

>>5) In fact, all possible combinations of file opening supported by fopen() should be supported by File. It should be possible to assert that the file does or does not exist before opening it (atomically), to truncate or not truncate, to position the file pointer at the start or end of the file, to allow append-only access, etc.
>
>Right now the file stuff uses CreateFile in Windows.  Would it be better to use fopen and the other ANSI calls instead?

Probably not, but I was talking about capabilities, not implementation. On the
Windows platform, CreateFile() is presumably better because (a) fopen() is
written in terms of CreateFile() anyway, (b) CreateFile() lets you open UNC
pathnames, device-drivers (with paths starting with "//?/"), and so on. Also, I
believe that CreateFile() can cope with NTFS "streams" (which is one of
Microsoft's dumber ideas, but it's there) whereas fopen() can't. So, on the
whole, I think you made the right choice in choosing the method with the most
capabilities. The gripe, however, is that the most basic of those capabilities
are not passed on to the Phobos user. You can't do, for example, the equivalent
of fopen(filename, "a+").

From a user's point of view, fopen() is easy to use, and CreateFile() is hard to
use. fopen() can be used blindfold, asleep, and/or half drunk, but CreateFile()
requires a trip to the manual every time, and about ten lines of code. So
ideally, you'd want the SIMPLICITY of fopen(), but the POWER of CreateFile().
Maybe that's too much to ask. In any case, if std.stream.Streams are less
powerful than fopen(), then in many cases, we'll be forced to go back to using
fopen(), fgets(), fread(), etc., simply because std.streams don't cut the
mustard. fopen() functionality has to be MINIMAL functionality for std.streams.

Oh - one other thing I forgot. I think we need functions like basename(),
dirname(), pathinfo(), realpath() and so on, (stolen from PHP), and some
function to append a pathname-component to a pathname. Of course, these things
are dead easy to do with ordinary string manipulation ... IF you assume that the
file separator is "/". But that won't work on a Mac. Such functions would let us
manipulate pathnames in a platform independent way. (These should go in
std.file, not std.stream, obviously).
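
A minimal sketch of the kind of helper meant here, assuming the separator is supplied by the caller (or by a platform layer) rather than hard-coded. The names appendComponent and lastComponent, and their signatures, are hypothetical and used only for illustration:

```d
// Hypothetical helpers: the separator is a parameter, never assumed to be '/'.

// Appends one pathname component, adding a separator only when needed.
string appendComponent(string dir, string name, char sep)
{
    if (dir.length == 0)
        return name;
    if (dir[$ - 1] == sep)
        return dir ~ name;
    return dir ~ sep ~ name;
}

// Returns everything after the last separator, or the whole path if none.
string lastComponent(string path, char sep)
{
    foreach_reverse (i, c; path)
    {
        if (c == sep)
            return path[i + 1 .. $];
    }
    return path;
}
```

With ':' as the separator, appendComponent("Macintosh HD:", "Docs", ':') would leave the existing trailing separator alone, and the same code works unchanged with '/' or '\\'.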

Arcane Jill


June 25, 2004
In article <opr94eenw35a2sq9@digitalmars.com>, Regan Heath says...

>If we want to stream it [File], we pass it into the constructor of a Stream or BufferedStream

A File /IS/ a stream. How could it not be? Sorry, I just didn't understand you here.



>> 11) I want the function available(), as Java has. A buffered stream
>> always knows
>> how much it's got left in its buffer, and I have no problem with an
>> unbuffered
>> stream returning zero.
>
>Isn't this true for a normal unbuffered file as well? At the point of opening you know how big it is. It could grow, but until you reach that initial size you know whether there is more or not, etc.

Ah - now it's I who was misunderstood. Allow me to clarify. available() must return a number which is less than or equal to the number of bytes which may be read from a stream ... and this is the important part ... WITHOUT BLOCKING. available() MUST return immediately, without causing a thread-switch. It must *NOT* return the number of bytes left in a file - unless all of them are already buffered.

This is SO important in bits of code which MUST NOT WAIT.
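
A minimal sketch of that contract, assuming a hypothetical Available interface and a purely in-memory source (neither is part of std.stream). Because all of the data is already in memory, the in-memory case is the one where reporting everything that is left is legitimate:

```d
// Hypothetical interface; not an existing std.stream type.
interface Available
{
    // Lower bound on the bytes that read() can deliver without blocking.
    size_t available();
}

// An in-memory source, used only to illustrate the contract.
class MemorySource : Available
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data)
    {
        this.data = data;
    }

    // Everything is already in memory, so all of it is readable at once.
    size_t available()
    {
        return data.length;
    }

    // Copies at most dst.length bytes and returns the number copied.
    size_t read(ubyte[] dst)
    {
        size_t n = dst.length < data.length ? dst.length : data.length;
        dst[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}
```

An unbuffered file stream implementing the same interface could simply return zero, which still satisfies "less than or equal to".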

Arcane Jill


June 25, 2004
"Arcane Jill"  wrote ...
> Ah - now it's I who was misunderstood. Allow me to clarify. available() must
> return a number which is less than or equal to the number of bytes which may
> be read from a stream ... and this is the important part ... WITHOUT BLOCKING.
> available() MUST return immediately, without causing a thread-switch. It must
> *NOT* return the number of bytes left in a file - unless all of them are
> already buffered.
>
> This is SO important in bits of code which MUST NOT WAIT.
>
> Arcane Jill

May I enquire, Jill, as to why you need such functionality? I'm thinking at the 50,000' level rather than the intimate details of some IO implementation. It's always useful to understand the application.

Secondly, if the IO were always buffered, and you had access to the content thereof (plus the number of readable bytes), would that satisfy the requirement?

- Kris


June 25, 2004
In article <opr94etton5a2sq9@digitalmars.com>, Regan Heath says...

>What do you think of my filters idea? As long as you can snap any number of filters onto streams and onto each other, your data will be transcoded (etc.) from one end to the other, and back again in the other direction.


Regan, your filters are /almost/ the same idea as mango's Readers/Writers. We're pretty much talking the same thing here, only by a different name.

But there is nonetheless a very important difference between the two concepts, which you may have missed. This is that a character sequence is a sequence of 32-bit-wide dchars, whereas a traditional stream is a sequence of 8-bit-wide bytes. So, at some stage, you need a "filter" which converts from ubyte[] to dchar[]. Such filters do not chain, because the output from one will not be the same type as the input to the next. Now, you COULD insist that everything be done on an 8-bit stream (mandating UTF-8 as the format for actual characters), but there is an efficiency issue there. UTF-32 is always going to be faster to process than UTF-8.

Besides which - you don't NEED a chain of filters when transcoding in D, because one end WILL be Unicode, always.

So I'd say the ideal situation would be:

(1) Reader classes which convert ubytes from a stream (of known encoding) into
dchars (Unicode). You'd need one Reader for each encoding standard.

(2) Writer classes which convert dchars (Unicode) into ubytes (of some known
encoding) to be sent to a stream (again, one for each encoding standard)

(3) Filters, as described by you, which convert ubytes into more ubytes, and can do completely arbitrary things.

But I don't think your 8-bit-wide filters should be trying to handle dchars. That's a different job. I think the above would give you maximum flexibility, however, without losing any efficiency. What do you think?
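
To make the type mismatch concrete, here is a rough sketch of a Reader and a Writer for one encoding (UTF-8), written as plain functions. The names readUtf8 and writeUtf8 are hypothetical; the sketch assumes a current Phobos in which std.utf provides decode and encode with the signatures used below:

```d
import std.utf : decode, encode;

// Hypothetical Reader for UTF-8: ubytes in, dchars out.
// decode() throws a UTFException on malformed input.
dchar[] readUtf8(const(ubyte)[] bytes)
{
    auto text = cast(const(char)[]) bytes;
    dchar[] result;
    size_t i = 0;
    while (i < text.length)
        result ~= decode(text, i); // advances i past one code point
    return result;
}

// Hypothetical Writer for UTF-8: dchars in, ubytes out.
ubyte[] writeUtf8(const(dchar)[] chars)
{
    ubyte[] result;
    foreach (dchar c; chars)
    {
        char[4] buf;
        size_t n = encode(buf, c); // number of code units written
        result ~= cast(const(ubyte)[]) buf[0 .. n];
    }
    return result;
}
```

The input and output types differ, so neither function can be chained with itself; a ubyte-to-ubyte filter can sit on either side of them.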

Arcane Jill


June 25, 2004
In article <cbgjnv$105r$1@digitaldaemon.com>, Kris says...

>May I enquire, Jill, as to why you need such functionality? I'm thinking at the 50,000' level rather than the intimate details of some IO implementation. It's always useful to understand the application.

For example, consider a cryptographically secure random number stream. You'd want the ultra-secure version which always blocks until sufficient entropy is available - no problem there - but some folk would also want a non-blocking (less secure) version (like the difference between Unix's /dev/random and /dev/urandom). The non-blocking version would call available() on the entropy stream before trying to collect the entropy, in order to provide a guarantee of non-blocking. If bytes were available, it could read them, and be as secure as possible. If bytes were not available it could re-stir the existing entropy pool, and still return immediately. This sort of thing is absolutely crucial in crypto.
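
The control flow described above could be sketched roughly as follows. EntropySource and refresh are hypothetical names, and the "re-stir" step is a placeholder bit rotation standing in for a real mixing function; it is NOT cryptographically meaningful:

```d
// Hypothetical interface for an entropy stream that can report readiness.
interface EntropySource
{
    size_t available();       // bytes readable without blocking
    size_t read(ubyte[] dst); // returns the number of bytes read
}

// Refreshes the pool without ever blocking on the source.
void refresh(EntropySource src, ubyte[] pool)
{
    size_t ready = src.available();
    if (ready > 0)
    {
        // Take only what is ready, so read() has no reason to wait.
        size_t n = ready < pool.length ? ready : pool.length;
        src.read(pool[0 .. n]);
    }
    else
    {
        // Placeholder re-stir: rotate each byte left by one bit.
        foreach (ref b; pool)
            b = cast(ubyte)((b << 1) | (b >> 7));
    }
}
```

The only point being illustrated is that the decision is made from available() alone, before any read() is attempted.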



>Secondly, if the IO were always buffered, and you had access to the content thereof (plus the number of readable bytes), would that satisfy the requirement?

Not all streams which are able to deliver bytes on demand without waiting necessarily have a buffer. I have a proof-of-concept stream in my in-progress crypto random stuff which simply delivers bytes by calling rand(). Such a stream will never block, and its available() function could simply always return 2, or 128, or any other arbitrary number. It does not, however, have a buffer to return.

Access to the contents of an internal buffer implies a certain implementation. This assumption may not always be correct, or relevant.

available(), by itself, would be enough. Thereafter you could get the "buffer
contents" with a straightforward read().

Arcane Jill


June 25, 2004
"Arcane Jill"  wrote
> Not all streams which are able to deliver bytes on demand without waiting
> necessarily have a buffer. I have a proof-of-concept stream in my in-progress
> crypto random stuff which simply delivers bytes by calling rand(). Such a
> stream will never block, and its available() function could simply always
> return 2, or 128, or any other arbitrary number. It does not, however, have a
> buffer to return.

Right; poor phrasing on my part. In terms of D exposure, would something like an IAvailable interface suffice? If so, what about the equivalent for writing? Is there a similar need to never perform a thread-switch?

- Kris