September 08, 2011
On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob@me.com> wrote:

> On 2011-09-08 13:25, Steven Schveighoffer wrote:
>> On Tue, 06 Sep 2011 17:59:44 -0400, Jonathan M Davis
>>> A new std.xml is already in the works. It'll be range-based, unlike
>>> the Tango
>>> parser. But there's no reason why Phobos shouldn't be able to have a
>>> similarly-fast XML parser. As I understand it, the primary reason that
>>> the
>>> current std.xml is slow is because it uses delegates quite a bit, but I
>>> haven't used it myself, so I don't know all of the details.
>>
>> No, the issue is, and always will be, buffer access. C's FILE * just
>> doesn't provide anything decent. It's the primary motivation for wanting
>> to revamp it. With slicing and copy avoidance (i.e. only read into a
>> buffer, never copy out), we can achieve the same with Phobos, but I
>> think we have to replace C's buffering system (at least for this usage).
>>
>> Tango's I/O libraries use delegates and virtual functions galore. I
>> think too big a stigma is attached to those. The difference between
>> calling a virtual function/delegate and calling a normal function is
>> insignificant; the real saving from not using virtual functions is
>> that it allows inlining.
>>
>> However, in this case, I/O is so diverse that you *need* polymorphism.
>>
>> -Steve
>
> The Tango XML parser doesn't read from a file, it takes the input as a string. The parser isn't affected by I/O at all.

So you have to read the entire file before sending it to the parser?

Isn't that a bit limited?  What if I have a 50MB file? Do I have to read it into a contiguous memory block first?

-Steve
September 08, 2011
On Thu, 08 Sep 2011 15:17:51 +0200, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> I wonder if there's a way to give the option of using a template parameter or using a positional parameter without having two different symbol names.  hm...
>
> openFile!(string modedefault = "r")(string filename, string mode = modedefault) if (isValidOpenMode(modedefault))
> {
>     if(!isValidOpenMode(mode))
>        throw new Exception("invalid file open mode: " ~ mode);
>     ...
> }
>
> Would that work?

Neat! And yes, it certainly does work. I'm still unsure when someone
will actually need to specify that at runtime, but maybe for scripting
languages?

-- 
  Simen
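
The idea quoted above can be sketched in compilable D roughly as follows. This is only an illustration under stated assumptions, not the actual Phobos or openFile implementation: the body of isValidOpenMode and the list of accepted mode strings are hypothetical, and the function simply forwards to std.stdio.File. The template default is checked at compile time by the constraint, and a mode passed as an ordinary argument is checked at run time.

```d
import std.exception : enforce;
import std.stdio : File;

// Hypothetical validity check. It is written as a plain loop so that it
// can run both at compile time (in the template constraint) and at run time.
bool isValidOpenMode(string mode)
{
    foreach (m; ["r", "w", "a", "r+", "w+", "a+"])
        if (m == mode)
            return true;
    return false;
}

// One symbol name, two ways to give the mode:
//   openFile!"w"(name)   -> mode checked at compile time
//   openFile(name, "w")  -> mode checked at run time
File openFile(string modeDefault = "r")(string filename, string mode = modeDefault)
    if (isValidOpenMode(modeDefault))
{
    enforce(isValidOpenMode(mode), "invalid file open mode: " ~ mode);
    return File(filename, mode);
}
```

With this sketch, openFile!"bogus"("f.txt") is rejected by the compiler, while openFile("f.txt", "bogus") compiles and throws before any file is touched.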
September 08, 2011
On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh <WorksOnMyMachine@gmail.com> wrote:

> In the COM-based land of D3D, there is just a number tacked onto the class name.  We are up to version 11 (e.g. ID3D11Device).  It works well and, once you are used to it, is definitely nicer than calling everything New or FunctionEx and being left wondering what to do when you rev the interface again.

In the case of D3D though, D3D itself has a version number. The next version
of std.xml will not be parsing XMLv2.0. When a version 2.0 of the XML spec
shows up, what do we do about std.xml2, which parses version 1.1? And what
do we call the new one? Should std.xml3 parse XMLv2.0?


-- 
  Simen
September 08, 2011
Am 08.09.2011, 18:52 Uhr, schrieb Simen Kjaeraas <simen.kjaras@gmail.com>:

> On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh <WorksOnMyMachine@gmail.com> wrote:
>
>> In the COM-based land of D3D, there is just a number tacked onto the class name.  We are up to version 11 (e.g. ID3D11Device).  It works well and, once you are used to it, is definitely nicer than calling everything New or FunctionEx and being left wondering what to do when you rev the interface again.
>
> In the case of D3D though, D3D itself has a version number. The next version
> of std.xml will not be parsing XMLv2.0. When a version 2.0 of the XML spec
> shows up, what do we do about std.xml2, which parses version 1.1? And what
> do we call the new one? Should std.xml3 parse XMLv2.0?

That is late in the discussion, but a valid point.
September 08, 2011
On 9/8/11 11:11 AM, Simen Kjaeraas wrote:
> On Thu, 08 Sep 2011 15:17:51 +0200, Steven Schveighoffer
> <schveiguy@yahoo.com> wrote:
>
>> I wonder if there's a way to give the option of using a template
>> parameter or using a positional parameter without having two different
>> symbol names. hm...
>>
>> openFile!(string modedefault = "r")(string filename, string mode =
>> modedefault) if (isValidOpenMode(modedefault))
>> {
>> if(!isValidOpenMode(mode))
>> throw new Exception("invalid file open mode: " ~ mode);
>> ...
>> }
>>
>> Would that work?
>
> Neat! And yes, it certainly does work. I'm still unsure when someone
> will actually need to specify that at runtime, but maybe for scripting
> languages?

My opinion: we're spending way too much energy on this. File I/O poses much more difficult problems than choosing the representation of open flags.

Andrei

September 08, 2011
On Thursday, September 08, 2011 07:13:48 Steven Schveighoffer wrote:
> On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob@me.com> wrote:
> > On 2011-09-06 19:39, Steven Schveighoffer wrote:
> >> I like enums in terms of writing code that processes them, but in
> >> terms
> >> of calling functions with them, I mean look at a sample fstream
> >> constructor in C++:
> >> 
> >> fstream ifs("filename.txt", ios_base::in | ios_base::out);
> >> 
> >> vs.
> >> 
> >> File("filename.txt", "r+"); // or "rw"
> >> 
> >> There's just no way you can think "rw" is less descriptive or understandable than ios_base::in | ios_base::out.
> >> 
> >> -Steve
> > 
> > BTW, I think that using:
> > 
> > Mode.read | Mode.write
> > 
> > instead of "rw" follows the same principle as giving variables proper descriptive names instead of just "a" or "b".
> 
> It's not the same.  "a" and "b" do not have any meaning, they are just variable names.  "r" stands for read and "w" stands for write.  It's pretty obvious that they do, especially in the context of opening a file.
> 
> I'd equate it to using i, j, k for index variables -- they are not descriptive, but in context, everyone knows what they mean.
> 
> And in response to the discussion about enum flags not being &ed or |ed together, I emphatically think enums should be used for bitfields. Remember, enum is not just an enumeration, it's a manifest constant.  I see no reason that we should not use the namespace-creation ability of enum to create such constants.  I don't see the downside.

I think that it makes perfect sense to use enums for flags. What I don't think makes sense is giving the variable which holds the flags that enum type, unless _every_ possible combination of flags has its own enum member, so that &ing or |ing enums always results in a valid enum value. I have no gripe with using enums for flags. The problem, IMHO, is using an enum to hold a value which is not a valid value for that enum.

- Jonathan M Davis
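
A minimal sketch of the distinction drawn above, assuming hypothetical names (Mode, OpenFlags, flags) that do not come from Phobos: the enum only supplies the named bit constants, while the combined value is stored in a plain uint, so every combination is a valid value of the variable's type.

```d
// Hypothetical flag names, for illustration only.
enum Mode : uint
{
    read   = 1 << 0,
    write  = 1 << 1,
    append = 1 << 2,
}

// Holds a combination of Mode bits. The field is a plain uint rather than
// a Mode, so a value such as read|write is not an out-of-range enum value.
struct OpenFlags
{
    uint bits;

    // True if the given flag is set.
    bool has(Mode m) const { return (bits & m) != 0; }
}

// Combines any number of Mode values into an OpenFlags.
OpenFlags flags(Mode[] modes...)
{
    uint bits = 0;
    foreach (m; modes)
        bits |= m;
    return OpenFlags(bits);
}
```

A call site then reads flags(Mode.read, Mode.write), which keeps the descriptive names without storing a non-member value in a variable of type Mode.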
September 08, 2011
On 2011-09-08 15:22, Steven Schveighoffer wrote:
> On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob@me.com> wrote:
>> The Tango XML parser doesn't read from a file, it takes the input as a
>> string. The parser isn't affected by I/O at all.
>
> So you have to read the entire file before sending it to the parser?
>
> Isn't that a bit limited? What if I have a 50MB file, I have to read it
> into a contiguous memory block first?
>
> -Steve

I'm just telling how Tango currently works, not how the XML module in Phobos should work. But I guess it might be somewhat limited. 50MB isn't that big to read into memory?

I think it would be nice to be able to do both. If you read the whole file before sending it to the parser you would know it doesn't perform any I/O operations.

-- 
/Jacob Carlborg
September 08, 2011
On 2011-09-08 15:17, Steven Schveighoffer wrote:
> You can if you make it a template parameter. For example, my openFile
> function that I wrote does this (in fact, I needed a template mode
> string because the return type depends on it). The downside is you
> cannot pass a runtime-generated string. I cannot actually think of any
> use cases for that however.
>
> In any case, the existing API does not use a template parameter, and we
> have to try and break as little code as possible.
>
> I wonder if there's a way to give the option of using a template
> parameter or using a positional parameter without having two different
> symbol names. hm...
>
> openFile!(string modedefault = "r")(string filename, string mode =
> modedefault) if (isValidOpenMode(modedefault))
> {
> if(!isValidOpenMode(mode))
> throw new Exception("invalid file open mode: " ~ mode);
> ...
> }
>
> Would that work?
>
> -Steve

That looks nice if it works.

-- 
/Jacob Carlborg
September 08, 2011
On Thursday, September 08, 2011 21:38:43 Jacob Carlborg wrote:
> On 2011-09-08 15:22, Steven Schveighoffer wrote:
> > On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob@me.com> wrote:
> >> The Tango XML parser doesn't read from a file, it takes the input as a string. The parser isn't affected by I/O at all.
> > 
> > So you have to read the entire file before sending it to the parser?
> > 
> > Isn't that a bit limited? What if I have a 50MB file, I have to read it into a contiguous memory block first?
> > 
> > -Steve
> 
> I'm just telling how Tango currently works, not how the XML module in Phobos should work. But I guess it might be somewhat limited. 50MB isn't that big to read into memory?
> 
> I think it would be nice to be able to do both. If you read the whole file before sending it to the parser you would know it doesn't perform any I/O operations.

I expect that the new std.xml will work on ranges of dchar (certainly, if it doesn't, it should), so that it could be used either on a string that holds the entire file or on a stream over the file. If it's tied to reading in the whole file first, that's a design flaw. But I don't know what the current state of the new std.xml is. I don't think that I've seen Tomek around here recently.

- Jonathan M Davis
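
To illustrate what "works on ranges of dchar" buys, here is a toy sketch. It is not the new std.xml and not a real parser; countStartTags is a hypothetical stand-in that only counts start tags. Because it is templated on any input range of dchar, the same code accepts a whole document held in a string or a sequence of chunks stitched together lazily (here simulated with in-memory chunks and std.algorithm's joiner; decoding real file chunks to dchar is left out).

```d
import std.algorithm.iteration : joiner;
import std.range.primitives : ElementType, isInputRange;

// Toy stand-in for a parser: counts '<' characters that begin a start tag
// (that is, a '<' not immediately followed by '/').
size_t countStartTags(R)(R input)
    if (isInputRange!R && is(ElementType!R : dchar))
{
    size_t count = 0;
    bool sawOpen = false;   // previous character was '<'
    foreach (dchar c; input)
    {
        if (sawOpen && c != '/')
            ++count;
        sawOpen = (c == '<');
    }
    return count;
}

unittest
{
    // Whole document in memory.
    assert(countStartTags("<a><b>text</b></a>") == 2);

    // The same document arriving in pieces; a tag may straddle a chunk boundary.
    auto chunks = ["<a><", "b>text</", "b></a>"];
    assert(countStartTags(joiner(chunks)) == 2);
}
```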
September 08, 2011
On Thu, 08 Sep 2011 15:38:43 -0400, Jacob Carlborg <doob@me.com> wrote:

> On 2011-09-08 15:22, Steven Schveighoffer wrote:
>> On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob@me.com> wrote:
>>> The Tango XML parser doesn't read from a file, it takes the input as a
>>> string. The parser isn't affected by I/O at all.
>>
>> So you have to read the entire file before sending it to the parser?
>>
>> Isn't that a bit limited? What if I have a 50MB file, I have to read it
>> into a contiguous memory block first?
>>
>> -Steve
>
> I'm just telling how Tango currently works, not how the XML module in Phobos should work. But I guess it might be somewhat limited. 50MB isn't that big to read into memory?

Um... yeah, it is :)  I have 1 GB of memory, my system starts thrashing with an app that consumes 750MB.  So that's like 13 xml files read?  Especially if I want to use DOM, I have to keep them around...

Not to mention that the GC has to allocate a contiguous space for it.  So even if I have 100MB of garbage space, maybe none of it is usable, I still have to allocate a new block.  I'm just surprised there isn't at least an option for a stream-based xml parser in Tango.

One thing this does change, though: I always assumed it was Tango's I/O that accounted for its XML superiority.  I wonder, does anyone count reading the file in any of the benchmarks?

I still think we can come close without having to pre-read an entire file.

> I think it would be nice to be able to do both. If you read the whole file before sending it to the parser you would know it doesn't perform any I/O operations.

I totally agree.  I think there are ways to abstract the functionality for both memory-based and device-based I/O into one interface (part of the reason for the revamp).

-Steve
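
One possible shape for such a shared interface, as a rough sketch only: the names ByteSource, MemorySource, FileSource and totalBytes are hypothetical and are not taken from the proposed Phobos revamp or from Tango. A consumer written against the interface does not need to know whether the bytes come from memory or from a device.

```d
import std.stdio : File;

// Common interface for memory-based and device-based input.
interface ByteSource
{
    // Fills buf with up to buf.length bytes and returns how many were
    // written; returns 0 at end of input.
    size_t read(ubyte[] buf);
}

// Serves bytes from an in-memory block; performs no I/O.
final class MemorySource : ByteSource
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    size_t read(ubyte[] buf)
    {
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];    // advance by re-slicing, no reallocation
        return n;
    }
}

// Serves bytes from an open file via std.stdio.File.rawRead.
final class FileSource : ByteSource
{
    private File file;

    this(File file) { this.file = file; }

    size_t read(ubyte[] buf)
    {
        if (buf.length == 0)
            return 0;
        return file.rawRead(buf).length;
    }
}

// Example consumer: drains a source and reports how many bytes it held.
size_t totalBytes(ByteSource src)
{
    ubyte[4096] buf;
    size_t total = 0;
    for (size_t n; (n = src.read(buf[])) != 0; )
        total += n;
    return total;
}
```

The virtual call per buffer refill is the kind of polymorphism discussed earlier in the thread; its cost is paid once per chunk rather than once per character.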