std.stdio overhaul by Steve Schveighoffer (page 15) - D Programming Language Discussion Forum

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Timon Gehr
in reply to notna

On 09/06/2011 09:36 PM, notna wrote:
> Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
> mixing something...
>
> How about a D binding for http://www.xmlsoft.org/ ?
>
> In other words, taking the "curl or sqlite3 path", something like
> /etc/c/xml2

That is about 4 times slower than the Tango XML parser:

http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/


>
> On 06.09.2011 19:54, Walter Bright wrote:
>> On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
>>> Let's leave the likes of std.xml and std.json in peace, then pick a
>>> naming convention for the new ones and create whole new modules
>>> replacing them.
>>
>> std.xml2
>>
>> will do fine.
>

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Paul D. Anderson
in reply to Mafi

Mafi wrote:

> > Along these same lines I'm wondering why not simply call this new module
> > std.io rather than use the existing name std.stdio?
> > It'd avoid the code-breaking issue and help reflect that this new
> > module isn't based around C's stdio FILE (at least that's what I
> > gather).  Also, the code is written from scratch so that's another
> > reason for why I don't think it should have the same name.  The only
> > reason I can think of is if it provided significant improvements over
> > the existing std.stdio without causing massive breakage.
> >
> > Regards,
> > Brad Anderson
> 
> I think this is a good idea. I think std.io sounds and feels much better.
> 
> Mafi

I think this is a terrific suggestion.

Paul

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Marco Leise
in reply to Timon Gehr

On 06.09.2011 at 22:28, Timon Gehr <timon.gehr@gmx.ch> wrote:

> On 09/06/2011 09:36 PM, notna wrote:
>> Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
>> mixing something...
>>
>> How about a D binding for http://www.xmlsoft.org/ ?
>>
>> In other words, taking the "curl or sqlite3 path", something like
>> /etc/c/xml2
>
> That is about 4 times slower than the Tango XML parser:
>
> http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Jonathan M Davis
in reply to Timon Gehr

On Tuesday, September 06, 2011 22:28:05 Timon Gehr wrote:
> On 09/06/2011 09:36 PM, notna wrote:
> > Sorry upfront, I didn't read this whole thread, so maybe I'm missing or mixing something...
> > 
> > How about a D binding for http://www.xmlsoft.org/ ?
> > 
> > In other words, taking the "curl or sqlite3 path", something like /etc/c/xml2
> 
> That is about 4 times slower than the Tango XML parser:

Yeah. Thanks to array slicing, parsing is actually one of the areas where D libraries should generally be able to beat C/C++ libraries in terms of speed.

That being said, creating bindings and wrappers for existing libraries is a great way to increase Phobos' functionality without reinventing the wheel in many cases. But there are definitely cases where redoing something in D would actually be much better. It all depends on what you're trying to do and what libraries already exist in C or C++.

- Jonathan M Davis

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Jonathan M Davis
in reply to Marco Leise

On Tuesday, September 06, 2011 23:51:48 Marco Leise wrote:
> On 06.09.2011 at 22:28, Timon Gehr <timon.gehr@gmx.ch> wrote:
> > On 09/06/2011 09:36 PM, notna wrote:
> >> Sorry upfront, I didn't read this whole thread, so maybe I'm missing or mixing something...
> >> 
> >> How about a D binding for http://www.xmlsoft.org/ ?
> >> 
> >> In other words, taking the "curl or sqlite3 path", something like /etc/c/xml2
> > 
> > That is about 4 times slower than the Tango XML parser:
> > 
> > http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/
> You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.

A new std.xml is already in the works. It'll be range-based, unlike the Tango parser. But there's no reason why Phobos shouldn't be able to have a similarly fast XML parser. As I understand it, the primary reason that the current std.xml is slow is that it uses delegates quite a bit, but I haven't used it myself, so I don't know all of the details.

- Jonathan M Davis

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Sean Kelly
in reply to Marco Leise

On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:

> On 06.09.2011 at 22:28, Timon Gehr <timon.gehr@gmx.ch> wrote:
> 
>> On 09/06/2011 09:36 PM, notna wrote:
>>> Sorry upfront, I didn't read this whole thread, so maybe I'm missing or mixing something...
>>> 
>>> How about a D binding for http://www.xmlsoft.org/ ?
>>> 
>>> In other words, taking the "curl or sqlite3 path", something like /etc/c/xml2
>> 
>> That is about 4 times slower than the Tango XML parser:
>> 
>> http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/
> 
> You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.

That will never happen.  Though on a positive note, a major reason the Tango parser is so fast is that there's no copying or translation of the underlying data.  Attributes are passed to the user as-is via a slice of the input range.  Most parsers in other languages simply don't work this way.
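
To make the slicing point concrete, here is a minimal sketch in D. It is not Tango's code: `attributeSlice` is a hypothetical helper invented for illustration, and it assumes double-quoted values with a single space before the attribute name. What it shows is that the returned value is a view into the input buffer, so nothing is copied or decoded.

```d
import std.string : indexOf;

/// Hypothetical helper (not Tango's API): returns the value of attribute
/// `name` inside the tag text `tag` as a slice of `tag`, or null if absent.
/// Assumes the form ` name="value"`; no entity decoding is attempted.
const(char)[] attributeSlice(const(char)[] tag, const(char)[] name)
{
    size_t pos = 0;
    while (pos < tag.length)
    {
        immutable hit = tag[pos .. $].indexOf(name);
        if (hit < 0)
            return null;                              // name not found
        immutable start = pos + hit;                  // index of the name in `tag`
        immutable valueStart = start + name.length + 2; // skip over `="`
        immutable precededBySpace = start > 0 && tag[start - 1] == ' ';
        if (precededBySpace && valueStart <= tag.length
            && tag[start + name.length .. valueStart] == `="`)
        {
            immutable len = tag[valueStart .. $].indexOf('"');
            if (len < 0)
                return null;                          // unterminated value
            return tag[valueStart .. valueStart + len]; // a view, not a copy
        }
        pos = start + 1;                              // false match, keep looking
    }
    return null;
}

void main()
{
    import std.stdio : writeln;

    string doc = `<book id="42" title="Tom &amp; Jerry">`;
    auto title = attributeSlice(doc, "title");
    writeln(title); // prints the raw text: Tom &amp; Jerry

    // The slice points into `doc` itself; no new memory was allocated.
    assert(title.ptr >= doc.ptr && title.ptr + title.length <= doc.ptr + doc.length);
}
```

Because the slice carries its own length, the caller gets a usable string without the parser having to terminate or duplicate it.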

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by bearophile
in reply to Paul D. Anderson

Paul D. Anderson:

> I think this is a terrific suggestion.

I suggested std.io some time ago, but someone doesn't like it: http://d.puremagic.com/issues/show_bug.cgi?id=4718

Bye,
bearophile

September 06, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Jonathan M Davis
in reply to bearophile

On Tuesday, September 06, 2011 18:48:24 bearophile wrote:
> Paul D. Anderson:
> > I think this is a terrific suggestion.
> 
> I suggested std.io some time ago, but someone doesn't like it: http://d.puremagic.com/issues/show_bug.cgi?id=4718

It's not enough of an improvement to rename std.stdio to std.io just to rename it. However, if Steven's ultimate changes are different enough that a separate module is needed for a clean migration path, and those changes do get accepted into Phobos, then naming the new module std.io makes good sense.

- Jonathan M Davis

September 07, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Marco Leise
in reply to Sean Kelly

On 07.09.2011 at 00:23, Sean Kelly <sean@invisibleduck.org> wrote:

> On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:
>
>> On 06.09.2011 at 22:28, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>
>>> On 09/06/2011 09:36 PM, notna wrote:
>>>> Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
>>>> mixing something...
>>>>
>>>> How about a D binding for http://www.xmlsoft.org/ ?
>>>>
>>>> In other words, taking the "curl or sqlite3 path", something like
>>>> /etc/c/xml2
>>>
>>> That is about 4 times slower than the Tango XML parser:
>>>
>>> http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/
>>
>> You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.
>
> That will never happen.  Though on a positive note, a major reason the Tango parser is so fast is that there's no copying or translation of the underlying data.  Attributes are passed to the user as-is via a slice of the input range.  Most parsers in other languages simply don't work this way.

So in the benchmark, whitespace is not collapsed and entities like &amp; are not converted?

September 07, 2011

Re: std.stdio overhaul by Steve Schveighoffer

Posted by Sean Kelly
in reply to Marco Leise

On Sep 6, 2011, at 6:49 PM, Marco Leise wrote:

> On 07.09.2011 at 00:23, Sean Kelly <sean@invisibleduck.org> wrote:
> 
>> On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:
>> 
>>> On 06.09.2011 at 22:28, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>> 
>>>> On 09/06/2011 09:36 PM, notna wrote:
>>>>> Sorry upfront, I didn't read this whole thread, so maybe I'm missing or mixing something...
>>>>> 
>>>>> How about a D binding for http://www.xmlsoft.org/ ?
>>>>> 
>>>>> In other words, taking the "curl or sqlite3 path", something like /etc/c/xml2
>>>> 
>>>> That is about 4 times slower than the Tango XML parser:
>>>> 
>>>> http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/
>>> 
>>> You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.
>> 
>> That will never happen.  Though on a positive note, a major reason the Tango parser is so fast is that there's no copying or translation of the underlying data.  Attributes are passed to the user as-is via a slice of the input range.  Most parsers in other languages simply don't work this way.
> 
> So in the benchmark, whitespace is not collapsed and entities like &amp; are not converted?

I don't believe so.  That's expected to be done by the user if he cares about decoding the field.  Compare this to the Xerces (Apache) XML parser that passes in all attributes as wide chars regardless of the input format and you can see why parsing XML in D can be so fast: passing values via array slicing and having Unicode as the native character format.  If the input text is UTF-8 you use XmlParser!char, if it's UTF-16 you use XmlParser!wchar, etc.  I'm actually surprised that more C/C++ parsers don't work this way.
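
A rough sketch of that design, with the caveat that none of these names come from Tango: `SliceScanner` and `decodeAmp` are made up for illustration. The scanner is parameterized on the code-unit type, so UTF-8 and UTF-16 input are each handled natively, and entity decoding is a separate step the caller runs only when needed.

```d
import std.array : replace;

/// Hypothetical stand-in for a parser parameterized on code-unit type,
/// in the spirit of XmlParser!char / XmlParser!wchar. Not Tango's API.
struct SliceScanner(Char)
{
    const(Char)[] input;

    /// Returns the text between the first '>' and the next '<' as a slice
    /// of `input`, or null if no '>' is found. Nothing is transcoded.
    const(Char)[] firstText()
    {
        size_t open = 0;
        while (open < input.length && input[open] != '>')
            ++open;
        if (open == input.length)
            return null;
        size_t close = open + 1;
        while (close < input.length && input[close] != '<')
            ++close;
        return input[open + 1 .. close];
    }
}

/// Optional decoding step left to the caller. Only `&amp;` is handled
/// here; a real decoder would cover the other entities as well.
string decodeAmp(const(char)[] raw)
{
    return raw.replace("&amp;", "&").idup;
}

void main()
{
    auto narrow = SliceScanner!char("<a>x &amp; y</a>");   // UTF-8 input
    auto wide = SliceScanner!wchar("<a>x &amp; y</a>"w);   // UTF-16 input

    assert(narrow.firstText() == "x &amp; y");   // raw slice, still encoded
    assert(wide.firstText() == "x &amp; y"w);    // same logic, wchar units
    assert(decodeAmp(narrow.firstText()) == "x & y"); // decoded only on request
}
```

The cost of decoding is paid only for the fields the caller actually asks to decode, which is the trade-off described above.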
