August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joakim | And imagine someone forced to use XML who reads this answer from the community :p
std.xml is a must, no doubt.
2013/8/29 Joakim <joakim@airpost.net>
> On Thursday, 29 August 2013 at 07:47:35 UTC, Jonathan M Davis wrote:
>
>> There are several D XML libraries floating around, but no one has taken the
>> time to get any of them prepared for the Phobos review queue, and I suspect
>> that very few of them are range-based like the Phobos XML solution needs to
>> be, but I don't know.
>>
> I think it's great that there's no std.xml, as it implies that nobody using D would use a dumb tech like XML. Let's keep it that way. :)
>
|
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joakim | On Thursday, 29 August 2013 at 09:24:31 UTC, Joakim wrote:
> I think it's great that there's no std.xml, as it implies that nobody using D would use a dumb tech like XML. Let's keep it that way. :)
No way around XML. A must-have, as has been said in this thread. But what would you suggest as a better alternative to XML? It might be worth creating modules for alternatives too (like JSON).
|
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 2013-08-29 11:23, Jonathan M Davis wrote:
> IIRC, everything in XML is
> ASCII anyway, with stuff like HTML codes to indicate Unicode characters. And if
> that's the case, avoiding unnecessary decoding is trivial when operating on
> strings.
What! I hardly believe that. That might be the case for HTML but I don't think it is for XML. There are many file formats that are based on XML. I don't think all those use HTML codes.
This is what W3 Schools says:
"XML documents can contain non ASCII characters, like Norwegian æ ø å , or French ê è é.
To avoid errors, specify the XML encoding, or save XML files as Unicode.".
-- /Jacob Carlborg
|
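Editorial note: the point about non-ASCII characters can be checked with a few lines of code. The sketch below is Python rather than D, purely as a language-neutral illustration, and the sample word is an assumption chosen for the example. It parses the same element twice: once as raw UTF-8 bytes with an encoding declaration, and once as ASCII-only text using numeric character references.

```python
import xml.etree.ElementTree as ET

# Variant 1: non-ASCII characters stored directly as UTF-8 bytes,
# with the encoding named in the XML declaration.
raw_utf8 = '<?xml version="1.0" encoding="UTF-8"?><word>blåbærsyltetøy</word>'.encode("utf-8")

# Variant 2: the same text written with ASCII-only numeric character
# references (229 = å, 230 = æ, 248 = ø).
ascii_refs = b'<?xml version="1.0"?><word>bl&#229;b&#230;rsyltet&#248;y</word>'

text_from_utf8 = ET.fromstring(raw_utf8).text
text_from_refs = ET.fromstring(ascii_refs).text

# Both documents are well-formed and carry the same character data.
print(text_from_utf8 == text_from_refs)
```

Both forms are accepted, so a parser cannot assume its input is ASCII-only; character references are an option, not a requirement.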
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob Carlborg | On Thursday, 29 August 2013 at 13:20:40 UTC, Jacob Carlborg wrote:
> On 2013-08-29 11:23, Jonathan M Davis wrote:
>
>> IIRC, everything in XML is
>> ASCII anyway, with stuff like HTML codes to indicate Unicode characters. And if
>> that's the case, avoiding unnecessary decoding is trivial when operating on
>> strings.
>
> What! I hardly believe that. That might be the case for HTML but I don't think it is for XML. There are many file formats that are based on XML. I don't think all those use HTML codes.
>
> This is what W3 Schools says:
>
> "XML documents can contain non ASCII characters, like Norwegian æ ø å , or French ê è é.
>
> To avoid errors, specify the XML encoding, or save XML files as Unicode.".
And while we're at it, what about YAML? It's a subset of JSON which means the new json.d module will handle it, I suppose.
|
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to Chris | On Thu, Aug 29, 2013 at 01:14:19PM +0200, Chris wrote:
> On Thursday, 29 August 2013 at 09:24:31 UTC, Joakim wrote:
> >I think it's great that there's no std.xml, as it implies that nobody using D would use a dumb tech like XML. Let's keep it that way. :)
>
> No way around XML. A must-have, as has been said in this thread. But what would you suggest as a better alternative to XML? It might be worth creating modules for alternatives too (like JSON).
While I do agree that in the current state of affairs, XML support is a must, I also think that XML is just way overengineered, IMNSHO. It adds too much overhead and therefore requires compression to be efficient, and it is needlessly complex for what it does (tag attributes, all the different cases of CDATA / non-CDATA, etc.). This complexity makes it impractical to edit by hand, relegating it to machine reading/writing only, which then begs the question of why a binary format wasn't chosen instead. And don't get me started on DTDs, which are incredibly convoluted and can't even express certain things that one might want to express in an automatic validation system. Or that 17-headed monster called XSLT, which, thankfully, is fading into the obscurity of time.
JSON is a nicer, simpler alternative, though there may be limitations with it that I don't know about. Word on the street is that many people are abandoning XML for JSON due to lower maintenance overhead (and this includes one of my friends, who was a hardcore XML fanatic -- I was frankly quite surprised when he told me he was considering migrating to JSON, since the original reason he chose XML was so that his data would be future-proof... well, so much for *that*).
But all of this is irrelevant... it doesn't alleviate the need for a std.xml replacement, since we have to live in the real world where XML exists and must be supported. :)
T
-- Life would be easier if I had the source code. -- YHL
|
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to w0rp | On 8/29/13 12:25 AM, w0rp wrote:
> Hello everybody. I've been wondering, what are the current plans to
> replace std.xml? I'd like to help with the effort to get a final XML
> library in phobos. So, I have a few questions.
>
> First, and most importantly, what do we expect out of a D XML library?
> I'd really like to have a discussion of the form, "Here is exactly the
> interface the structs/classes need to implement, go forth and
> implement." The general idea in my mind is "something SAX-like, with
> something a little DOM-like." I'm aware that std.xml has some issues
> supporting different encodings, so obviously that's included.
>
> Second, is there an existing library that has gotten close to meeting
> whatever we need for the first point? If so, how far away is it from
> being able to meet all of the requirements and become the standard
> library version?
I don't know much about XML, but I noticed there are a few popular libraries and models. I'd expect a replacement for std.xml would choose one of these popular models that is most appropriate for D.
Andrei
|
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to Chris | On Thursday, 29 August 2013 at 11:14:21 UTC, Chris wrote:
> On Thursday, 29 August 2013 at 09:24:31 UTC, Joakim wrote:
>> I think it's great that there's no std.xml, as it implies that nobody using D would use a dumb tech like XML. Let's keep it that way. :)
>
> No way around XML. A must-have, as has been said in this thread. But what would you suggest as a better alternative to XML? It might be worth creating modules for alternatives too (like JSON).
We've had a std.json in Phobos for years now. I think it'd be great for Phobos to nudge users in better directions, by having a std.json but no std.xml. There'll always be outside libraries to process XML for those who can't go without; perhaps a list of XML libraries can be added to the wiki:
http://wiki.dlang.org/Libraries_and_Frameworks
I see no use for XML, as it's a horrible solution in search of a problem, but those who must use it can always get it outside Phobos. Just a suggestion.
|
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Thursday, 29 August 2013 at 15:43:36 UTC, H. S. Teoh wrote:
> While I do agree that in the current state of affairs, XML support is a
> must, I also think that XML is just way overengineered, IMNSHO. It
> adds too much overhead and therefore requires compression to be
> efficient, and it is needlessly complex for what it does (tag
> attributes, all the different cases of CDATA / non-CDATA, etc.). This
> complexity makes it impractical to edit by hand, relegating it to
> machine reading/writing only, which then begs the question of why a
> binary format wasn't chosen instead. And don't get me started on DTDs,
> which are incredibly convoluted and can't even express certain things
> that one might want to express in an automatic validation system. Or
> that 17-headed monster called XSLT, which, thankfully, is fading into
> the obscurity of time.
>
> JSON is a nicer, simpler alternative, though there may be limitations
> with it that I don't know about. Word on the street is that many people
> are abandoning XML for JSON due to lower maintenance overhead (and this
> includes one of my friends, who was a hardcore XML fanatic -- I was
> frankly quite surprised when he told me he was considering migrating to
> JSON, since the original reason he chose XML was so that his data would be
> future-proof... well, so much for *that*).
>
> But all of this is irrelevant... it doesn't alleviate the need for a
> std.xml replacement, since we have to live in the real world where XML
> exists and must be supported. :)
>
>
> T
I am moving away from XML too. Wanted to use it for a private project. But I soon realized the madness of it, especially when there are people involved who are not programmers and have no clue whatsoever about markup languages, data storage formats etc. I think JSON and YAML are good candidates for the private project which revolves around collecting words and phrases and archiving them. I don't know exactly what I will use, but XML definitely won't get the job.
DTD sounds too much like DDT!
|
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 2013-08-29 07:47:17 +0000, Jonathan M Davis <jmdavisProg@gmx.com> said:
> On Thursday, August 29, 2013 09:25:35 w0rp wrote:
>> The general idea in my mind is
>> "something SAX-like, with something a little DOM-like."
>
> What I personally think would be best is to have multiple parsers. First you
> have something STAX-like (or maybe even lower level - I don't recall exactly
> what STAX gives you at the moment) that basically tokenizes the XML and
> returns a range of that. Then SAX and DOM parsers can be built on top of that.
> That way, you get the fastest parser possible as well as higher level, more
> functional parsers.
>
> But two of the biggest points of the design are that it's going to have to be
> range-based, and it's going to need to be able to take full advantage of
> slices (when used with any strings or random-access ranges) in order to avoid
> copying any of the data. That's the key design point which will allow a D
> parser to be extremely fast in comparison to parsers in most other languages.
I wrote something like that a while ago. It only accepted arrays as input because of the lack of a "buffered range" concept that'd allow lookahead and efficient slicing from any kind of range, but that could be retrofitted in.
It implements pretty much all of the XML spec, except for documents having an internal subset (which is something a little arcane). It does not deal with namespaces either; I feel like that should be done a layer above, but I'm not entirely sure.
Lower-level parser: http://michelf.ca/docs/d/mfr/xmltok.html
Higher-level parser built on the first one: http://michelf.ca/docs/d/mfr/xml.html
The code: http://michelf.ca/docs/d/mfr-xml-2010-10-19.zip
That code hasn't been compiled in a while, but it used to work very well for me. Feel free to use it as a starting point.
-- Michel Fortin michel.fortin@michelf.ca http://michelf.ca
|
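Editorial note: the layering discussed in this post (a low-level tokenizer, with SAX-style and DOM-style parsers built on top) can be sketched roughly as follows. This is a toy in Python, not D, and not the linked library: every name here (`Token`, `tokenize`, `sax_parse`, `dom_parse`) is hypothetical, and the tokenizer only understands bare start tags, end tags, and text (no attributes, CDATA, comments, or entities). Tokens carry offsets into the source instead of copies, standing in for D slices; note that Python string slicing copies, so the zero-copy benefit itself is not reproduced.

```python
from typing import Callable, Iterator, NamedTuple


class Token(NamedTuple):
    kind: str   # "start", "end", or "text"
    start: int  # offsets into the source, standing in for a D slice
    end: int


def tokenize(src: str) -> Iterator[Token]:
    """Lowest layer: lazily yield tokens, one at a time, like a range."""
    pos = 0
    while pos < len(src):
        if src[pos] == "<":
            close = src.index(">", pos)  # raises ValueError on an unclosed tag
            if src[pos + 1] == "/":
                yield Token("end", pos + 2, close)
            else:
                yield Token("start", pos + 1, close)
            pos = close + 1
        else:
            nxt = src.find("<", pos)
            if nxt == -1:
                nxt = len(src)
            yield Token("text", pos, nxt)
            pos = nxt


def sax_parse(src: str,
              on_start: Callable[[str], None],
              on_end: Callable[[str], None],
              on_text: Callable[[str], None]) -> None:
    """Middle layer: SAX-style callbacks driven by the tokenizer."""
    handlers = {"start": on_start, "end": on_end, "text": on_text}
    for tok in tokenize(src):
        handlers[tok.kind](src[tok.start:tok.end])


def dom_parse(src: str) -> dict:
    """Top layer: build a tree by listening to the SAX-style events."""
    root = {"name": None, "children": []}
    stack = [root]

    def on_start(name: str) -> None:
        node = {"name": name, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)

    def on_end(name: str) -> None:
        stack.pop()

    def on_text(text: str) -> None:
        stack[-1]["children"].append(text)

    sax_parse(src, on_start, on_end, on_text)
    return root["children"][0]


tree = dom_parse("<a><b>hi</b>there</a>")
print(tree["name"], len(tree["children"]))
```

A caller who needs speed can consume `tokenize` directly, while one who wants convenience can use `dom_parse`; that separation is the design being proposed in the thread.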
August 29, 2013 Re: Replacing std.xml | ||||
---|---|---|---|---|
| ||||
Posted in reply to H. S. Teoh | On Thursday, 29 August 2013 at 15:43:36 UTC, H. S. Teoh wrote:
> JSON is a nicer, simpler alternative, though there may be limitations
> with it that I don't know about.
The main disadvantage of JSON vs XML is lack of validation. Whenever I write code that works with JSON (or any data format), I have to write extra code to perform validation. If there was a validation add-on for JSON, you could nix XML for good.
Regards
Jason
|
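Editorial note: the "extra code" mentioned in the last post typically looks like the hand-written checks below. This is a minimal sketch in Python rather than D; the function `validate_entry` and the `name`/`tags` fields are assumptions invented for the example, not part of any library.

```python
import json


def validate_entry(value) -> list:
    """Return a list of problems; an empty list means the value passed."""
    if not isinstance(value, dict):
        return ["top-level value must be an object"]
    problems = []
    if not isinstance(value.get("name"), str):
        problems.append("'name' must be a string")
    tags = value.get("tags")
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        problems.append("'tags' must be an array of strings")
    return problems


# The parser accepts both documents; only the hand-written checks
# distinguish the well-shaped one from the malformed one.
good = json.loads('{"name": "std.xml", "tags": ["phobos", "xml"]}')
bad = json.loads('{"name": 42, "tags": "phobos"}')

print(validate_entry(good))
print(validate_entry(bad))
```

Every new field means another hand-written check, which is the maintenance cost that a declarative schema, such as an XML DTD, is meant to avoid.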
Copyright © 1999-2021 by the D Language Foundation