On Thursday, 25 August 2022 at 19:41:19 UTC, solidstate1991 wrote:
>I took a look at experimental.xml. According to its tests, its biggest issue is that it accepts malformed documents. I'll attempt to reverse-engineer the code, then add the necessary checks to reject the malformed documents. Since it has multiple options for allocators (stdx-allocator), it'll be a bit of a challenge, but at worst I can strip that function and replace it with GC only.
So work has begun here: https://github.com/ZILtoid1991/experimental.xml
Things I've done so far:
- Stripped the allocators and the custom error handling functions. Not many people use allocators anyway, they just complicate the project, and GC is otherwise the best option for anything that builds a complex tree structure. With those gone, I can just use exceptions for error handling, which can be toggled with a flag: turning it off will allow parsing badly formed XML documents, and in theory even SGML.
- Simplified a lot of things in general, with array slicing and appending.
- Enabled character escaping, which led me into the DTD hellhole.
- Enabled checking for bad characters in names and texts.
- Started working on processing XML declarations (important for setting the version and checking for the correct encoding) and the DTD.
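
To illustrate the name checking and the error flag from the list above, here is a minimal sketch. It is not the code from the repository: `XMLException`, `checkName` and the `strict` parameter are made-up names for illustration only. The character ranges follow the `NameStartChar` and `NameChar` productions of the XML 1.0 specification (fifth edition).

```d
/// Hypothetical exception type, not taken from the library.
class XMLException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__) pure nothrow @safe
    {
        super(msg, file, line);
    }
}

/// True if `c` may start an XML 1.0 name (NameStartChar production).
bool isNameStartChar(dchar c) pure nothrow @nogc @safe
{
    return c == ':' || c == '_'
        || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
        || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
        || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
        || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
        || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
        || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
}

/// True if `c` may appear after the first character of a name (NameChar production).
bool isNameChar(dchar c) pure nothrow @nogc @safe
{
    return isNameStartChar(c) || c == '-' || c == '.'
        || (c >= '0' && c <= '9') || c == 0xB7
        || (c >= 0x300 && c <= 0x36F) || (c >= 0x203F && c <= 0x2040);
}

/// Checks a name. With `strict` set, a bad name throws; otherwise the
/// function only reports the result so a lenient parser can carry on.
bool checkName(string name, bool strict = true)
{
    bool ok = name.length != 0;
    bool first = true;
    foreach (dchar c; name) // decodes the UTF-8 string into code points
    {
        if (!(first ? isNameStartChar(c) : isNameChar(c)))
        {
            ok = false;
            break;
        }
        first = false;
    }
    if (!ok && strict)
        throw new XMLException("Invalid XML name: " ~ name);
    return ok;
}

void main()
{
    import std.exception : assertThrown;

    assert(checkName("svg:rect"));
    assert(!checkName("1abc", false));            // lenient: reported, not thrown
    assertThrown!XMLException(checkName("1abc")); // strict: rejected
}
```

With the flag off, the caller decides what to do with a bad name, which is roughly how a parser could keep going on badly formed input.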
I know that removing the allocators might rule my project out of inclusion in Phobos, but even then I can just release it as a regular dub library. Soon I'll be renaming it to newXML or something similar, while keeping credit to its previous authors.