D:YAML 0.3 released
Kiith-Sa
November 16, 2011
I've released D:YAML 0.3. This release brings some API improvements and many optimizations that drastically improve parsing speed and decrease memory usage. There are also many bugfixes and more examples, and both the API documentation and the tutorials have seen various improvements.

D:YAML is a YAML parser and emitter library for D.

The API has been streamlined. Constructor in particular is now easier to use for adding support for custom YAML data types; Constructor API compatibility has been broken as a result. Another breaking change is the removal of the Node.getToVar method, which turned out to be a premature optimization. Node.get can now be used in a shorter form, Node.as, which might eventually replace it (in the 1.0 release).

Nodes now preserve their styles between loading and dumping. There is no way to access those styles, though (as it should be according to the specification).

The focus of this release was optimization: greatly reducing parsing/emitting time and memory usage (in particular, decreasing garbage collector usage). There are no comparative benchmarks at the moment, but parsing time seems to be reduced to about 10% of what it was before, and dumping time to about 50%.

Error messages, API documentation and tutorials were improved as well. There are also new example and benchmark applications. Lastly, there were many bugfixes. See CHANGES.txt in the source package for detailed information.


As before, it should be noted that the D:YAML API is a work in progress, and there WILL be more breaking changes (although most of the API should now be in place).

GitHub: https://github.com/kiith-sa/D-YAML
Docs  : dyaml.alwaysdata.net/docs

You can get D:YAML 0.3 here: https://github.com/kiith-sa/D-YAML/downloads

November 17, 2011
On 16.11.2011 21:15, Kiith-Sa wrote:
...
> GitHub: https://github.com/kiith-sa/D-YAML
> Docs  : dyaml.alwaysdata.net/docs
>
> You can get D:YAML 0.3 here: https://github.com/kiith-sa/D-YAML/downloads
>

Great, I've been looking into YAML lately.  Would be interesting to see how the speed of your library compares to Tango's XML parser.  Maybe I'll do some benchmarking.

I think your API could be simplified in some places.  I rewrote one of your examples:

---
bool constructMyStructScalar(ref Node node, out MyStruct result, out string customError)
{
    auto parts = node.as!string().split(":");

    try
    {
        result = MyStruct(to!int(parts[0]), to!int(parts[1]), to!int(parts[2]));
        return true;
    }
    catch(Exception e)
    {
        return false;
    }
}
---

If the value is invalid, you just return false and let the library take care of the rest.  If you want to give detailed info about the error, assign to customError.  The code calling this function would throw an exception that contains the standard line and column info, plus your custom message.
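
A minimal sketch of what the calling side could look like under this scheme. Node, Mark, YAMLException and construct below are simplified stand-ins invented for the example, not the actual D:YAML types, and the message format is an assumption. The length check that sets customError is also added only for illustration.

```d
import std.array : split;
import std.conv : to;
import std.format : format;

// Hypothetical stand-ins for illustration; these are not the D:YAML types.
struct Mark { uint line; uint column; }

struct Node
{
    string value;
    Mark mark;

    // Only the string case is modelled here.
    T as(T)() if (is(T == string)) { return value; }
}

struct MyStruct { int x, y, z; }

class YAMLException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// User-written handler: reports failure through the return value only.
bool constructMyStructScalar(ref Node node, out MyStruct result, out string customError)
{
    auto parts = node.as!string().split(":");
    if (parts.length != 3)
    {
        customError = "expected three ':'-separated integers";
        return false;
    }

    try
    {
        result = MyStruct(to!int(parts[0]), to!int(parts[1]), to!int(parts[2]));
        return true;
    }
    catch (Exception e)
    {
        return false;
    }
}

// Library side (assumed): turns a false return into an exception that
// carries the position, plus the custom message if one was assigned.
MyStruct construct(ref Node node)
{
    MyStruct result;
    string customError;
    if (constructMyStructScalar(node, result, customError))
        return result;

    auto msg = format("Unable to construct value at line %s, column %s",
                      node.mark.line, node.mark.column);
    if (customError.length > 0)
        msg ~= ": " ~ customError;
    throw new YAMLException(msg);
}
```

The handler itself never constructs or throws an exception; all of that lives in one place on the calling side.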
November 17, 2011
torhu wrote:

> On 16.11.2011 21:15, Kiith-Sa wrote:
> ...
>> GitHub: https://github.com/kiith-sa/D-YAML
>> Docs  : dyaml.alwaysdata.net/docs
>>
>> You can get D:YAML 0.3 here: https://github.com/kiith-sa/D-YAML/downloads
>>
> 
> Great, I've been looking into YAML lately.  Would be interesting to see how the speed of your library compares to Tango's XML parser.  Maybe I'll do some benchmarking.
> 
> I think your API could be simplified in some places.  I rewrote one of your examples:
> 
> ---
> bool constructMyStructScalar(ref Node node, out MyStruct result, out string customError)
> {
>      auto parts = node.as!string().split(":");
> 
>      try
>      {
>          result = MyStruct(to!int(parts[0]), to!int(parts[1]), to!int(parts[2]));
>          return true;
>      }
>      catch(Exception e)
>      {
>          return false;
>      }
> }
> ---
> 
> If the value is invalid, you just return false and let the library take care of the rest.  If you want to give detailed info about the error, assign to customError.  The code calling this function would throw an exception that contains the standard line and column info, plus your custom message.

Thanks for your input. Your example doesn't seem to be significantly simpler, but a good point is that the Mark structures don't need to be passed. I'm considering this style:

---
MyStruct constructMyStructScalar(ref Node node)
{
     auto parts = node.as!string().split(":");

     try
     {
         return MyStruct(to!int(parts[0]), to!int(parts[1]), to!int(parts[2]));
     }
     catch(Exception e)
     {
         throw new SomeExceptionType("message: " ~ e.msg); //wrapped by the caller
     }
}
---

This would avoid passing the Marks and force the user to specify an error message.


As for performance, I wouldn't expect D:YAML to be faster than a well written XML parser. YAML has a rather complicated spec due to its goal of human readability. A faster parser could be written if various less commonly used features were left out, e.g. if only a subset equivalent to JSON were supported.

Still, I'm interested in the results - both time taken and memory usage - maybe something could be found to further improve D:YAML performance.

If you have any more ideas on how to simplify the D:YAML API, I'd like to see them. I'm only planning to freeze the API with a 1.0 release (which must wait for some Phobos changes), and would like to make it as good as possible until then.
November 17, 2011
On 17.11.2011 15:01, Kiith-Sa wrote:
>
> Thanks for your input. Your example doesn't seem to be significantly
> simpler, but a good point is that the Mark structures don't need to
> be passed. I'm considering this style:
>
> ---
> MyStruct constructMyStructScalar(ref Node node)
> {
>       auto parts = node.as!string().split(":");
>
>       try
>       {
>           return MyStruct(to!int(parts[0]), to!int(parts[1]), to!int(parts[2]));
>       }
>       catch(Exception e)
>       {
>           throw new SomeExceptionType("message: " ~ e.msg); //wrapped by the caller
>       }
> }
> ---
>
> This would avoid passing the Marks and force the user to specify an
> error message.

I'm sorry if I wasn't clear.  The main simplification was that I don't have to throw the exception myself.  If all the custom tag handlers are going to throw the same exception, I think you should not have to write that code over and over again.

But I think your new example is a good idea, if you change it to this:

---
MyStruct constructMyStructScalar(ref Node node)
{
      auto parts = node.as!string().split(":");

      return MyStruct(to!int(parts[0]), to!int(parts[1]), to!int(parts[2]));
}
---

The exception thrown by to() could be wrapped in a YAMLException or whatever, that contains the position information.  The toString would then add the ConvException's error message to the standard YAML error. If you want a custom message, you could just throw a plain Exception.

I suppose wrapping every call in a try/catch block would have a negative impact on performance, though.
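
The wrapping described above could be sketched roughly as follows. This assumes simplified stand-ins: Node, Mark, YAMLException and construct are hypothetical names made up for the example and are not the D:YAML implementation. The original exception is kept in the chain via the standard Throwable `next` link so no information is lost.

```d
import std.array : split;
import std.conv : to;
import std.format : format;

// Hypothetical stand-ins for illustration; these are not the D:YAML types.
struct Mark { uint line; uint column; }

struct Node
{
    string value;
    Mark mark;

    // Only the string case is modelled here.
    T as(T)() if (is(T == string)) { return value; }
}

struct MyStruct { int x, y, z; }

class YAMLException : Exception
{
    this(string msg, Throwable next = null,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line, next);
    }
}

// User-written handler: no try/catch, no error handling code at all.
MyStruct constructMyStructScalar(ref Node node)
{
    auto parts = node.as!string().split(":");

    return MyStruct(to!int(parts[0]), to!int(parts[1]), to!int(parts[2]));
}

// Library side (assumed): a single try/catch around every handler call.
MyStruct construct(ref Node node)
{
    try
    {
        return constructMyStructScalar(node);
    }
    catch (Exception e)
    {
        // Prefix the handler's message with the position information
        // and keep the original exception reachable through .next.
        throw new YAMLException(
            format("Error constructing value at line %s, column %s: %s",
                   node.mark.line, node.mark.column, e.msg), e);
    }
}
```

A ConvException from to!int and a plain Exception with a custom message both take the same path here, so the handler author only decides what the message says.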
November 17, 2011
Performance is actually not an issue here: only an insignificant part of total parsing time (about 2%) is spent in Constructor, so any slowdown there should not be noticeable.

The idea you're proposing here would indeed simplify the API, but I'm not sure if the result would always be what the user wants.

Any exception (potentially user-defined) would be handled and its
message added to a YAMLException, as we would need to catch(Exception)
in the calling code. I can't think of an example where this could be
a problem, since the exceptions we don't expect to be handled
are usually derived from Throwable, but what if the user expects an
exception to be thrown at MyStruct construction and not handled by
D:YAML at all?

November 17, 2011
On 17.11.2011 17:21, Kiith-Sa wrote:
> Performance is actually not an issue here, insignificant part of total
> parsing time is spent in Constructor (only about 2%) and any slowdown
> there should not be noticeable.
>
> The idea you're proposing here would indeed simplify the API, but
> I'm not sure if the result would always be what the user wants.
>
> Any exception (potentially user-defined) would be handled and its
> message added to a YAMLException, as we would need to catch(Exception)
> in the calling code. I can't think of an example where this could be
> a problem, since the exceptions we don't expect to be handled
> are usually derived from Throwable, but what if the user expects an
> exception to be thrown at MyStruct construction and not handled by
> D:YAML at all?
>

Do you have an example of what that could be?  OutOfMemoryError and things like that would probably go straight through, since they are Errors and not Exceptions.
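
A small self-contained sketch of that distinction, using only druntime's Exception and Error classes. The handler functions and classify are hypothetical, and catching Error in the outer block is done purely to make the behaviour observable; it is not a recommendation.

```d
struct MyStruct { int x, y, z; }

// Hypothetical handler that fails with an ordinary Exception.
MyStruct failingHandler()
{
    throw new Exception("bad value");
}

// Hypothetical handler that fails with an Error, the branch of the
// Throwable hierarchy used for unrecoverable conditions.
MyStruct brokenHandler()
{
    throw new Error("unrecoverable failure");
}

// The inner try/catch mimics what a library wrapper would do;
// the outer one exists only so the example can report what escaped.
string classify(MyStruct function() handler)
{
    try
    {
        try
        {
            handler();
            return "ok";
        }
        catch (Exception e)
        {
            // Reached for Exception and its subclasses only.
            return "wrapped: " ~ e.msg;
        }
    }
    catch (Error e)
    {
        // An Error is not an Exception, so it skipped the inner catch.
        return "passed through: " ~ e.msg;
    }
}
```

With bounds checking enabled, an out-of-range index such as parts[2] on a too-short array also fails with an Error, so it would likewise not be turned into a YAML error by a catch(Exception) wrapper.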
November 18, 2011
torhu wrote:

> On 17.11.2011 17:21, Kiith-Sa wrote:
>> Performance is actually not an issue here, insignificant part of total parsing time is spent in Constructor (only about 2%) and any slowdown there should not be noticeable.
>>
>> The idea you're proposing here would indeed simplify the API, but I'm not sure if the result would always be what the user wants.
>>
>> Any exception (potentially user-defined) would be handled and its
>> message added to a YAMLException, as we would need to catch(Exception)
>> in the calling code. I can't think of an example where this could be
>> a problem, since the exceptions we don't expect to be handled
>> are usually derived from Throwable, but what if the user expects an
>> exception to be thrown at MyStruct construction and not handled by
>> D:YAML at all?
>>
> 
> Do you have an example of what that could be?  OutOfMemoryError and things like that would probably go straight through, since they are Errors and not Exceptions.

I have changed the Constructor API in this way for now.
I couldn't come up with a good counterexample.
If it turns out to be a mistake, I'll change it back before 1.0.