February 13, 2014 std.serialization | ||||
---|---|---|---|---|
| ||||
Well, I wrote the code for this a while back, and although it was originally intended as a replacement for just std.json (thus the repo name), it does have the framework in place to be a generalized serialization framework, and there is the start of XML and BSON implementations, so I'm releasing it as std.serialization. The JSON implementation is the only one I'd consider ready for production use, however.

The (de)serialization framework takes a step back and asks, "Why do we need pull parsers?", the answer to which is that allocations are slow, so don't allocate. And that's exactly what I do. The serializer does absolutely *no* allocations of its own (except for float->string conversion, where I don't understand the algorithms well enough to implement them myself), even going so far as to create an output-range-based version of to!string(int/uint/long/ulong/etc.). The benefits of doing it this way are very clearly reflected in the pure speed of the serializer. On my 2 GHz i5 MacBook Air, it takes 50ms to serialize 100k objects with roughly 600k integers contained in them when compiled with DMD, which is roughly half the time it takes to generate the data to serialize. Compile it with GDC or LDC and that time is cut in half. I have done the exact same thing with deserialization as well: the only allocations done are for the output objects, because there is no intermediate representation.

So how do I use this greatness? Simple! import std.serialization, apply the @serializable UDA to the class/struct you want to serialize, then call toJSON(yourObject) and fromJSON!YourType(yourString) to your heart's content!

Now, there are other serialization libraries out there, such as orange, that take the compile-time reflection approach, but the amount of code required to implement a single format is just massive: 2100 lines for the XMLArchive. The entire JSON (de)serialization, which *includes* both the lexer and parser, is only 900 lines.

Wow, that went a bit more towards a salesman-like description than I was aiming for, so I'll just end this here and give you the link, before this ends up looking like a massive, badly written sales pitch :D

https://github.com/Orvid/JSONSerialization |
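Editor's sketch: the post mentions an output-range-based replacement for to!string on integers. The code below is not taken from the linked repository; it is a minimal illustration of the general technique, assuming a hypothetical `FixedSink` buffer type and a hypothetical `writeInt` helper. The digits are built in a small stack buffer and handed to the sink one character at a time, so the conversion itself never touches the GC heap.

```d
import std.range : isOutputRange, put;
import std.traits : isIntegral, isSigned;

/// Hypothetical fixed-capacity sink: an output range of char backed by
/// an in-place buffer, so writing to it does not allocate.
struct FixedSink
{
    char[64] data;
    size_t length;

    void put(char c)
    {
        assert(length < data.length, "FixedSink overflow");
        data[length++] = c;
    }

    /// The characters written so far.
    const(char)[] text() const { return data[0 .. length]; }
}

/// Hypothetical helper: writes the decimal form of `value` to `sink`
/// without allocating.
void writeInt(R, T)(ref R sink, T value)
    if (isOutputRange!(R, char) && isIntegral!T)
{
    char[20] buf = void; // 20 chars fit both ulong.max and long.min
    size_t pos = buf.length;

    static if (isSigned!T)
        immutable bool negative = value < 0;
    else
        enum bool negative = false;

    // Work on the magnitude as ulong so that T.min does not overflow.
    ulong magnitude = negative ? 0UL - cast(ulong) value : cast(ulong) value;

    // Emit digits right to left into the scratch buffer.
    do
    {
        buf[--pos] = cast(char) ('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);

    if (negative)
        buf[--pos] = '-';

    // Hand the finished digits to the sink one char at a time.
    foreach (char c; buf[pos .. $])
        put(sink, c);
}
```

Calling `writeInt(sink, -42)` on a fresh `FixedSink` leaves `sink.text` equal to "-42". Any other output range of `char` (a file writer, a socket buffer) could be passed in place of `FixedSink`.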
February 14, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Orvid King
| Nice, hope the code is prettier than your speech. :D
On Fri, Feb 14, 2014 at 12:56 AM, Orvid King <blah38621@gmail.com> wrote:
> Well, I wrote the code for this a while back, and although it was
> originally intended as a replacement for just std.json (thus the repo
> name), it does have the framework in place to be a generalized
> serialization framework, and there is the start of xml, and bson
> implementations, so I'm releasing it as std.serialization. The JSON
> implementation is the only one I'd consider ready for production use
> however. The (de)serialization framework takes a step back and asks, "Why
> do we need pull parsers?", the answer to which is that allocations are
> slow, so don't allocate. And that's exactly what I do. The serializer does
> absolutely *no* allocations of its own (except for float->string
> conversion, which I don't understand the algorithms enough to implement
> myself) even going so far as to create an output range based version of
> to!string(int/uint/long/ulong/etc.). And the benefits of doing it this
> way are very clearly reflected in the pure speed of the serializer. On my
> 2ghz i5 Macbook Air, it takes 50ms to serialize 100k objects with roughly
> 600k integers contained in them when compiled with DMD, which is roughly half
> the time it takes to generate the data to serialize. Compile it with GDC or
> LDC and that time is cut in half. I have done the exact same thing with
> deserialization as well, the only allocations done are for the output
> objects, because there is no intermediate representation.
>
> So how do I use this greatness? Simple! import std.serialization, and apply the @serializable UDA to the class/struct you want to serialize, then call toJSON(yourObject) and fromJSON!YourType(yourString) to your heart's content!
>
> Now, there are other serialization libraries out there, such as orange, that take the compile-time reflection approach, but the amount of code required to implement a single format is just massive 2100 lines for the XMLArchive. The entire JSON (de)serialization, which *includes* both the lexer and parser is only 900 lines.
>
>
>
>
> Wow, that went a bit more towards a salesman-like description than I was aiming for, so I'll just end this here and give you the link, before this ends up looking like a massive, badly written, sales pitch :D
>
> https://github.com/Orvid/JSONSerialization
>
|
February 14, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Orvid King | On Thursday, 13 February 2014 at 22:56:38 UTC, Orvid King wrote:
> so I'm releasing it as std.serialization.
What does that even mean? I'm pretty sure you should NEVER call a library "std.something" if it hasn't been approved for inclusion into standard library.
Other than that, nice work.
|
February 14, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Orvid King | "Orvid King" wrote in message news:ntpjdeutsxqicjywtoxc@forum.dlang.org...
> (except for float->string conversion, which I don't understand the algorithms enough to implement myself) even going so far as to create an output range based version of to!string(int/uint/long/ulong/etc.).
std.format.formatValue / std.format.formattedWrite ?
|
February 14, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Orvid King | On Thursday, 13 February 2014 at 22:56:38 UTC, Orvid King wrote:
> Wow, that went a bit more towards a salesman-like description than I was aiming for, so I'll just end this here and give you the link, before this ends up looking like a massive, badly written, sales pitch :D
>
> https://github.com/Orvid/JSONSerialization
It's much easier for people to get the gist of it if you generate documentation. It's kind of off-putting to have to rummage through source files with no direction.
In the vein of shameless self-promotion, I recommend using bootDoc[1]. :)
[1] https://github.com/JakobOvrum/bootDoc
|
February 14, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Daniel Murphy | On Friday, 14 February 2014 at 11:22:22 UTC, Daniel Murphy wrote:
> "Orvid King" wrote in message news:ntpjdeutsxqicjywtoxc@forum.dlang.org...
>
>> (except for float->string conversion, which I don't understand the algorithms enough to implement myself) even going so far as to create an output range based version of to!string(int/uint/long/ulong/etc.).
>
> std.format.formatValue / std.format.formattedWrite ?
Both of them fall back on a form of printf internally.
|
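Editor's sketch: for readers unfamiliar with the functions Daniel Murphy names, `std.format.formattedWrite` takes an output range as its first argument and writes the formatted text into it instead of returning a new string. The helper name `formatPair` below is hypothetical, and an `appender` is used as the sink only for brevity (it allocates), so this shows the API shape and says nothing about how Phobos formats values internally.

```d
import std.array : appender;
import std.format : formattedWrite;

/// Hypothetical helper: formats two integers straight into an output range.
string formatPair(int a, int b)
{
    auto sink = appender!string();          // any output range of char would do
    formattedWrite(sink, "[%d,%d]", a, b);  // writes into sink, returns no string
    return sink.data;
}
```

For example, `formatPair(1, -2)` yields "[1,-2]".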
February 14, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Francesco Cattoglio | On Friday, 14 February 2014 at 10:41:54 UTC, Francesco Cattoglio wrote:
> On Thursday, 13 February 2014 at 22:56:38 UTC, Orvid King wrote:
>> so I'm releasing it as std.serialization.
> What does that even mean? I'm pretty sure you should NEVER call a library "std.something" if it hasn't been approved for inclusion into standard library.
>
> Other than that, nice work.
Yes, well, I'm bad at coming up with creative names, and, once I get around to writing full documentation for it, as well as do a bit of other cleanup, I'll submit it for inclusion in Phobos.
|
February 14, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Orvid King | On 2014-02-13 23:56, Orvid King wrote:
> Well, I wrote the code for this a while back, and although it was originally intended as a replacement for just std.json (thus the repo name), it does have the framework in place to be a generalized serialization framework, and there is the start of xml, and bson implementations, so I'm releasing it as std.serialization. The JSON implementation is the only one I'd consider ready for production use however. The (de)serialization framework takes a step back and asks, "Why do we need pull parsers?", the answer to which is that allocations are slow, so don't allocate. And that's exactly what I do. The serializer does absolutely *no* allocations of its own (except for float->string conversion, which I don't understand the algorithms enough to implement myself) even going so far as to create an output range based version of to!string(int/uint/long/ulong/etc.). And the benefits of doing it this way are very clearly reflected in the pure speed of the serializer. On my 2ghz i5 Macbook Air, it takes 50ms to serialize 100k objects with roughly 600k integers contained in them when compiled with DMD, which is roughly half the time it takes to generate the data to serialize. Compile it with GDC or LDC and that time is cut in half. I have done the exact same thing with deserialization as well, the only allocations done are for the output objects, because there is no intermediate representation.
What features does it support? How does it handle:
* Arrays
* Slices
* Pointers
* Reference types
* Support for events
* Custom serialization
* Serialization of third party types
> So how do I use this greatness? Simple! import std.serialization, and apply the @serializable UDA to the class/struct you want to serialize, then call toJSON(yourObject) and fromJSON!YourType(yourString) to your heart's content!
Why require a UDA?
> Now, there are other serialization libraries out there, such as orange, that take the compile-time reflection approach, but the amount of code required to implement a single format is just massive 2100 lines for the XMLArchive. The entire JSON (de)serialization, which *includes* both the lexer and parser is only 900 lines.
The reason for that might be:
1. XML is untyped unlike JSON
2. It supports quite a lot of features that most other serialization libraries don't support
--
/Jacob Carlborg
|
February 16, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob Carlborg | > What features does it support? How does it handle:
>
> * Arrays
> * Slices
> * Pointers
> * Reference types
> * Support for events
> * Custom serialization
> * Serialization of third party types
Slices are handled as arrays, because they need to be handled in such a way that many different serialization formats can support them, and so that they are interoperable with implementations in languages other than D.
Pointers are not supported, because in my opinion, they should _NEVER_ be serialized.
Reference types are serialized as they are encountered; I haven't handled the circular reference case yet.
Events are not supported, because that would require complete knowledge of the source and target environments of the serialization.
Custom serialization is supported for types that provide either to!YourType(string) or YourType.parse(string) for reading, and either to!string(valueOfYourType) or valueOfYourType.toString() for writing. These are handled transparently by the base serialization handler; the actual serialization format sees them simply as strings. Each serialization format does, however, have the ability to select any type it wants to support being serialized.
And third party types are only supported if they have the requisite UDA, or support custom serialization.
> Why require a UDA?
The UDA is required for the exact same reason it's required in the .NET Framework: it makes sure that the type you are trying to serialize is serialization-aware, meaning that it's not serializing cache fields, and that it actually makes sense to serialize the type.
|
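Editor's sketch: the opt-in rule described above can be enforced at compile time with `std.traits.hasUDA`. Everything below is hypothetical and is not the library's code: the `serializable` marker is redefined locally, `toyToJSON` is a made-up name, and it only handles structs whose fields are integers. Unlike the serializer described in the thread, it allocates freely, since the point here is only the UDA check.

```d
import std.array : appender;
import std.conv : to;
import std.traits : FieldNameTuple, hasUDA;

/// Hypothetical stand-in for the library's marker attribute.
struct serializable {}

@serializable struct Point { int x; int y; }

/// Not marked, so the toy serializer refuses it at compile time.
struct Cache { int hits; }

/// Hypothetical toy serializer for structs with integer fields only.
string toyToJSON(T)(auto ref const T value)
{
    static assert(hasUDA!(T, serializable),
        T.stringof ~ " is not marked @serializable");

    auto sink = appender!string();
    sink.put('{');
    // Compile-time loop over the field names of T.
    foreach (i, name; FieldNameTuple!T)
    {
        static if (i > 0)
            sink.put(',');
        sink.put('"');
        sink.put(name);
        sink.put("\":");
        sink.put(to!string(__traits(getMember, value, name)));
    }
    sink.put('}');
    return sink.data;
}
```

With these definitions, `toyToJSON(Point(1, 2))` produces `{"x":1,"y":2}`, while `toyToJSON(Cache(3))` is rejected by the compiler. An opt-out design would drop the `static assert` and instead skip fields carrying some exclusion attribute.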
February 16, 2014 Re: std.serialization | ||||
---|---|---|---|---|
| ||||
Posted in reply to Orvid King | On 2014-02-16 18:52, Orvid King wrote:
> Slices are handled as arrays, because of the fact that they need to be handled in such a way that many different types of serialization formats can support them, and be inter-operable with implementations in languages other than D.
>
> Pointers are not supported, because in my opinion, they should _NEVER_ be serialized.
Why not? Think of languages like C and C++, they only support pointers. Pointers to basic types are not so interesting but pointers to structs are.
> Reference types are serialized as they are encountered, I haven't handled the circular reference case yet.
If the same reference value is encountered multiple times, is it serialized once or multiple times?
> Events are not supported due to the fact it would require complete knowledge of the source and target environments of the serialization.
What? I'm referring to methods being called before and after serialization of a given value.
> Custom serialization is supported by either supporting to!YourType(string) or YourType.parse(string) / to!string(valueOfYourType) or valueOfYourType.toString(), and are handled transparently by the base serialization handler, the actual serialization format sees them simply as strings. Each serialization format however does have the ability to select any type it wants to support being serialized.
>
> And third party types are only supported if they have the requisite UDA, or support custom serialization.
>
>> Why require a UDA?
> The UDA is required for the exact same reason it's required in the .net framework, because it makes sure that the type you are trying to serialize is serialization aware, meaning that it's not serializing cache fields, and also makes sense to actually be serializing the type.
I prefer opt-out rather than opt-in. Can it serialize through base class references?
--
/Jacob Carlborg
|