August 09, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob | On Sun, 08 Aug 2010 23:44:01 +0900, Jacob <doob at me.com> wrote:
> On 8 aug 2010, at 16:16, Lars Tandle Kyllingstad wrote:
>
>> On Sat, 2010-08-07 at 17:19 +0200, Jacob wrote:
>>> Is there any interest in having a serializer in Phobos? I have a serializer compatible with D2 which I licensed under the Boost license. This is the description from the project page:
>>>
>>> Orange is a serialization library for D1 and D2, supporting both Tango and Phobos. It can serialize most of the available types in D, including third party types and can serialize through base class references. It supports fully automatic serialization of all supported types and also supports several ways to customize the serialization. Orange has a separate front end (the serializer) and back end (the archive) making it possible for the user to create new archive types that can be used with the existing serializer.
>>>
>>> It's not very well tested but if there's some interest I'm hoping on getting more people to test the library. The project page is: http://dsource.org/projects/orange/
>>
>> I agree (with everyone else) that Phobos should have a serialization lib. And now it seems we're spoilt for choices -- both Masahiro's MsgPack serializer and Jacob's Orange are more or less complete, working solutions being offered to us.
>
> Can MessagePack serialize an object? I'm looking at the website and can't see that it has direct support for that.

No (but you can use a helper method).

MessagePack is a language-neutral serialization and RPC specification (as are ProtoBuf and Avro).
Its main purposes are fast, compact serialization and communication between many languages.
These libraries don't handle object types because most systems don't serialize and exchange object types in production.

I will add an RPC part to the MessagePack module.
But currently Phobos doesn't have an event module, so I am rewriting std.socket and thinking about the event module API.

>> I have very little experience with using serialization libs, so I have no idea how to determine which one is the better choice. How do we decide which one to use? Perhaps a vote on the NG?
>
> I think one could create a MessagePack archive for my serializer.

I think the Orange front end is overengineered for a MessagePack serializer.
The MessagePack format can't handle an object graph.
But, as long as there is no speed degradation, a limited version could be an alternative to the direct-conversion (de)serializer.

Lastly, I think supporting both modules is better, because the two modules serve different purposes.

Masahiro |
August 08, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | This is also how it's handled in Orange: every serialized value has a key associated with it. But currently it throws when it can't find something in the archive. I'm going to add an option for not throwing.
On 8 aug 2010, at 18:08, Michel Fortin wrote:
> On 2010-08-08 at 11:31, Andrei Alexandrescu wrote:
>
>> Good point. This is not a template-specific problem; if you try to deserialize a derived class and the client doesn't know of that class... well there's not a lot one can do, unless you package the code of the methods with the object.
>>
>> I think it's reasonable to limit (at least for now) things to requiring that the client knows about the exact type serialized. And they need to have a layout-compatible version. Which brings us to a related problem - versioning...
>
> My preferred way to handle versioning is to not have to handle it. I generally use key-value pairs to store the content of aggregates (structs, classes). This means I can grow the number of members over time while keeping backward compatibility. If a member is missing from the serialization I use the default value or the class/struct can handle the case with a more specialized behaviour. I can also continue serializing a no-longer-necessary value to keep the aggregate backward compatible with older code.
>
> This also makes things less fragile in regard to layout changes: you don't need to keep variables in the same order on all platforms, in all versions, etc.
>
> So that's how I've built my serialization module. I probably should stop talking about it and finish it instead... :-)
>
> --
> Michel Fortin
> michel.fortin at michelf.com
> http://michelf.com/
>
>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
|
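The key-value approach to versioning described in this post can be sketched in a few lines of D. This is only an illustration under stated assumptions: the `Archive` alias, the `encode`/`decode` helpers and the `Point` struct are hypothetical names invented here, and they are not the API of Orange or of Michel's module. Each field is stored under its own name, so a member that is missing from an older archive simply keeps its default value.

```d
import std.conv : to;

// Hypothetical archive: a flat map from field name to its textual value.
alias Archive = string[string];

struct Point
{
    int x;
    int y;
    int z = 7; // member added in a later version; 7 is its default
}

// Store every field under its own name, so field order never matters.
Archive encode(T)(T value)
{
    Archive archive;
    foreach (i, field; value.tupleof)
        archive[__traits(identifier, T.tupleof[i])] = field.to!string;
    return archive;
}

// Read each field back by name; a missing key leaves the default in place.
T decode(T)(Archive archive)
{
    T result;
    foreach (i, ref field; result.tupleof)
    {
        enum name = __traits(identifier, T.tupleof[i]);
        if (auto stored = name in archive)
            field = (*stored).to!(typeof(field));
    }
    return result;
}

void main()
{
    // An archive written by an older version that only knew x and y.
    Archive old = ["x": "1", "y": "2"];
    auto p = decode!Point(old);
    assert(p.x == 1 && p.y == 2 && p.z == 7);

    // Round trip with the current layout.
    assert(decode!Point(encode(Point(4, 5, 6))) == Point(4, 5, 6));
}
```

Whether a missing key should be ignored, as here, or should throw, as Orange is said to do at the time of this thread, is a policy choice that a real library would probably expose as an option.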
August 08, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | Yeah, that is really a problem: runtime reflection is currently too limited in D.
On 8 aug 2010, at 17:09, Michel Fortin wrote:
> On 2010-08-08 at 10:37, Jacob wrote:
>
>> Yes, exactly, that is how the library currently works. But I can see how starting out by deserializing with the type Object could work. This is a description of how the serializer "thinks" when it deserializes a value:
>>
>> auto a2 = serializer.deserialize!(Foo)(data);
>>
>> "Ok, I'm deserializing a Foo"
>>
>> 1. Start by creating a new instance of Foo
>> 2. Loop through all the instance variables
>>
>> "Oh, I found a struct of the type Bar"
>>
>> 1. Create a new Bar
>> 2. Loop through all the instance variables
>> 3. Deserialize the values for each variable
>> 4. Set the values for all the instance variables
>>
>> 3. Set the value for the instance variable of type Bar
>>
>> continue deserializing...
>>
>> Using the approach above I have all (or as much as possible) compile time information available, like the types of all the instance variables, the serializer is in control. Using Andrei's approach it seems more like this:
>>
>> "Start looking in the archive after types"
>> "Ok, the archive wants me to deserialize an instance of Foo"
>>
>> 1. Start by creating a new instance of Foo using reflection:
>>
>> Object foo = Object.factory("Foo");
>>
>> 2. In the archive, loop through all the instance variables
>> 3. See if there is a corresponding field in the deserialized object by looping through all the instance variables using foo.getMembers
>> 4. "Ok the archive wants me to deserialize a struct of the type Bar, hm how do I do that? I only have Bar as a string"
>>
>> Using this approach all compile time information is lost, the archive is in control. Probably not the best explanation.
>
> A better way to say it is that the archive tells you which class to instantiate, and this class should tell you (at runtime) how it can deserialize itself.
>
> With my serialization module a class needs to be defined in a special way to be serializable: it needs to implement the encode/decode methods of the KeyArchivable interface (a mixin can implement them for you). And the class needs to have a default constructor. Or you could define an external handler for a certain class and register it prior to unserialization. There's no other way with the current state of runtime reflection in D.
> --
> Michel Fortin
> michel.fortin at michelf.com
> http://michelf.com/
|
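The two deserialization strategies contrasted in this post can be put side by side in a small D sketch. Everything here except `Object.factory` and `typeid` is an assumption made for illustration: `fill`, the `string[string]` archive and the dotted key convention are hypothetical and do not reflect how Orange stores data. The first route keeps the static type, so the template can recurse into the nested struct; the second route only gets an `Object` back and has no field information left.

```d
import std.conv : to;

struct Bar { int value; }

class Foo
{
    int id;
    Bar bar;
}

// Serializer in control: T is known at compile time, so every field type
// (including the nested struct Bar) is visible to the template.
void fill(T)(ref T target, string[string] archive, string prefix = "")
{
    foreach (i, ref field; target.tupleof)
    {
        enum name = __traits(identifier, T.tupleof[i]);
        static if (is(typeof(field) == struct))
            fill(field, archive, prefix ~ name ~ ".");
        else
            field = archive[prefix ~ name].to!(typeof(field));
    }
}

void main()
{
    auto archive = ["id": "3", "bar.value": "9"];

    // Route 1: start from the static type Foo.
    auto foo = new Foo;
    fill(foo, archive);
    assert(foo.id == 3 && foo.bar.value == 9);

    // Route 2: archive in control. Only a class name (a string) is known.
    string storedName = typeid(Foo).name; // what an archive might contain
    Object obj = Object.factory(storedName);
    assert(obj !is null);

    // The static type is Object, so no fields are reachable without a cast
    // or a method that the class itself provides.
    assert(cast(Foo) obj !is null);
}
```

`Object.factory` needs the fully qualified class name and a default constructor, and it cannot create structs, which is the limitation raised in this exchange.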
August 08, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 8 aug 2010, at 17:27, Andrei Alexandrescu wrote:
> On 08/08/2010 05:46 AM, Jacob wrote:
>> On 8 aug 2010, at 07:47, Andrei Alexandrescu wrote:
>>
>>> I think that would be great. Knowing nothing about Orange, I visited the website and read the feature lists and the tutorial (the reference seems to be missing for now). The latter contains:
>>>
>>> auto a2 = serializer.deserialize!(A)(data);
>>>
>>> which seems to require compile-time knowledge of the deserialized type. I'd expect the library to support something like
>>>
>>> Object a2 = serializer.deserialize!Object(data);
>>
>> This is currently not possible in the library.
>
> I see. This is probably the single most important requirement of a serialization library, by a large margin. A classic example of object orientation is the Shape hierarchy and the array of Shape objects that you draw on the screen etc. Where books are usually coy (and where classic object technology took a while to get up to snuff) is the save/restore part, e.g. once you have an array of Shapes, how do you save it to disk and how do you load it back?
>
> Saving is easy because you already know the types of objects involved so you could define a virtual function save() that is customizable per type. Loading is not that easy because you need to bootstrap object types from the input stream - and here's where the factory pattern etc. come into play.
>
> It is absolutely necessary that a serialization library makes scenarios like the above simple and fool-proof.
>
>> I'm not sure if that would be possible, how would you deserialize a struct for example? There is no factory function for structs like there is for classes.
>
> To deserialize a struct, I think it's reasonable to require that the receiver knows the struct statically. In the Thrift protocol things are more lax - you can e.g. write a struct Point containing two ints, and you could deserialize it as two ints. I think that's reasonable. The point is the stream contains primitive type information and class field information about all data trafficked.
>
>> Since all the static types of the objects would be Object how would I set the values when deserializing?
>
> Deserialize into Object and then cast the Object to Shape.

To be able to cast it to a Shape you need to know the type at compile time when you deserialize it. Or you have to register a method that deserializes the object, which is exactly how it works now when you deserialize through a base class reference.

>> Or would Variant be useful here? I have not used Variant.
>
> Probably Variant would play a role when e.g. one wants to deserialize "the next primitive type" without needing to know exactly what type that is (e.g. different integer widths).
>
> Andrei

I'm not sure if we understand each other correctly. If you deserialize into Object you eventually need to cast it to something more useful and then you probably could have deserialized to that type in the first place. The library can deserialize through base class references (by registering a deserialize method) but you would have to start with a static type somewhere, not just Object. Do you have a simple (code) example describing what you want to do? |
August 08, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Masahiro Nakagawa |
On 8 aug 2010, at 18:39, Masahiro Nakagawa wrote:
> On Sun, 08 Aug 2010 23:44:01 +0900, Jacob <doob at me.com> wrote:
>
>>
>> On 8 aug 2010, at 16:16, Lars Tandle Kyllingstad wrote:
>>
>>> On Sat, 2010-08-07 at 17:19 +0200, Jacob wrote:
>>>> Is there any interest in having a serializer in Phobos? I have a serializer compatible with D2 which I licensed under the Boost license. This is the description from the project page:
>>>>
>>>> Orange is a serialization library for D1 and D2, supporting both Tango and Phobos. It can serialize most of the available types in D, including third party types and can serialize through base class references. It supports fully automatic serialization of all supported types and also supports several ways to customize the serialization. Orange has a separate front end (the serializer) and back end (the archive) making it possible for the user to create new archive types that can be used with the existing serializer.
>>>>
>>>> It's not very well tested but if there's some interest I'm hoping on getting more people to test the library. The project page is: http://dsource.org/projects/orange/
>>>
>>> I agree (with everyone else) that Phobos should have a serialization lib. And now it seems we're spoilt for choices -- both Masahiro's MsgPack serializer and Jacob's Orange are more or less complete, working solutions being offered to us.
>>
>> Can MessagePack serialize an object? I'm looking at the website and can't see that it has direct support for that.
>
> No (but you can use a helper method).
>
> MessagePack is a language-neutral serialization and RPC specification (as are ProtoBuf and Avro).
> Its main purposes are fast, compact serialization and communication between many languages.
> These libraries don't handle object types because
> most systems don't serialize and exchange object types in production.
>
> I will add an RPC part to the MessagePack module.
> But currently Phobos doesn't have an event module,
> so I am rewriting std.socket and thinking about the event module API.
>
>>> I have very little experience with using serialization libs, so I have no idea how to determine which one is the better choice. How do we decide which one to use? Perhaps a vote on the NG?
>>
>> I think one could create a MessagePack archive for my serializer.
>>
>
> I think the Orange front end is overengineered for a MessagePack serializer.
> The MessagePack format can't handle an object graph.
> But, as long as there is no speed degradation,
> a limited version could be an alternative to the direct-conversion (de)serializer.
>
> Lastly, I think supporting both modules is better,
> because the two modules serve different purposes.
>
>
> Masahiro
I think I better understand now what MessagePack is and I agree that our libraries have different goals/uses.
|
August 08, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob | On 08/08/2010 01:39 PM, Jacob wrote:
> On 8 aug 2010, at 17:27, Andrei Alexandrescu wrote:
>> Deserialize into Object and then cast the Object to Shape.
>
> To be able to cast it to a Shape you need to know the type at compile time when you deserialize it. Or you have to register a method that deserializes the object, which is exactly how it works now when you deserialize through a base class reference.

You only need to know the _base_ type statically.

>>> Or would Variant be useful here? I have not used Variant.
>>
>> Probably Variant would play a role when e.g. one wants to deserialize "the next primitive type" without needing to know exactly what type that is (e.g. different integer widths).
>>
>> Andrei
>
> I'm not sure if we understand each other correctly.

Most likely - sorry about that. Don't forget that all I'm going by is the tutorial, which is very brief.

> If you deserialize into Object you eventually need to cast it to something more useful and then you probably could have deserialized to that type in the first place. The library can deserialize through base class references (by register a deserialize method) but you would have to start with a static type somewhere, not just Object. Do you have a simple (code) example describing what you want to do?

I think the Shape example is simple enough to serve as a good baseline. Say you have a hierarchy rooted in Shape including e.g. Triangle, Circle, and Rectangle. Now say you have a drawing represented as a Shape[]. What steps do you need to take to save the drawing to disk and restore it later?

Andrei |
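The remark about Variant and "the next primitive type" in this post could look roughly like the sketch below. `nextPrimitive` and its tag values are hypothetical stand-ins for a stream reader and are not part of Phobos or Orange; only `std.variant.Variant`, its `type` property and `coerce` are real library features. The reader picks the integer width at run time, and the caller can still inspect the exact stored type or widen the value without knowing it in advance.

```d
import std.variant : Variant;

// Hypothetical "read the next primitive" helper: the width depends on a
// tag that is only known at run time (1, 2, 4 or 8 bytes).
Variant nextPrimitive(ubyte tag, long raw)
{
    switch (tag)
    {
        case 1:  return Variant(cast(byte) raw);
        case 2:  return Variant(cast(short) raw);
        case 4:  return Variant(cast(int) raw);
        default: return Variant(raw);
    }
}

void main()
{
    auto v = nextPrimitive(2, 300);
    assert(v.type == typeid(short)); // the exact stored type is still known
    assert(v.coerce!long == 300);    // the caller can widen without knowing it
}
```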
August 08, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 8 aug 2010, at 21:35, Andrei Alexandrescu wrote:
> On 08/08/2010 01:39 PM, Jacob wrote:
>> On 8 aug 2010, at 17:27, Andrei Alexandrescu wrote:
>>> Deserialize into Object and then cast the Object to Shape.
>>
>> To be able to cast it to a Shape you need to know the type at compile time when you deserialize it. Or you have to register a method that deserializes the object, which is exactly how it works now when you deserialize through a base class reference.
>
> You only need to know the _base_ type statically.

Yes, but using Object as the static type is not enough, see my example below.

>>>> Or would Variant be useful here? I have not used Variant.
>>>
>>> Probably Variant would play a role when e.g. one wants to deserialize "the next primitive type" without needing to know exactly what type that is (e.g. different integer widths).
>>>
>>> Andrei
>>
>> I'm not sure if we understand each other correctly.
>
> Most likely - sorry about that. Don't forget that all I'm going by is the tutorial, which is very brief.

I know, I will expand the tutorials and examples.

>> If you deserialize into Object you eventually need to cast it to something more useful and then you probably could have deserialized to that type in the first place. The library can deserialize through base class references (by register a deserialize method) but you would have to start with a static type somewhere, not just Object. Do you have a simple (code) example describing what you want to do?
>
> I think the Shape example is simple enough to serve as a good baseline. Say you have a hierarchy rooted in Shape including e.g. Triangle, Circle, and Rectangle. Now say you have a drawing represented as a Shape[]. What steps do you need to take to save the drawing to disk and restore it later?

1. Create a new instance of the serializer
2. Do one of the following (a or b):
   2a. Register a serializer and deserializer method for each runtime type with the serializer
   2b. Implement a toData and fromData method in each class (you don't have to register methods for the static type, Shape in this case)
3. Serialize the array
4. Deserialize the array as the static type Shape[]

Now every object in the array should have been deserialized to its runtime type. I'll add a code example for this on the project page. |
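The steps listed in this post can be sketched with a small registry keyed by the runtime class name. This is a minimal illustration and not Orange's actual API: `register`, `save`, `load`, the `loaders` table and the `"name|payload"` record format are all hypothetical, and only the `toData`/`fromData` naming is borrowed from the post. The array is saved and restored through the static type `Shape[]`, while each element is rebuilt as its runtime type.

```d
import std.algorithm.searching : findSplit;
import std.conv : to;

abstract class Shape
{
    abstract string toData(); // hypothetical per-class save hook
}

class Circle : Shape
{
    double radius;
    this(double radius) { this.radius = radius; }
    override string toData() { return radius.to!string; }
    static Shape fromData(string data) { return new Circle(data.to!double); }
}

class Rectangle : Shape
{
    double width, height;
    this(double width, double height) { this.width = width; this.height = height; }
    override string toData() { return width.to!string ~ "x" ~ height.to!string; }
    static Shape fromData(string data)
    {
        auto parts = data.findSplit("x");
        return new Rectangle(parts[0].to!double, parts[2].to!double);
    }
}

// Registry: runtime class name -> function that rebuilds that class.
Shape function(string)[string] loaders;

void register(T : Shape)(Shape function(string) loader)
{
    loaders[typeid(T).name] = loader;
}

// Each record is "<runtime class name>|<payload>".
string[] save(Shape[] drawing)
{
    string[] records;
    foreach (shape; drawing)
        records ~= typeid(shape).name ~ "|" ~ shape.toData();
    return records;
}

Shape[] load(string[] records)
{
    Shape[] drawing;
    foreach (record; records)
    {
        auto parts = record.findSplit("|");
        auto loader = parts[0] in loaders;
        if (loader is null)
            throw new Exception("no loader registered for " ~ parts[0]);
        drawing ~= (*loader)(parts[2]);
    }
    return drawing;
}

void main()
{
    register!Circle(&Circle.fromData);
    register!Rectangle(&Rectangle.fromData);

    Shape[] drawing;
    drawing ~= new Circle(1.5);
    drawing ~= new Rectangle(2, 3);

    Shape[] restored = load(save(drawing));
    assert((cast(Circle) restored[0]).radius == 1.5);
    assert((cast(Rectangle) restored[1]).height == 3);
}
```

Nothing needs to be registered for `Shape` itself, which matches the note in step 2b: the static type is already known at the call site, and only the runtime types have to be discoverable by name.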
August 08, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 8 aug 2010, at 21:35, Andrei Alexandrescu wrote:
> On 08/08/2010 01:39 PM, Jacob wrote:
>> On 8 aug 2010, at 17:27, Andrei Alexandrescu wrote:
>>> Deserialize into Object and then cast the Object to Shape.
>>
>> To be able to cast it to a Shape you need to know the type at compile time when you deserialize it. Or you have to register a method that deserializes the object, which is exactly how it works now when you deserialize through a base class reference.
>
> You only need to know the _base_ type statically.
>
>>>> Or would Variant be useful here? I have not used Variant.
>>>
>>> Probably Variant would play a role when e.g. one wants to deserialize "the next primitive type" without needing to know exactly what type that is (e.g. different integer widths).
>>>
>>> Andrei
>>
>> I'm not sure if we understand each other correctly.
>
> Most likely - sorry about that. Don't forget that all I'm going by is the tutorial, which is very brief.
>
>> If you deserialize into Object you eventually need to cast it to something more useful and then you probably could have deserialized to that type in the first place. The library can deserialize through base class references (by register a deserialize method) but you would have to start with a static type somewhere, not just Object. Do you have a simple (code) example describing what you want to do?
>
> I think the Shape example is simple enough to serve as a good baseline. Say you have a hierarchy rooted in Shape including e.g. Triangle, Circle, and Rectangle. Now say you have a drawing represented as a Shape[]. What steps do you need to take to save the drawing to disk and restore it later?

The code example is now available at: http://dsource.org/projects/orange/wiki/Tutorials/SerializeBase |
September 17, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | This exchange involves serialization design. It replies to Michel's email on Aug 8.

On 8/8/10 7:26 CDT, Michel Fortin wrote:
> On 2010-08-08 at 1:47, Andrei Alexandrescu wrote:
[snip]
>
> My own unreleased, unfinished and in-need-of-a-refactoring serialization module does that... but unfortunately dynamically recreating the right type cannot be so straightforward in the current state of runtime reflection.
>
> This post turned out longer than I expected, please stay with me.

OK. Apologies for taking so long.

> Runtime reflection currently gives you access *only* to the default constructor, so this is what my module does internally when unserializing a class:
>
> ClassInfo c = findClass(classNameFromSerializationStream);
> Object o = c.create();
> (cast(Unserializable)o).unserialize(serializationStream);

Yes, and that's entirely sensible. You create an empty object using standardized header information from the stream, and then you fill it with type-specific information by continuing down the stream. It's good practice.

> Since we can't access a constructor with a different signature, we can't unserialize directly from the constructor.

I think that would be ill-advised too. It's not the constructor's job to deserialize.

> This is rather a weak point as it forces all objects to have a default constructor.

Only all objects that want to support serialization.

> Another option is for the user to manually register his own constructor with the serialization system prior to unserializing, but that's much less convenient.

Agreed.

> The unserialize member function called above must be explicitly added by the user (either manually or with a mixin) because the fields don't reflect at runtime and the actual class is unknown at compile-time. And the class needs to conform to an interface that contains that unserialize function so we can find it at runtime.

Yah, good point. Probably we'll need to add to the information emitted automatically by the compiler, but for starters, how about this:

class Widget : Serializable!Widget {
    ...
}

Then Serializable would use compile-time introspection to figure out Widget's fields etc.

> So before adding a serialization library, I would suggest we solve the runtime-reflection problem and find a standard way to attach various attributes to types and members.

If we go with the pattern I suggested above, we could even establish a naming convention for fields...

> That could be done as a library, but ideally it'd have some help from the compiler which could put this stuff where it really belongs: ClassInfo. Currently, QtD has its own mixins for that, my D/Objective-C bridge has its own mixins and class registration system, my serialization module has its own, surely Orange has its own, I believe PyD has its own... this is going to be a mess pretty soon if it isn't already.

Not necessarily if it's in the standard library and if it facilitates everybody else's implementation and higher abstractions.

> Once we have a proper standardized runtime-reflection and attribute system, then the serialization module can focus on serialization instead of implementing various hacks to add and get to the information it needs.

I think we need to start with solid compile-time reflection and then look for ways to transport it to runtime, ideally using library facilities.

Andrei |
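The `class Widget : Serializable!Widget` pattern proposed in this post can be sketched as follows. This is one possible reading and not a design from the thread: the `Archivable` interface, the `save` method and the `string[string]` result are assumptions made for the example. Because the base class is instantiated with the derived type, it can enumerate Widget's fields at compile time, while callers only ever see the runtime interface.

```d
import std.conv : to;

// Hypothetical interface that stays visible at run time.
interface Archivable
{
    string[string] save();
}

// The base is parameterized on the derived class T, so T's fields are
// available to compile-time introspection inside save().
class Serializable(T) : Archivable
{
    string[string] save()
    {
        auto self = cast(T) this;
        string[string] fields;
        foreach (i, value; self.tupleof)
            fields[__traits(identifier, T.tupleof[i])] = value.to!string;
        return fields;
    }
}

class Widget : Serializable!Widget
{
    int width = 3;
    string label = "ok";
}

void main()
{
    Archivable item = new Widget; // the static type no longer mentions Widget
    auto fields = item.save();
    assert(fields["width"] == "3");
    assert(fields["label"] == "ok");
}
```

One cost of this design is that it uses up the class's single base-class slot, which may be why the thread also mentions mixins as an alternative.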
September 17, 2010 [phobos] Interest in having a serializer in Phobos? | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 17 sep 2010, at 08:02, Andrei Alexandrescu wrote:
> This exchange involves serialization design. It replies to Michel's email on Aug 8.
>
> On 8/8/10 7:26 CDT, Michel Fortin wrote:
>> On 2010-08-08 at 1:47, Andrei Alexandrescu wrote:
> [snip]
>>
>> My own unreleased, unfinished and in-need-of-a-refactoring serialization module does that... but unfortunately dynamically recreating the right type cannot be so straightforward in the current state of runtime reflection.
>>
>> This post turned out longer than I expected, please stay with me.
>
> OK. Apologies for taking so long.
>
>> Runtime reflection currently gives you access *only* to the default constructor, so this is what my module does internally when unserializing a class:
>>
>> ClassInfo c = findClass(classNameFromSerializationStream);
>> Object o = c.create();
>> (cast(Unserializable)o).unserialize(serializationStream);
>
> Yes, and that's entirely sensible. You create an empty object using standardized header information from the stream, and then you fill it with type-specific information by continuing down the stream. It's good practice.
>
>> Since we can't access a constructor with a different signature, we can't unserialize directly from the constructor.
>
> I think that would be ill-advised too. It's not the constructor's job to deserialize.
>
>> This is rather a weak point as it forces all objects to have a default constructor.
>
> Only all objects that want to support serialization.

We can just skip calling the constructor and let the user register a function/method that does any custom deserialization if needed. Then all objects can be deserialized. This is only needed if the user of the serializer wants to perform any custom deserialization (run any custom code) or if deserializing through a base class reference.

>> Another option is for the user to manually register his own constructor with the serialization system prior to unserializing, but that's much less convenient.
>
> Agreed.
>
>> The unserialize member function called above must be explicitly added by the user (either manually or with a mixin) because the fields don't reflect at runtime and the actual class is unknown at compile-time. And the class needs to conform to an interface that contains that unserialize function so we can find it at runtime.
>
> Yah, good point. Probably we'll need to add to the information emitted automatically by the compiler, but for starters, how about this:
>
> class Widget : Serializable!Widget {
>     ...
> }
>
> Then Serializable would use compile-time introspection to figure out Widget's fields etc.
>
>> So before adding a serialization library, I would suggest we solve the runtime-reflection problem and find a standard way to attach various attributes to types and members.
>
> If we go with the pattern I suggested above, we could even establish a naming convention for fields...
>
>> That could be done as a library, but ideally it'd have some help from the compiler which could put this stuff where it really belongs: ClassInfo. Currently, QtD has its own mixins for that, my D/Objective-C bridge has its own mixins and class registration system, my serialization module has its own, surely Orange has its own, I believe PyD has its own... this is going to be a mess pretty soon if it isn't already.
>
> Not necessarily if it's in the standard library and if it facilitates everybody else's implementation and higher abstractions.
>
>> Once we have a proper standardized runtime-reflection and attribute system, then the serialization module can focus on serialization instead of implementing various hacks to add and get to the information it needs.
>
> I think we need to start with solid compile-time reflection and then look for ways to transport it to runtime, ideally using library facilities.

Don't we already have quite solid compile-time reflection with __traits using hasMember and getMember? Now I don't understand how these could be transported to runtime. On the other hand we already have half of this implemented in the runtime in the form of ClassInfo.getMembers (though it currently doesn't work). We only need methods to set and get the value of a member.

> Andrei

--
/Jacob Carlborg |
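One way that compile-time reflection might be "transported to runtime" with library code alone is to generate, per class, a table of setter functions that is looked up by field name at run time. The sketch below is an assumption about what such a facility could look like, not something proposed in the thread: `Setter`, `setField` and `settersFor` are hypothetical names, while `__traits(hasMember, ...)`, `__traits(getMember, ...)` and `std.traits.FieldNameTuple` are existing language and library features.

```d
import std.conv : to;
import std.traits : FieldNameTuple;

// A setter callable at run time: any object plus the new value as text.
alias Setter = void function(Object, string);

// One instance of this template is generated per (class, field) pair.
void setField(T, string name)(Object target, string text)
{
    static assert(__traits(hasMember, T, name));
    auto obj = cast(T) target;
    alias FieldType = typeof(__traits(getMember, obj, name));
    __traits(getMember, obj, name) = text.to!FieldType;
}

// Built from compile-time information, indexed by field name at run time.
Setter[string] settersFor(T)()
{
    Setter[string] table;
    foreach (name; FieldNameTuple!T)
        table[name] = &setField!(T, name);
    return table;
}

class Foo
{
    int id;
    string label;
}

void main()
{
    auto setters = settersFor!Foo();

    Object obj = new Foo; // the static type is just Object
    setters["id"](obj, "42");
    setters["label"](obj, "hello");

    auto foo = cast(Foo) obj;
    assert(foo.id == 42 && foo.label == "hello");
}
```

A matching table of getters would cover the other half of "set and get the value of a member". Storing such tables in a registry keyed by class name would combine this with the `Object.factory` step discussed earlier in the thread.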
Copyright © 1999-2021 by the D Language Foundation