Request for review - std.serialization (orange)
March 24, 2013
std.serialization (orange) is now ready to be reviewed.

A couple of notes for the review:

* The most important packages are: orange.serialization and orange.serialization.archives

* The unit tests are located in their own package. I'm not very happy about putting the unit tests in the same module as the rest of the code, i.e. the serialization module. What are the options? These tests are quite high level: they test the whole Serializer class and not individual functions.

* I'm using some utility functions located in the "util" and "core" packages, what should we do about those, where to put them?

* Trailing whitespace and tabs will be fixed when/if the package gets accepted

* If this gets accepted, should I do a subtree merge (or whatever it's called) to keep the history intact?

Changes since last time:

* I've removed any Tango and D1 related code
* I've removed all unused functions (hopefully)

For usage examples, see the github wiki pages: https://github.com/jacob-carlborg/orange/wiki/_pages

For more extended usage examples, see the unit tests: https://github.com/jacob-carlborg/orange/tree/master/tests

Sources: https://github.com/jacob-carlborg/orange
Documentation: https://dl.dropbox.com/u/18386187/orange_docs/Serializer.html
Run unit tests: execute the unittest.sh shell script

(Don't forget to click the "Package" tab in the top corner to see the documentation for the rest of the modules)

-- 
/Jacob Carlborg
March 24, 2013
On 3/24/13, Jacob Carlborg <doob@me.com> wrote:
> For usage examples, see the github wiki pages: https://github.com/jacob-carlborg/orange/wiki/_pages

A small example that actually writes the XML file to disk and then reads it back would be beneficial.

Btw the library doesn't build with the -w switch:

orange\xml\PhobosXml.d(2536): Error: switch case fallthrough - use
'goto case;' if intended
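
For readers unfamiliar with this error: with -w, the compiler rejects a non-empty case that runs into the next one without an explicit jump. Below is a minimal sketch of the fix the message suggests. The function `score` and its values are made up for illustration and are not the code from PhobosXml.d.

```d
// Hypothetical example; not taken from PhobosXml.d.
int score(int kind)
{
    int result = 0;
    switch (kind)
    {
        case 1:
            result += 10;
            goto case; // explicit fallthrough into the next case (case 2)
        case 2:
            result += 1;
            break;
        default:
            break;
    }
    return result;
}

void main()
{
    assert(score(1) == 11); // case 1 runs, then continues into case 2
    assert(score(2) == 1);
    assert(score(3) == 0);
}
```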
March 25, 2013
Just at a glance, a few things strike me...

Phobos doesn't typically use classes, seems to prefer flat functions. Are
we happy with classes in this instance?
Use of caps in the filenames/functions is not very phobos like.

Can I have a post-de-serialise callback to recalculate transient data?

Why register serialisers, and structures that can be operated on? (I'm not a big fan of registrations of this sort personally, if they can be avoided)

Is there a mechanism to deal with pointers, or do you just serialise through the pointer? Some sort of reference system so objects pointing at the same object instance will deserialise pointing at the same object instance (or a new copy thereof)?

Is it fast? I see in your custom deserialise example, you deserialise
members by string name... does it need to FIND those in the stream by name,
or does it just use that to validate the sequence?
I have a serialiser that serialises in realtime (60fps), a good fair few
megabytes of data per frame... will orange handle this?

Documentation, what attributes are available? How to use them?

You only seem to provide an XML backend. What about JSON? Binary (with
endian awareness)?

Writing an Archiver looks a lot more involved than I would have imagined.
XmlArchive.d is huge, mostly just 'ditto'.
Should unarchiveXXX() not rather be unarchive!(XXX)(), allowing to minimise
most of those function definitions?


On 25 March 2013 07:41, Andrej Mitrovic <andrej.mitrovich@gmail.com> wrote:

> On 3/24/13, Jacob Carlborg <doob@me.com> wrote:
> > For usage examples, see the github wiki pages: https://github.com/jacob-carlborg/orange/wiki/_pages
>
> A small example actually writing the xml file to disk and the reading back from it would be beneficial.
>
> Btw the library doesn't build with the -w switch:
>
> orange\xml\PhobosXml.d(2536): Error: switch case fallthrough - use
> 'goto case;' if intended
>


March 25, 2013
On 2013-03-24 22:41, Andrej Mitrovic wrote:

> A small example actually writing the xml file to disk and the reading
> back from it would be beneficial.

Ok, so just adding write and read to disk to the usage example on the github page?

> Btw the library doesn't build with the -w switch:
>
> orange\xml\PhobosXml.d(2536): Error: switch case fallthrough - use
> 'goto case;' if intended

Good catch.

-- 
/Jacob Carlborg
March 25, 2013
On 2013-03-25 02:16, Manu wrote:
> Just at a glance, a few things strike me...
>
> Phobos doesn't typically use classes, seems to prefer flat functions.

It's necessary to have a class or struct to pass around. The serializer is passed to method/functions doing custom serialization. I could create a free function that encapsulates the classes for the common use cases.

> Are we happy with classes in this instance?
> Use of caps in the filenames/functions is not very phobos like.

Yeah, that will be fixed if accepted. As you see, it's still a separate library and not included into Phobos.

> Can I have a post-de-serialise callback to recalculate transient data?

Yes. There are three ways to customize the serialization process.

1. Take complete control of the process (for the type) by adding toData/fromData to your types

https://github.com/jacob-carlborg/orange/wiki/Custom-Serialization

2. Take complete control of the process (for the type) by registering a function pointer/delegate as a serializer for a given type. Useful for serializing third party types

https://github.com/jacob-carlborg/orange/wiki/Non-Intrusive-Serialization

3. Add the onDeserialized attribute to a method in the type being serialized

https://github.com/jacob-carlborg/orange/blob/master/tests/Events.d#L75
https://dl.dropbox.com/u/18386187/orange_docs/Events.html
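
To illustrate the idea behind option 3, here is a rough sketch of how an attribute-driven post-deserialization callback can work in D in general. It does not use Orange's API: the marker `AfterLoad`, the helper `runAfterLoadHooks` and the type `Rect` are hypothetical names invented for this example, and the way Orange itself finds and calls such methods may well differ.

```d
import std.traits : hasUDA;

// Hypothetical marker attribute, standing in for something like onDeserialized.
struct AfterLoad {}

struct Rect
{
    int width;
    int height;
    int area; // transient: derived from width and height

    @AfterLoad void recalculate()
    {
        area = width * height;
    }
}

// Calls every method of T that carries the AfterLoad marker.
void runAfterLoadHooks(T)(ref T value)
{
    foreach (name; __traits(allMembers, T))
    {
        static if (is(typeof(__traits(getMember, T, name)) == function))
        {
            static if (hasUDA!(__traits(getMember, T, name), AfterLoad))
                __traits(getMember, value, name)();
        }
    }
}

void main()
{
    // Pretend width and height were just read back from an archive.
    auto r = Rect(3, 4);
    assert(r.area == 0);
    runAfterLoadHooks(r);
    assert(r.area == 12);
}
```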

I noticed that the documentation for the attributes doesn't look so good.

> Why register serialiser's, and structures that can be operated on? (I'm
> not a big fan of registrations of this sort personally, if they can be
> avoided)

The only time when registering a serializer is really necessary is when serializing through a base class reference. Otherwise the use cases are when customizing the serialization process.
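
A sketch of why registration helps with base class references: code that works from the static type only sees the base class, so the derived type's handler has to be looked up at run time. The `Registry` type and the `Animal`/`Dog` classes below are hypothetical and only illustrate the general technique, not how Orange implements it.

```d
import std.conv : to;

class Animal { string name; }
class Dog : Animal { int tricks; }

alias Writer = string delegate(Object);

// Hypothetical registry mapping a runtime class name to a writer.
struct Registry
{
    private Writer[string] writers;

    void register(T : Object)(string delegate(T) writer)
    {
        // Store a type-erased wrapper under the class name of T.
        writers[typeid(T).name] = (Object o) => writer(cast(T) o);
    }

    string write(Object o)
    {
        // typeid(o) yields the dynamic type, even through a base reference.
        auto writer = typeid(o).name in writers;
        if (writer is null)
            throw new Exception("unregistered type: " ~ typeid(o).name);
        return (*writer)(o);
    }
}

void main()
{
    Registry registry;
    registry.register!Dog((Dog d) => d.name ~ ":" ~ d.tricks.to!string);

    auto dog = new Dog;
    dog.name = "Rex";
    dog.tricks = 3;

    Animal animal = dog; // static type is Animal, dynamic type is Dog
    assert(registry.write(animal) == "Rex:3");
}
```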

> Is there a mechanism to deal with pointers, or do you just serialise
> through the pointer? Some sort of reference system so objects pointing
> at the same object instance will deserialise pointing at the same object
> instance (or a new copy thereof)?

Yes. All reference types (including pointers) are only serialized once. If a serialized pointer points to data that is not otherwise being serialized, the data it points to is serialized as well.

If you're curious about the internals I suggest you serialize some class/struct hierarchy and look at the XML data. It should be readable.
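
The "serialized only once" behaviour is commonly built on an identity map. The following is a minimal sketch of that technique under my own assumptions; `IdentityMap`, `idFor` and `Node` are hypothetical names, not Orange's internals.

```d
class Node
{
    int value;
    Node next;
}

// Hypothetical helper: assigns one id per distinct object instance.
struct IdentityMap
{
    private size_t[void*] ids;

    // Returns the id for obj; firstVisit reports whether obj was new.
    size_t idFor(Object obj, out bool firstVisit)
    {
        auto key = cast(void*) obj;
        if (auto known = key in ids)
        {
            firstVisit = false; // already seen: emit a reference only
            return *known;
        }
        firstVisit = true; // first time: emit the full object
        immutable id = ids.length;
        ids[key] = id;
        return id;
    }
}

void main()
{
    auto tail = new Node;
    auto a = new Node;
    auto b = new Node;
    a.next = tail;
    b.next = tail; // a and b share the same tail instance

    IdentityMap map;
    bool first;

    immutable idViaA = map.idFor(a.next, first);
    assert(first);
    immutable idViaB = map.idFor(b.next, first);
    assert(!first);
    assert(idViaA == idViaB);
}
```

A serializer built this way would write the full object on the first visit and only the id afterwards, which is what lets the deserializer restore shared references.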

> Is it fast? I see in your custom deserialise example, you deserialise
> members by string name... does it need to FIND those in the stream by
> name, or does it just use that to validate the sequence?

How that is implemented is up to the archive. But the idea is that it should be able to find a value by name in the serialized data. That is kind of an implicit contract between the archive and the serializer.

> I have a serialiser that serialises in realtime (60fps), a good fair few
> megabytes of data per frame... will orange handle this?

Probably not. I think it mostly depends on the archive used. The XML module in Phobos is really, REALLY slow. Serializing the same data with Tango (D1) is at least twice as fast. I have started to work on an archive type that just tries to be as fast as possible. It:

* Breaks the implicit contract with the serializer
* Doesn't care about endianness
* Doesn't care if the fields have changed
* May not handle slices correctly
* And some other things
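
For contrast with the "doesn't care about endianness" point, here is a small sketch of what endian awareness could look like in a binary archive, using `std.bitmanip` from Phobos. The helpers `putUint` and `takeUint` are hypothetical and are not part of Orange.

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;

// Appends value to buffer in little-endian order, whatever the host order is.
void putUint(ref ubyte[] buffer, uint value)
{
    ubyte[4] bytes = nativeToLittleEndian(value);
    buffer ~= bytes[];
}

// Reads a little-endian uint from the front of buffer and consumes it.
uint takeUint(ref ubyte[] buffer)
{
    ubyte[4] bytes = buffer[0 .. 4];
    buffer = buffer[4 .. $];
    return littleEndianToNative!uint(bytes);
}

void main()
{
    ubyte[] buffer;
    putUint(buffer, 0x11223344);

    // Byte order on the wire is fixed, independent of the machine.
    assert(buffer == [0x44, 0x33, 0x22, 0x11]);

    assert(takeUint(buffer) == 0x11223344);
    assert(buffer.length == 0);
}
```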

> Documentation, what attributes are available? How to use them?

https://dl.dropbox.com/u/18386187/orange_docs/Events.html
https://dl.dropbox.com/u/18386187/orange_docs/Serializable.html

Is this clear enough?

> You only seem to provide an XML backend. What about JSON? Binary (with
> endian awareness)?

Yeah, that is not implemented yet. Is it necessary before adding it to Phobos?

> Writing an Archiver looks a lot more involved than I would have
> imagined. XmlArchive.d is huge, mostly just 'ditto'.
> Should unarchiveXXX() not rather be unarchive!(XXX)(), allowing to
> minimise most of those function definitions?

Yeah, it has kind of a big API. The reason is to be able to use interfaces. Serializer contains a reference to an archive, typed as the interface Archive. If you're using custom serialization I don't think it would be good to lock yourself to a specific archive type.

BTW, unarchiveXXX is forwarded to a private unarchive!(XXX)() in XmlArchive.
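
A cut-down sketch of that forwarding pattern: the interface exposes one non-template method per type so it can be called virtually, while the implementation funnels them into a private template. The `Archive` interface here is a hypothetical four-method stand-in for the real one, and `TextArchive` and `Point` are invented for the example.

```d
import std.conv : to;

// Hypothetical, cut-down interface; the real Archive API is much larger.
interface Archive
{
    void archiveInt(int value, string key);
    void archiveString(string value, string key);
    int unarchiveInt(string key);
    string unarchiveString(string key);
}

// Toy archive that stores every field as text, looked up by key.
class TextArchive : Archive
{
    private string[string] fields;

    // Virtual entry points, each forwarding to a private template.
    void archiveInt(int value, string key) { archive(value, key); }
    void archiveString(string value, string key) { archive(value, key); }
    int unarchiveInt(string key) { return unarchive!int(key); }
    string unarchiveString(string key) { return unarchive!string(key); }

    private void archive(T)(T value, string key)
    {
        fields[key] = value.to!string;
    }

    private T unarchive(T)(string key)
    {
        return fields[key].to!T;
    }
}

struct Point
{
    int x;
    string label;

    // Depends only on the interface, not on a concrete archive type.
    void toData(Archive archive)
    {
        archive.archiveInt(x, "x");
        archive.archiveString(label, "label");
    }
}

void main()
{
    Archive archive = new TextArchive;
    Point(7, "origin").toData(archive);

    // Values are found by name, so the read order does not matter.
    assert(archive.unarchiveString("label") == "origin");
    assert(archive.unarchiveInt("x") == 7);
}
```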

With classes and interfaces:

class Serializer
interface Archive
class XmlArchive : Archive

Archive archive = new XmlArchive;
auto serializer = new Serializer(archive);

struct Foo
{
    void toData (Serializer serializer, Serializer.Data key);
}

With templates:

class Serializer (T)
class XmlArchive

auto archive = new XmlArchive;
auto serializer = new Serializer!(XmlArchive)(archive);

struct Foo
{
    void toData (Serializer!(XmlArchive) serializer, Serializer.Data key);
}

Foo is now locked to the XmlArchive. Or:

class Bar
{
    void toData (T) (Serializer!(T) serializer, Serializer.Data key);
}

toData cannot be virtual.

-- 
/Jacob Carlborg
March 25, 2013
On 2013-03-24 22:03, Jacob Carlborg wrote:

> std.serialization (orange) is now ready to be reviewed.

Just so there is no confusion. If it gets accepted I will replace tabs with spaces, fix the column limit and change all filenames to lowercase.

-- 
/Jacob Carlborg
March 27, 2013
On 2013-03-24 22:41, Andrej Mitrovic wrote:

> orange\xml\PhobosXml.d(2536): Error: switch case fallthrough - use
> 'goto case;' if intended

PhobosXml is a local copy of std.xml with a few small modifications. If accepted I'll make the changes to std.xml and remove PhobosXml.

-- 
/Jacob Carlborg
March 30, 2013
Hello Jacob,

These comments are based on looking into adding Protocol Buffer as an archive. First some details on the PB format.
https://developers.google.com/protocol-buffers/docs/overview

1) It is a binary format
2) Not all D types can be serialized
3) Serialization is done by message (struct) and not by primitives
4) It defines options which can affect (de)serializing.

I am looking at using Serializer to drive (de)serialization even if that meant just jamming it in there where Orange could only read PB data it has written. Keep in mind I'm not saying these are requirements or that I know what I'm talking about, only my thoughts.

My first thought was at a minimum I could just use a function which does the complete (de)serialization of the type. Which would be great since the pbcompiler I'm using/modifying already does this.

Because of the way custom serialization works, I'm stopped by point 3. I didn't realize that at first so I also looked at implementing an Archive. What I notice here is

* Information is lost, specifically the attributes (more important with UDA).
* I am required to implement conversions I have no implementation for.

This leaves me concluding that I'd need to implement my own Serializer, which seems to me I'm practically reimplementing most of Orange to use Orange with PB.

Does having Orange support things like PB make sense?

I think some work could be done for the Archive API as it doesn't feel like D2. Maybe we could look at locking down custom Archive/Serializer classes while the internals are worked out (would mean XML (de)serialization is available in Phobos).
March 30, 2013
On 2013-03-30 21:02, Jesse Phillips wrote:
> Hello Jacob,
>
> These comments are based on looking into adding Protocol Buffer as an
> archive. First some details on the PB format.
> https://developers.google.com/protocol-buffers/docs/overview
>
> 1) It is a binary format

That shouldn't be a problem. Preferably it should support some kind of identity map and be able to deserialize fields in any order.

> 2) Not all D types can be serialized

Any data format that supports some kind of key-value mapping should be able to serialize all D types. Although, possibly in a format that is not idiomatic for that data format. XML doesn't have any types and the XML archive can serialize any D type.

> 3) Serialization is done by message (struct) and not by primitives

I'm not sure I understand this.

> 4) It defines options which can affect (de)serializing.

While Orange doesn't support the options Protocol Buffer seems to use directly, it should be possible by customizing the serialization of a type. See:

https://github.com/jacob-carlborg/orange/wiki/Custom-Serialization
https://github.com/jacob-carlborg/orange/wiki/Non-Intrusive-Serialization

Alternatively, they may be useful enough to have direct support in the serializer.

> I am looking at using Serializer to drive (de)serialization even if that
> meant just jamming it in there where Orange could only read PB data it
> has written. Keep in mind I'm not saying these are requirements or that
> I know what I'm talking about, only my thoughts.

That should be possible. I've been working on a binary archive that tries to be as fast as possible, breaking rules left and right; it doesn't conform to the implicit contract between the serializer and archive, and so on.

> My first thought was at a minimum I could just use a function which does
> the complete (de)serialization of the type. Which would be great since
> the pbcompiler I'm using/modifying already does this.
>
> Because of the way custom serialization I'm stopped by point 3. I didn't
> realize that at first so I also looked at implementing an Archive. What
> I notice here is
>
> * Information is lost, specifically the attributes (more important with
> UDA).

Do you want UDAs passed to the archive for a given type or field? I don't know how easy that would be to implement. It would probably require a template method in the archive, which I would like to avoid, since it wouldn't be possible to use via an interface.

> * I am required to implement conversions I have no implementation for.

Just implement an empty method for any method you don't have use for. If it needs to return a value, you can most often return typeof(return).init.
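
A minimal sketch of that stubbing approach. The `Archive` interface below is a hypothetical, cut-down stand-in, and `NullArchive` is an invented name. Note that the `.init` value of a floating point type is NaN, not zero.

```d
import std.math : isNaN;

// Hypothetical, cut-down interface; the real Archive API is much larger.
interface Archive
{
    void archiveInt(int value, string key);
    int unarchiveInt(string key);
    double unarchiveDouble(string key);
    string unarchiveString(string key);
}

// Archive that supports nothing: every method is an empty stub.
class NullArchive : Archive
{
    void archiveInt(int value, string key) {}

    // typeof(return) is the declared return type of the enclosing function.
    int unarchiveInt(string key) { return typeof(return).init; }
    double unarchiveDouble(string key) { return typeof(return).init; }
    string unarchiveString(string key) { return typeof(return).init; }
}

void main()
{
    Archive archive = new NullArchive;
    archive.archiveInt(1, "ignored");

    assert(archive.unarchiveInt("k") == 0);
    assert(archive.unarchiveString("k") is null);
    assert(isNaN(archive.unarchiveDouble("k")));
}
```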

> This leaves me concluding that I'd need to implement my own Serializer,
> which seems to me I'm practically reimplementing most of Orange to use
> Orange with PB.

That doesn't sound good.

> Does having Orange support things like PB make sense?

I think so.

> I think some work could be done for the Archive API as it doesn't feel
> like D2.

It started for D1.

> Maybe we could look at locking down custom Archive/Serializer
> classes while the internals are worked out (would mean XML
> (de)serialization is available in Phobos).

-- 
/Jacob Carlborg
March 31, 2013
On Monday, 25 March 2013 at 08:53:32 UTC, Jacob Carlborg wrote:
> With templates:
>
> class Serializer (T)
> class XmlArchive
>
> auto archive = new XmlArchive;
> auto serializer = new Serializer!(XmlArchive)(archive);
>
> struct Foo
> {
>     void toData (Serializer!(XmlArchive) serializer, Serializer.Data key);
> }
>
> Foo is now locked to the XmlArchive. Or:
>
> class Bar
> {
>     void toData (T) (Serializer!(T) serializer, Serializer.Data key);
> }
>
> toData cannot be virtual.

http://dpaste.dzfl.pl/0f7d8219