Thread overview
CRTP + compile-time introspection + static ctors = WIN
Jan 15, 2021
H. S. Teoh
Jan 15, 2021
Adam D. Ruppe
Jan 16, 2021
Imperatorn
Jan 16, 2021
Mathias LANG
Jan 16, 2021
zjh
Jan 18, 2021
Jacob Carlborg
Jan 18, 2021
H. S. Teoh
Jan 23, 2021
Jacob Carlborg
January 15, 2021
Recently, I needed to extend a simple serialization system I wrote for one of my projects to handle polymorphic objects.  It's all data-only structs and classes, so no need for fancy heavyweight serialization libraries.

One way to do this was to add load() and save() methods in the base
class, and override them for every derived class.  However, this would
be too much boilerplate, and prone to mistakes (forget to override a
method, and derived class data would fail to be serialized).

Another solution is to use mixins to inject these methods into each derived class. But again, too much boilerplate, and prone to forgetting to include the mixin statement in the class.

Yesterday, I hit upon a nice solution: use CRTP (the Curiously Recurring Template Pattern) to inject these methods into each derived class:

	class Saveable(Derived, Base) : Base {
		static if (is(Base == Object)) {
			// Top-level virtual function
			void save() { ... }
		} else {
			// Derived class override
			override void save() { ... }
		}
	}

	class Base : Saveable!(Base, Object) { ...  }

	class Derived1 : Saveable!(Derived1, Base) { ... }

	class Derived2 : Saveable!(Derived2, Base) { ... }

Since the CRTP is right at the first line of the class declaration, it's hard to miss, and it's easy to notice when I forget to include it (as opposed to a mixin line buried somewhere in a potentially large class definition).

The Base parameter to Saveable lets us nicely inject overridable methods into the class hierarchy, and also to differentiate between top-level methods and derived class overrides.

Saveable.save() uses the template argument to introspect the derived class and generate code to serialize its fields. It includes code to generate a tag in the serialized output to identify what type it is.
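
As a rough illustration of the introspection step, here is a minimal sketch. It is not the code from my project: the `Sink` alias, the `saveAs` helper, the `Dog{field=value;}` output format, and the `Animal`/`Dog` classes are all assumptions made up for the example.

```d
import std.array : appender, Appender;
import std.conv : to;
import std.meta : AliasSeq;
import std.traits : BaseClassesTuple;

// Hypothetical output sink; the post does not show what save() writes to.
alias Sink = Appender!string;

// Writes a type tag for T, then every field of T and of its base classes.
void saveAs(T)(T obj, ref Sink sink)
{
    sink.put(T.stringof);
    sink.put("{");
    // .tupleof lists only the fields declared directly in a class,
    // so walk the base classes too (most-derived class first).
    foreach (C; AliasSeq!(T, BaseClassesTuple!T))
    {
        C part = obj; // implicit upcast
        foreach (i, field; part.tupleof)
        {
            sink.put(__traits(identifier, C.tupleof[i]));
            sink.put("=");
            sink.put(field.to!string);
            sink.put(";");
        }
    }
    sink.put("}");
}

class Saveable(Derived, Base) : Base
{
    static if (is(Base == Object))
    {
        // Top-level virtual function
        void save(ref Sink sink) { saveAs(cast(Derived) this, sink); }
    }
    else
    {
        // Derived class override
        override void save(ref Sink sink) { saveAs(cast(Derived) this, sink); }
    }
}

// Hypothetical example classes.
class Animal : Saveable!(Animal, Object) { string name; }
class Dog : Saveable!(Dog, Animal) { int age; }

unittest
{
    auto dog = new Dog;
    dog.name = "Rex";
    dog.age = 3;

    Animal a = dog; // save through a base-class reference
    auto sink = appender!string();
    a.save(sink);
    assert(sink.data == "Dog{age=3;name=Rex;}");
}
```

The unittest can be run with `dmd -unittest -main -run file.d`. Because save() is virtual, the call through the `Animal` reference still reaches the `Dog` instantiation, so the tag and the fields match the dynamic type.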

That takes care of the serialization half of the task.

For deserialization, there was the possibility of using Object.factory. However, the API is clunky, and there is a disconnect with how to read the fields back with the right types.

For this, static ctors come to the rescue. I expanded Saveable thus:

	alias Loader = Object function(InputFile);
	Loader[string] classLoaders;

	class Saveable(Derived, Base) : Base {
		static if (is(Base == Object)) {
			// Top-level virtual function
			void save() { ... }
		} else {
			// Derived class override
			override void save() { ... }
		}

		static this()
		{
			classLoaders[Derived.stringof] = (InputFile f) {
				auto result = new Derived;
				... // use introspection to read Derived's fields back
				return result;
			};
		}
	}

The magic here is that the static this() block is generated *once per instantiation* of Saveable, and it has full compile-time knowledge of the derived class. So the function literal can use compile-time introspection to generate the deserialization code.  Then this knowledge is translated to runtime by registering the function literal into a global table of loaders, keyed by the class name. (For simplicity, I used .stringof here; for larger-scale projects you probably want .mangleof instead.)
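
To illustrate why the choice of key matters, here is a small sketch with two unrelated classes that share a short name. The `Gui` and `Tree` wrappers are hypothetical; `fullyQualifiedName` from std.traits is shown as another possible key alongside `.mangleof`.

```d
import std.traits : fullyQualifiedName;

// Two unrelated classes that happen to share a short name (hypothetical).
struct Gui  { static class Node {} }
struct Tree { static class Node {} }

// The mangled and fully qualified names keep the enclosing scope,
// so they stay distinct and are safer keys for a loader table.
static assert(Gui.Node.mangleof != Tree.Node.mangleof);
static assert(fullyQualifiedName!(Gui.Node) != fullyQualifiedName!(Tree.Node));

unittest
{
    import std.stdio : writeln;

    // Print the three candidate keys side by side for comparison.
    writeln(Gui.Node.stringof, " / ", Tree.Node.stringof);
    writeln(Gui.Node.mangleof, " / ", Tree.Node.mangleof);
    writeln(fullyQualifiedName!(Gui.Node), " / ", fullyQualifiedName!(Tree.Node));
}
```

The format of `.stringof` is meant for diagnostics and is not guaranteed to be unique, so two loaders keyed by it could silently overwrite each other in a larger code base.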

And since static this() blocks are run at program startup and dynamic
library load time, this ensures that after program startup,
`classLoaders` has knowledge of all types the program will ever use.
So the deserialization code can simply look up the saved type tag in
`classLoaders`, and call the function pointer to reconstruct the object.
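
The lookup step might look roughly like the sketch below. The `InputFile` struct here is a made-up stand-in (a list of saved field values), the loader takes it by `ref` for that reason, and `load`, `Animal` and `Dog` are hypothetical names; save() is left out to keep the example short.

```d
import std.conv : to;
import std.exception : enforce;
import std.meta : AliasSeq;
import std.traits : BaseClassesTuple;

// Hypothetical stand-in for InputFile: the saved field values,
// in the order they were written.
struct InputFile
{
    string[] values;

    string next()
    {
        auto v = values[0];
        values = values[1 .. $];
        return v;
    }
}

alias Loader = Object function(ref InputFile);
Loader[string] classLoaders;

class Saveable(Derived, Base) : Base
{
    // save() omitted here; only the loader registration is shown.
    static this()
    {
        classLoaders[Derived.stringof] = function Object(ref InputFile f) {
            auto result = new Derived;
            // Restore Derived's own fields, then those of its base classes.
            foreach (C; AliasSeq!(Derived, BaseClassesTuple!Derived))
            {
                C part = result;
                foreach (i, _; part.tupleof)
                    part.tupleof[i] = f.next().to!(typeof(_));
            }
            return result;
        };
    }
}

// Looks up the saved type tag and calls the matching loader.
Object load(string tag, ref InputFile f)
{
    auto loader = tag in classLoaders;
    enforce(loader !is null, "unknown type tag: " ~ tag);
    return (*loader)(f);
}

// Hypothetical example classes.
class Animal : Saveable!(Animal, Object) { string name; }
class Dog : Saveable!(Dog, Animal) { int age; }

unittest
{
    auto f = InputFile(["3", "Rex"]); // Dog's field first, then Animal's
    auto dog = cast(Dog) load("Dog", f);
    assert(dog !is null);
    assert(dog.age == 3 && dog.name == "Rex");
}
```

Nothing in the unittest registers `Dog` explicitly: the entry in `classLoaders` comes from the static ctor of the `Saveable!(Dog, Animal)` instantiation, which runs before the test does.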

The result: to make any class serializable, you just replace:

	class MyClass : MyBase { ... }

with

	class MyClass : Saveable!(MyClass, MyBase) { ... }

and everything else is taken care of automatically. No need for mixins, no need for repetitious serialization boilerplate polluting every class, no need even for runtime TypeInfo.  This will support even class definitions loaded via dynamic libraries -- as long as you use Runtime.loadLibrary to ensure static ctors are run -- since the static ctors will inject any new class loaders into `classLoaders`, thus automatically "teaching" the deserialization code how to deserialize the corresponding classes.

CRTP + compile-time introspection + static ctors = WIN

D rocks!!


T

-- 
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next. -- (Stolen from the net)
January 15, 2021
On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:
> CRTP + compile-time introspection + static ctors = WIN

truth

I wrote a lil on this a while ago too http://dpldocs.info/this-week-in-d/Blog.Posted_2019_06_10.html#tip-of-the-week

and

http://dpldocs.info/this-week-in-d/Blog.Posted_2019_08_05.html#what-adam-is-working-on

are both on the same topic. I like using static ctors with mixin templates too, you can define your own private vars and get them init, my jni.d does that for bridging.
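
A minimal sketch of the mixin-template variant, for comparison. This is not taken from jni.d; the `registered` array, the `Registered` template and the `slot` variable are hypothetical names chosen for the example.

```d
// Hypothetical registry filled in at program startup.
string[] registered;

mixin template Registered()
{
    // Private state owned by each class that mixes this in.
    private static size_t slot;

    static this()
    {
        slot = registered.length;
        registered ~= typeof(this).stringof;
    }
}

class Foo { mixin Registered; }
class Bar { mixin Registered; }

unittest
{
    // Static ctors within a module run in lexical order.
    assert(registered == ["Foo", "Bar"]);
    assert(Foo.slot == 0 && Bar.slot == 1);
}
```

Each class that mixes in `Registered` gets its own `slot` variable and its own static ctor, so the per-class state is initialized before main() without any call from the class author.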

> D rocks!!

D is bigger than a rock. D boulders.
January 16, 2021
On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:
>
> And since static this() blocks are run at program startup and dynamic
> library load time, this ensures that after program startup,
> `classLoaders` has knowledge of all types the program will ever use.

Just don't use separate compilation or you're in for a lot of trouble.
https://issues.dlang.org/show_bug.cgi?id=20641
January 16, 2021
On Friday, 15 January 2021 at 18:34:12 UTC, Adam D. Ruppe wrote:
> On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:
>> CRTP + compile-time introspection + static ctors = WIN
>
> truth
>
> I wrote a lil on this a while ago too http://dpldocs.info/this-week-in-d/Blog.Posted_2019_06_10.html#tip-of-the-week
>
> and
>
> http://dpldocs.info/this-week-in-d/Blog.Posted_2019_08_05.html#what-adam-is-working-on
>
> are both on the same topic. I like using static ctors with mixin templates too, you can define your own private vars and get them init, my jni.d does that for bridging.
>
>> D rocks!!
>
> D is bigger than a rock. D boulders.

+ D Mars
January 16, 2021
On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:

Very Good.


January 18, 2021
On 2021-01-15 19:31, H. S. Teoh wrote:

> Yesterday, I hit upon a nice solution: use CRTP (the Curiously Recurring
> Template Pattern) to inject these methods into each derived class:
> 
> 	class Saveable(Derived, Base) : Base {
> 		static if (is(Base == Object)) {
> 			// Top-level virtual function
> 			void save() { ... }
> 		} else {
> 			// Derived class override
> 			override void save() { ... }
> 		}
> 	}
> 
> 	class Base : Saveable!(Base, Object) { ...  }
> 
> 	class Derived1 : Saveable!(Derived1, Base) { ... }
> 
> 	class Derived2 : Saveable!(Derived2, Base) { ... }
> 

That's an interesting idea. Although it's a bit intrusive since it requires changing what you're serializing.

In my serialization library Orange [1] I solved this by registering subclasses that are going to be serialized through a base class reference [2]. If they're not serialized through a base class reference, no registration is required.

[1] https://github.com/jacob-carlborg/orange
[2] https://github.com/jacob-carlborg/orange/blob/1c4b1ab989fc36e6fae91131ba6951acf074f383/tests/BaseClass.d#L73

-- 
/Jacob Carlborg
January 18, 2021
On Mon, Jan 18, 2021 at 05:49:03PM +0100, Jacob Carlborg via Digitalmars-d wrote:
> On 2021-01-15 19:31, H. S. Teoh wrote:
> 
> > Yesterday, I hit upon a nice solution: use CRTP (the Curiously Recurring Template Pattern) to inject these methods into each derived class:
> > 
> > 	class Saveable(Derived, Base) : Base {
> > 		static if (is(Base == Object)) {
> > 			// Top-level virtual function
> > 			void save() { ... }
> > 		} else {
> > 			// Derived class override
> > 			override void save() { ... }
> > 		}
> > 	}
> > 
> > 	class Base : Saveable!(Base, Object) { ...  }
> > 
> > 	class Derived1 : Saveable!(Derived1, Base) { ... }
> > 
> > 	class Derived2 : Saveable!(Derived2, Base) { ... }
> > 
> 
> That's an interesting idea. Although it's a bit intrusive since it requires changing what you're serializing.

True, but I was looking for a maximally-automated, minimal-boilerplate solution.


> In my serialization library Orange [1] I solved this by registering subclasses that are going to be serialized through a base class reference [2]. If they're not serialized through a base class reference, no registration is required.
> 
> [1] https://github.com/jacob-carlborg/orange
> [2] https://github.com/jacob-carlborg/orange/blob/1c4b1ab989fc36e6fae91131ba6951acf074f383/tests/BaseClass.d#L73
[...]

I also considered this approach, but rejected it because forgetting to register a derived class would result in incorrect serialization. I felt that was too fragile for my needs.

My current serialization need is quite specific in scope: I have a bunch of arrays, AA's, structs, and classes, all of which are data-only (i.e., public data fields only, no special semantics via setters/getters).  A small number of types may require special serialization/deserialization treatment; for this the serialization code detects the existence of custom .save/.load methods.  Other than that, serialization is automated from the root object.  Since root objects are very general, and the exact combination of contained types may change as development goes on, a solution that does not require explicit registration of types is ideal.
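
The detection of custom methods could be done along these lines. This is only a sketch under assumptions: the `Sink` alias, the `hasCustomSave` check, the `serialize` function, the output format, and the three example structs are all invented here, and only the save side is shown.

```d
import std.array : appender, Appender;
import std.conv : to;

alias Sink = Appender!string; // hypothetical output sink

// True when T provides its own save(ref Sink) method.
enum hasCustomSave(T) = __traits(compiles, (ref T t, ref Sink s) { t.save(s); });

void serialize(T)(ref T value, ref Sink sink)
{
    static if (hasCustomSave!T)
    {
        value.save(sink); // the type asked for special treatment
    }
    else static if (is(T == struct))
    {
        // Data-only struct: recurse into every field automatically.
        sink.put(T.stringof);
        sink.put("(");
        foreach (i, _; value.tupleof)
        {
            serialize(value.tupleof[i], sink);
            sink.put(";");
        }
        sink.put(")");
    }
    else
    {
        sink.put(value.to!string); // scalars, strings, and other simple values
    }
}

struct Point { int x, y; }

struct Timestamp
{
    long ticks;

    // Custom hook picked up by hasCustomSave.
    void save(ref Sink sink) const
    {
        sink.put("T");
        sink.put(ticks.to!string);
    }
}

struct Event { Point at; Timestamp when; }

unittest
{
    auto ev = Event(Point(1, 2), Timestamp(42));
    auto sink = appender!string();
    serialize(ev, sink);
    assert(sink.data == "Event(Point(1;2;);T42;)");
}
```

Only `Timestamp` opts out of the automatic path; `Event` and `Point` need no code of their own, which is the property that matters when the set of contained types keeps changing.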

Using my solution above, the only thing I need to check is that the derived class derives from Saveable, which can be done the first time I declare it. It's highly visible, so accidental omission can be immediately noticed.  Nothing else needs to be done, as the rest of the mechanisms are fully automated from that point on.


T

-- 
In theory, there is no difference between theory and practice.
January 23, 2021
On 2021-01-18 18:15, H. S. Teoh wrote:

> I also considered this approach, but rejected it because forgetting to
> register a derived class would result in incorrect serialization. I felt
> that was too fragile for my needs.

Orange will catch this at runtime and throw an exception.
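
A generic sketch of that kind of runtime check, to make the trade-off concrete. This is not Orange's API or implementation; `register`, `saveViaBase`, the `savers` table and the `Shape` classes are hypothetical.

```d
import std.exception : assertThrown, enforce;

alias Saver = string function(Object);
Saver[string] savers; // keyed by the fully qualified class name

// Explicit registration step (hypothetical, not Orange's API).
void register(T : Object)()
{
    savers[typeid(T).name] = function string(Object o) {
        return T.stringof; // a real saver would also write o's fields
    };
}

// Saves through a base-class reference, using the dynamic type as the key.
string saveViaBase(Object obj)
{
    auto saver = typeid(obj).name in savers;
    enforce(saver !is null, "unregistered subclass: " ~ typeid(obj).name);
    return (*saver)(obj);
}

class Shape {}
class Circle : Shape {}
class Square : Shape {}

unittest
{
    register!Circle();

    Shape ok = new Circle;
    assert(saveViaBase(ok) == "Circle");

    Shape forgotten = new Square; // never registered
    assertThrown(saveViaBase(forgotten));
}
```

With a check like this, a forgotten registration fails loudly the first time such an object is saved rather than producing wrong output, though the failure still shows up at runtime rather than at the class declaration.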

> My current serialization need is quite specific in scope: I have a bunch
> of arrays, AA's, structs, and classes, all of which are data-only (i.e.,
> public data fields only, no special semantics via setters/getters).  A
> small number of types may require special serialization/deserialization
> treatment; for this the serialization code detects the existence of
> custom .save/.load methods.  Other than that, serialization is automated
> from the root object.  Since root objects are very general, and as
> development goes on the exact combination of contained types may change,
> so a solution that does not require explicit registration of types is
> ideal.
> 
> Using my solution above, the only thing I need to check is that the
> derived class derives from Saveable, which can be done the first time I
> declare it. It's highly visible, so accidental omission can be
> immediately noticed.  Nothing else needs to be done, as the rest of the
> mechanisms are fully automated from that point on.

I think both of our solutions are equally automatic. You require inheriting from a specific class, mine requires registering the class.

-- 
/Jacob Carlborg