April 11, 2021

Hi all,

As you may know, Mir provides two JSON libraries: asdf and the work-in-progress mir-ion. The latter is based on Amazon's Ion dual format.

Both libraries provide a direct de/serialization API that doesn't need a mutable JSON value. This works well in almost all cases, except when we want to work with a JSON tree directly and change it dynamically, or when a JSON value may have different types that we want to check at runtime.

Looking into JSON value implementations, we can find that they are tagged nullable self-referencing algebraic types.

  • algebraic - type can store a value of a type from a fixed typeset.
  • self-referencing - algebraic typeset types can refer to this algebraic type
  • nullable - typeof(null) type is supported, and it is the default
  • tagged - an enumeration (enum) covers the whole typeset, and the value has a property that returns the enumeration member corresponding to the underlying type.
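
To make the four properties concrete, here is a minimal hand-written sketch in plain D. It is only an illustration and not how Mir implements its algebraics; the name HandJson and its helper functions are hypothetical, and the typeset is cut down to three types.

```d
/// Hypothetical hand-written JSON-like value, for illustration only.
struct HandJson
{
    /// tagged: one enum member per type in the typeset
    enum Kind { null_, boolean, integer, array }

    /// nullable: Kind.null_ is the first member, so it is the default
    Kind kind_;

    /// algebraic: exactly one of these members is active at a time
    union
    {
        bool boolean_;
        long integer_;
        HandJson[] array_; // self-referencing: the typeset mentions HandJson itself
    }

    static HandJson fromBool(bool v)
    {
        HandJson r;
        r.kind_ = Kind.boolean;
        r.boolean_ = v;
        return r;
    }

    static HandJson fromArray(HandJson[] v)
    {
        HandJson r;
        r.kind_ = Kind.array;
        r.array_ = v;
        return r;
    }

    Kind kind() const { return kind_; }
    bool isNull() const { return kind_ == Kind.null_; }

    bool getBool() const
    {
        assert(kind_ == Kind.boolean);
        return boolean_;
    }

    const(HandJson)[] getArray() const
    {
        assert(kind_ == Kind.array);
        return array_;
    }
}

unittest
{
    HandJson value;
    assert(value.isNull); // default state

    value = HandJson.fromBool(true);
    assert(value.kind() == HandJson.Kind.boolean);
    assert(value.getBool());

    auto list = HandJson.fromArray([value, HandJson.init]);
    assert(list.getArray().length == 2);
    assert(list.getArray()[1].isNull());
}
```

Writing this by hand for every typeset is tedious and easy to get wrong, which is the motivation for generating it from a declaration.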

We can define such algebraics with mir.algebraic:

import mir.algebraic: TaggedVariant, This;

union JsonAlgebraicUnion
{
    typeof(null) null_;
    bool boolean;
    long integer;
    double float_;
    immutable(char)[] string;
    /// Self alias in array
    This[] array;
    /// Self alias in associative
    This[immutable(char)[]] object;
}

alias JsonAlgebraic = TaggedVariant!JsonAlgebraicUnion;

unittest
{
    JsonAlgebraic value;

    JsonAlgebraic[string] object;

    // Default
    assert(value.isNull);
    assert(value.kind == JsonAlgebraic.Kind.null_);

    // Boolean
    value = true;
    object["key"] = value;
    assert(!value.isNull);
    assert(value == true);
    assert(value.kind == JsonAlgebraic.Kind.boolean);
    assert(value.get!bool == true);
    assert(value.get!(JsonAlgebraic.Kind.boolean) == true);

...

We added serialization of mir.algebraic to both JSON libraries a few months ago. It is effortless to implement: send a serialization lambda to a visitor. The typeset isn't limited to JSON-like types (bool, string, double, etc.): if all types of the typeset are serializable, then the algebraic is serializable as well.
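
A minimal sketch of what this looks like with asdf, assuming its serializeToJson function and compact output; the IdOrName alias is hypothetical and exists only for this example:

```d
import asdf : serializeToJson;
import mir.algebraic : Variant;

// Hypothetical alias used only for this example.
alias IdOrName = Variant!(long, string);

unittest
{
    IdOrName id = 42L;
    IdOrName name = "mir";

    // The value is serialized as whatever type it currently holds.
    assert(id.serializeToJson == `42`);
    assert(name.serializeToJson == `"mir"`);
}
```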

Deserialization is more complicated. We need to define a rule for how a JSON type set is deserialized to an algebraic typeset. Moreover, we don't need to provide a single, unique JsonAlgebraic type.

Instead, we provide a common JSON algebraic alias mir.algebraic_alias.json and support user-provided algebraic aliases as well.

For example, user-provided types can be:

import mir.algebraic: Variant, Nullable, This;

struct Color { ubyte a, r, g, b; }

alias JsonValue0 = Variant!(string, Color[], This[string]);
alias JsonValue1 = Nullable!(This[], long);

Asdf matches JSON types according to the following rules:

  • typeof(null) can handle JSON null
  • bool can handle JSON true and false
  • string can handle JSON strings
  • double can handle JSON numbers
  • long can handle JSON integer numbers and takes priority over double for them
  • T[] can handle JSON arrays, where T is a deserializable type, including the algebraic itself
  • StringMap!T and T[string] can handle JSON objects, where T is a deserializable type, including the algebraic type itself. StringMap is an ordered string-value associative array with fast search operations. It takes priority over built-in associative arrays.
  • Other types of algebraic typeset aren't used for deserialization.
  • If no type can handle the current JSON value, then an exception is thrown.
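
The rules above can be sketched with asdf's deserialize as follows. This is an illustration under stated assumptions, not a test taken from the library: the Number alias is hypothetical, and the example assumes _is, get and isNull from mir.algebraic behave as described in its documentation.

```d
import asdf : deserialize;
import mir.algebraic : Nullable, This, Variant;

// Hypothetical alias used only for this example.
alias Number = Variant!(long, double);
// User-provided alias from the example above.
alias JsonValue1 = Nullable!(This[], long);

unittest
{
    // long takes priority over double for JSON integer numbers
    auto i = `1`.deserialize!Number;
    assert(i._is!long);
    assert(i.get!long == 1);

    // other JSON numbers are handled by double
    auto f = `1.5`.deserialize!Number;
    assert(f.get!double == 1.5);

    // T[] handles JSON arrays, including arrays of the algebraic itself
    auto arr = `[1, null, [2]]`.deserialize!JsonValue1;
    auto items = arr.get!(JsonValue1[]);
    assert(items.length == 3);
    assert(items[0].get!long == 1);
    assert(items[1].isNull);
}
```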

User APIs will stay consistent even if users define their own, different JSON algebraic aliases. Mir algebraic types:

  • are order-independent: Variant!(A, B) is the same type as Variant!(B, A)
  • can be constructed from their algebraic subset.
  • can get their algebraic subset.
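
A short sketch of these three properties. The aliases AB, BA and Wide are hypothetical, and retrieving a subset through get!AB is an assumption about the accessor's form rather than a confirmed signature:

```d
import mir.algebraic : Variant;

// Hypothetical aliases used only for this example.
alias AB = Variant!(long, string);
alias BA = Variant!(string, long);
alias Wide = Variant!(long, string, double);

// order-independent: both aliases name the same type
static assert(is(AB == BA));

unittest
{
    AB narrow = "text";

    // constructed from an algebraic subset
    Wide wide = narrow;
    assert(wide.get!string == "text");

    // getting an algebraic subset back (assumed to be spelled get!AB)
    AB back = wide.get!AB;
    assert(back.get!string == "text");
}
```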

mir-ion support for algebraic deserialization will be added later and extended with Ion types, including Blob, Clob, Timestamp, Decimal, and 4-byte floats.

Kind regards,
Ilya


This work has been sponsored by Symmetry Investments and Kaleidic Associates.