June 04, 2022

The Asdf JSON library keeps beating my attempts to replace it with Mir Ion [1].

Both are Mir projects. The first was created for Tamedia and the second for Symmetry. Both are widely used in production, and Asdf still receives the must-have patches that make it more compatible with Mir Ion. The only purpose of those patches is to make replacing Asdf with its Ion alternative faster. Both libraries share the mir.serde UDAs and the Mir number parsing [2].

All of a sudden, yesterday Asdf became the world's third-fastest JSON implementation [3] and the second-fastest DOM-based one. The only library that builds a JSON DOM faster is simdjson:

  • 1717 MB/s JSON -> simdjson On-Demand deserialization
  • 833 MB/s JSON -> simdjson DOM writing -> DOM deserialization
  • 788 MB/s JSON -> Asdf DOM writing -> DOM deserialization
  • 600 MB/s JSON -> Amazon's Ion DOM writing -> DOM deserialization

On the other hand, Asdf requires 2.9 times less DOM memory than simdjson, and Mir Ion requires 11 times less [4].

Asdf is too good to be deprecated. We have to support it as a stable library.

[1] Mir Ion adds more serialization features. It supports the text and binary Amazon Ion formats, MsgPack, YAML, and the Bloomberg Private API under the same reflection-based de/serialization.

[2] It is not easy to parse decimals with the round-half-to-even rule required by IEEE 754, and it is much harder to make correct parsing fast, because it involves big integer arithmetic. The recent Mir Algorithm release finally got multiplication and division for its scope-allocated BigInt implementation. It reuses the Phobos multiplication kernels and reworks the caching and blocking logic. Mir BigInt gives up to a 35% speedup for RSA-like computations and about 100-200% for smaller numbers. For a really fast general-purpose big integer implementation one may want to use gmp-d, which is up to 10 times faster.
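To illustrate why correctly rounded parsing needs big integers, here is a minimal Python sketch. It is not the Mir algorithm, and the function name nearest_double is made up for this example. It assumes a finite decimal literal in the normal double range (no subnormals, overflow, infinities, or NaN) and relies on Python's arbitrary-precision integers to keep every digit of the input until the final rounding step.

```python
import math
from fractions import Fraction


def nearest_double(text: str) -> float:
    """Round a decimal literal to the nearest double, ties to even.

    Illustrative only: handles finite values in the normal double range.
    """
    value = Fraction(text)  # exact rational built on big integers
    if value == 0:
        return 0.0
    sign = -1.0 if value < 0 else 1.0
    value = abs(value)

    # Estimate a binary exponent e so that value / 2**e is close to
    # the interval [2**52, 2**53), then correct the estimate exactly.
    e = value.numerator.bit_length() - value.denominator.bit_length() - 53
    scaled = value / Fraction(2) ** e
    while scaled >= 2**53:
        scaled /= 2
        e += 1
    while scaled < 2**52:
        scaled *= 2
        e -= 1

    # Integer part is the candidate 53-bit significand; the remainder
    # decides the rounding direction without any precision loss.
    q, r = divmod(scaled.numerator, scaled.denominator)
    twice = 2 * r
    if twice > scaled.denominator:
        q += 1                      # above the halfway point: round up
    elif twice == scaled.denominator and q & 1:
        q += 1                      # exact tie: round to the even significand
    return sign * math.ldexp(q, e)
```

The literal 9007199254740993 (2^53 + 1) is an exact tie and rounds down to 9007199254740992, while 9007199254740993.0000000000000000000001 lies just above the tie and rounds up to 9007199254740994. A parser that drops the trailing digits cannot tell the two cases apart, which is why the exact path needs multi-word arithmetic.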

[3] According to kostya/benchmarks. We don't count gason and DAW JSON Link libraries because their number parsing is approximate.

[4] The Ion format uses symbol tables to reduce memory usage.
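The idea behind a symbol table can be sketched in a few lines of Python. This is not the Ion binary layout or the Mir Ion API, and the names SymbolTable, encode, and decode are hypothetical. It only shows the principle: each distinct field name is stored once, and records refer to it by a small integer ID.

```python
class SymbolTable:
    """Maps each distinct field name to a small integer ID (illustrative)."""

    def __init__(self):
        self._ids = {}      # name -> ID
        self.names = []     # ID -> name

    def intern(self, name: str) -> int:
        # Reuse the existing ID, or assign the next free one.
        sid = self._ids.get(name)
        if sid is None:
            sid = len(self.names)
            self._ids[name] = sid
            self.names.append(name)
        return sid


def encode(records, table):
    """Replace field names with symbol IDs in a list of dict records."""
    return [[(table.intern(k), v) for k, v in rec.items()] for rec in records]


def decode(encoded, table):
    """Restore the original dict records from symbol IDs."""
    return [{table.names[sid]: v for sid, v in rec} for rec in encoded]
```

For a thousand records that all have the same two fields, the table holds just two strings, and each record carries two integer IDs instead of two repeated keys.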

Kind regards,
Ilia Ki