Fastest JSON parser in the world is a D project
October 14, 2015
JSON parsing in D has come a long way, especially in terms of efficiency. A popular benchmark, forked by well-known D contributors like Martin Nowak and Sönke Ludwig, shows this.

The test is pretty simple: parse a JSON object containing an array of 1_000_000 3D coordinates in the range [0..1) and average them.
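
For readers who have not seen the benchmark, the task can be sketched roughly as follows. This is a minimal sketch using Phobos' std.json on a two-element stand-in for the real input, not the benchmark's actual code. The helper names `toDouble` and `averageCoordinates` are made up for illustration, and the enum `JSONType` assumes a reasonably recent compiler (older releases spelled it `JSON_TYPE`).

```d
import std.json : JSONType, JSONValue, parseJSON;
import std.stdio : writefln;

/// Hypothetical helper: converts a JSON number (integer or floating point) to double.
double toDouble(JSONValue v)
{
    switch (v.type)
    {
        case JSONType.integer:  return cast(double) v.integer;
        case JSONType.uinteger: return cast(double) v.uinteger;
        default:                return v.floating; // throws JSONException if v is not a number
    }
}

/// Hypothetical helper: parses the document and returns the mean x, y and z.
/// An empty "coordinates" array yields NaN for all three values.
double[3] averageCoordinates(string input)
{
    auto root = parseJSON(input);
    auto coordinates = root["coordinates"].array;

    double x = 0, y = 0, z = 0;
    foreach (c; coordinates)
    {
        x += toDouble(c["x"]);
        y += toDouble(c["y"]);
        z += toDouble(c["z"]);
    }

    immutable n = cast(double) coordinates.length;
    return [x / n, y / n, z / n];
}

void main()
{
    // Tiny stand-in for the benchmark's 1_000_000-element input file.
    enum input = `{"coordinates":[{"x":0.25,"y":0.5,"z":0.75},{"x":0.75,"y":0.5,"z":0.25}]}`;

    auto mean = averageCoordinates(input);
    writefln("%.2f %.2f %.2f", mean[0], mean[1], mean[2]); // prints "0.50 0.50 0.50"
}
```

std.json builds a full tree of JSONValue nodes before anything is averaged, which is why a DOM-style parser is expected to need far more memory on this test than a pull parser.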

The performance of std.json in parsing those was still horrible in the DMD 2.066 days*:

DMD     : 41.44s,  934.9Mb
GDC     : 29.64s,  929.7Mb
Python  : 12.30s, 1410.2Mb
Ruby    : 13.80s, 2101.2Mb

Then with 2.067 std.json got a major 3x speed improvement and rivaled the popular dynamic languages Ruby and Python:

DMD     : 13.02s, 1324.2Mb

In the meantime, several other D JSON libraries appeared with varying focus on performance or API:

Medea         : 56.75s, 1753.6Mb  (GDC)
libdjson      : 24.47s, 1060.7Mb  (GDC)
stdx.data.json:  2.76s,  207.1Mb  (LDC)

Yep, that's right. stdx.data.json's pull parser finally beats the dynamic languages with native efficiency. (I used the default options here that provide you with an Exception and line number on errors.)

A few days ago I decided to get some practical use out of my pet project 'fast' by implementing a JSON parser myself, one that could rival even RapidJSON, the fastest JSON parser up to then. The result can be seen in the benchmark results right now:

https://github.com/kostya/benchmarks#json

fast:	   0.34s, 226.7Mb (GDC)
RapidJSON: 0.79s, 687.1Mb (GCC)

(* Timings from my computer, Haswell CPU, Linux amd64.)

-- 
Marco

October 14, 2015
fast.json usage:

UTF-8 and JSON validation of used portions by default:

    auto json = parseJSONFile("data.json");

Known good file input:

    auto json = parseTrustedJSONFile("data.json");
    auto json = parseTrustedJSON(`{"x":123}`);

Work with a single key from an object:

    json.singleKey!"someKey"
    json.someKey

Iteration:

    foreach (key; json.byKey)  // object by key
    foreach (idx; json)        // array by index

Remap member names:

    @JsonRemap(["clazz", "class"])
    struct S { string clazz; }

    @JsonRemap(["clazz", "class"])
    enum E { clazz }

Example:

    double x = 0, y = 0, z = 0;
    auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, "y": 2, "z": 3 }, … ] }`);

    foreach (idx; json.coordinates)
    {
        // Provide one function for each key you are interested in
        json.keySwitch!("x", "y", "z")(
                { x += json.read!double; },
                { y += json.read!double; },
                { z += json.read!double; }
            );
    }

Features:
  - Loads double values in compliance with IEEE round-to-nearest
    (no precision loss in serialization->deserialization round trips)
  - UTF-8 validation of non-string input (file, ubyte[])
  - Currently fastest JSON parser money can buy
  - Reads strings, enums, integral types, double, bool, POD
    structs consisting of those and pointers to such structs

Shortcomings:
  - Rejects numbers with exponents of huge magnitude (>=10^28)
  - Only works on Posix x86/amd64 systems
  - No write capabilities
  - Data size limited by available contiguous virtual memory

-- 
Marco

October 14, 2015
On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
>     auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, "y": 2, "z": 3 }, … ] }`);

I assume parseTrustedJSON is not validating? Did you use it in the benchmark? And were the competitors non-validating as well?
October 14, 2015
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
> https://github.com/kostya/benchmarks#json

I can't find fast.json here. Where is it?
October 14, 2015
On Wed, Oct 14, 2015 at 9:35 AM, Marco Leise via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:

> Features:
>   - Loads double values in compliance with IEEE round-to-nearest
>     (no precision loss in serialization->deserialization round trips)
>   - UTF-8 validation of non-string input (file, ubyte[])
>   - Currently fastest JSON parser money can buy
>   - Reads strings, enums, integral types, double, bool, POD
>     structs consisting of those and pointers to such structs
>

Does this version handle real world JSON?

I keep getting problems with vibe and JSON because web browsers will automatically make a "1" into a 1, which then causes exceptions in vibe.

Does yours do lossless conversions automatically?


October 15, 2015
On Wed, 14 Oct 2015 07:55:18 +0000, Idan Arye <GenericNPC@gmail.com> wrote:

> On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
> >     auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, "y": 2, "z": 3 }, … ] }`);
> 
> I assume parseTrustedJSON is not validating? Did you use it in the benchmark? And were the competitors non-validating as well?

That is correct. For the benchmark parseJSONFile was used though, which validates UTF-8 and JSON in the used portions. That probably renders your third question superfluous. I wouldn't know anyway, but I am inclined to think they all validate the entire JSON and some may skip UTF-8 validation, which is a low-cost operation in this ASCII file anyway.

-- 
Marco

October 15, 2015
On Wed, 14 Oct 2015 10:22:37 +0200, Rory McGuire via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:

> Does this version handle real world JSON?
> 
> I keep getting problems with vibe and JSON because web browsers will automatically make a "1" into a 1, which then causes exceptions in vibe.
> 
> Does yours do lossless conversions automatically?

No, I don't read numbers as strings. Could the client JavaScript be fixed? I fail to see why the conversion would happen automatically when the code could explicitly check for strings before doing math with the value "1". What am I missing?
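
On the receiving side, a tolerant read that accepts either form can be sketched as below. This uses Phobos' std.json, not fast.json or vibe.d, and `readLenientLong` is a hypothetical helper name chosen for illustration. It assumes a compiler recent enough to provide the `JSONType` enum.

```d
import std.conv : to;
import std.json : JSONType, JSONValue, parseJSON;

/// Hypothetical helper: returns an integer whether the JSON holds 1 or "1".
long readLenientLong(JSONValue v)
{
    switch (v.type)
    {
        case JSONType.integer:  return v.integer;
        case JSONType.uinteger: return v.uinteger.to!long; // throws if the value exceeds long.max
        case JSONType.string:   return v.str.to!long;      // throws ConvException on non-numeric text
        default:
            throw new Exception("expected a number or a numeric string");
    }
}

void main()
{
    auto doc = parseJSON(`{"a": 1, "b": "1"}`);
    assert(readLenientLong(doc["a"]) == 1);
    assert(readLenientLong(doc["b"]) == 1);
}
```

Doing the conversion in one explicit helper keeps the leniency visible at the call site instead of hiding it inside the parser.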

-- 
Marco

October 15, 2015
On Wed, 14 Oct 2015 08:19:52 +0000, Per Nordlöw <per.nordlow@gmail.com> wrote:

> On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
> > https://github.com/kostya/benchmarks#json
> 
> I can't find fast.json here. Where is it?

»»» D Gdc Fast	0.34	226.7 «««
    C++ Rapid	0.79	687.1

Granted, if he had written "D fast.json" it would have been easier to identify.

-- 
Marco

October 15, 2015
On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
> fast:	   0.34s, 226.7Mb (GDC)
> RapidJSON: 0.79s, 687.1Mb (GCC)
>
> (* Timings from my computer, Haswell CPU, Linux amd64.)

Where's the code?
October 15, 2015

Gary Willoughby via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote on Thu, Oct 15, 2015 at 10:08:
> On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
>> fast:	   0.34s, 226.7Mb (GDC)
>> RapidJSON: 0.79s, 687.1Mb (GCC)
>> 
>> (* Timings from my computer, Haswell CPU, Linux amd64.)
> 
> Where's the code?

code.dlang.org

