October 14, 2015 Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
JSON parsing in D has come a long way, especially from the efficiency angle, as shown by a popular benchmark that has been forked by well-known D contributors like Martin Nowak and Sönke Ludwig. The test is pretty simple: parse a JSON object containing an array of 1_000_000 3D coordinates in the range [0..1) and average them.

The performance of std.json in parsing those was still horrible in the DMD 2.066 days*:

  DMD    : 41.44s,  934.9Mb
  Gdc    : 29.64s,  929.7Mb
  Python : 12.30s, 1410.2Mb
  Ruby   : 13.80s, 2101.2Mb

Then with 2.067 std.json got a major 3x speed improvement and rivaled the popular dynamic languages Ruby and Python:

  DMD    : 13.02s, 1324.2Mb

In the meantime several other D JSON libraries appeared with varying focus on performance or API:

  Medea         : 56.75s, 1753.6Mb  (GDC)
  libdjson      : 24.47s, 1060.7Mb  (GDC)
  stdx.data.json:  2.76s,  207.1Mb  (LDC)

Yep, that's right. stdx.data.json's pull parser finally beats the dynamic languages with native efficiency. (I used the default options here, which provide you with an Exception and line number on errors.)

A few days ago I decided to get some practical use out of my pet project 'fast' by implementing a JSON parser myself, one that could rival even RapidJSON, the fastest JSON parser until then. The result can be seen in the benchmark results right now:

https://github.com/kostya/benchmarks#json

  fast:      0.34s, 226.7Mb (GDC)
  RapidJSON: 0.79s, 687.1Mb (GCC)

(* Timings from my computer, Haswell CPU, Linux amd64.)

-- Marco
|
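For readers unfamiliar with the benchmark task described above, here is a minimal sketch of it using Phobos' std.json. This is not the benchmark's actual source; the names `Vec3`, `toDouble` and `averageCoordinates` are made up for illustration, and the code assumes a reasonably recent compiler (one that provides `JSONType`).

```d
import std.json : JSONType, JSONValue, parseJSON;

// Hypothetical result type for this sketch.
struct Vec3 { double x = 0, y = 0, z = 0; }

// std.json stores "1" as an integer and "1.5" as a float,
// so convert either representation to double explicitly.
double toDouble(JSONValue v)
{
    switch (v.type)
    {
        case JSONType.integer:  return cast(double) v.integer;
        case JSONType.uinteger: return cast(double) v.uinteger;
        default:                return v.floating; // throws JSONException if not a float
    }
}

// Parse {"coordinates": [{"x":..,"y":..,"z":..}, ...]} and average each axis.
// An empty array yields NaN components (0.0 / 0).
Vec3 averageCoordinates(string text)
{
    auto coords = parseJSON(text)["coordinates"].array;
    Vec3 sum;
    foreach (c; coords)
    {
        sum.x += toDouble(c["x"]);
        sum.y += toDouble(c["y"]);
        sum.z += toDouble(c["z"]);
    }
    immutable n = coords.length;
    return Vec3(sum.x / n, sum.y / n, sum.z / n);
}
```

This builds the whole document tree in memory before averaging, which is the kind of overhead a pull parser avoids.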
October 14, 2015 Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marco Leise | fast.json usage:

UTF-8 and JSON validation of used portions by default:

  auto json = parseJSONFile("data.json");

Known good file input:

  auto json = parseTrustedJSONFile("data.json");
  auto json = parseTrustedJSON(`{"x":123}`);

Work with a single key from an object:

  json.singleKey!"someKey"
  json.someKey

Iteration:

  foreach (key; json.byKey)  // object by key
  foreach (idx; json)        // array by index

Remap member names:

  @JsonRemap(["clazz", "class"])
  struct S { string clazz; }

  @JsonRemap(["clazz", "class"])
  enum E { clazz; }

Example:

  double x = 0, y = 0, z = 0;
  auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, "y": 2, "z": 3 }, … ] }`);

  foreach (idx; json.coordinates)
  {
      // Provide one function for each key you are interested in
      json.keySwitch!("x", "y", "z")(
          { x += json.read!double; },
          { y += json.read!double; },
          { z += json.read!double; }
      );
  }

Features:
- Loads double values in compliance with IEEE round-to-nearest (no precision loss in serialization->deserialization round trips)
- UTF-8 validation of non-string input (file, ubyte[])
- Currently fastest JSON parser money can buy
- Reads strings, enums, integral types, double, bool, POD structs consisting of those and pointers to such structs

Shortcomings:
- Rejects numbers with exponents of huge magnitude (>=10^28)
- Only works on Posix x86/amd64 systems
- No write capabilities
- Data size limited by available contiguous virtual memory

-- Marco
|
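The feature list above mentions UTF-8 validation of raw input such as ubyte[]. As a rough illustration of what that check means, and not of how fast.json implements it, the same test can be written with Phobos' std.utf.validate. The helper name `isValidUtf8` is an assumption made for this sketch.

```d
import std.utf : UTFException, validate;

// Returns true if the raw bytes form well-formed UTF-8.
// Hypothetical helper; it only wraps the Phobos check.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        // Reinterpret the bytes as UTF-8 code units; validate throws on bad input.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}
```

Input that is already typed as a D string is normally assumed to be valid UTF-8, which would explain why the check is listed for files and ubyte[] only.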
October 14, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marco Leise | On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
> auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, "y": 2, "z": 3 }, … ] }`);
I assume parseTrustedJSON is not validating? Did you use it in the benchmark? And were the competitors non-validating as well?
|
October 14, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marco Leise | On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
> https://github.com/kostya/benchmarks#json
I can't find fast.json here. Where is it?
|
October 14, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marco Leise | On Wed, Oct 14, 2015 at 9:35 AM, Marco Leise via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> Features:
> - Loads double values in compliance with IEEE round-to-nearest
>   (no precision loss in serialization->deserialization round trips)
> - UTF-8 validation of non-string input (file, ubyte[])
> - Currently fastest JSON parser money can buy
> - Reads strings, enums, integral types, double, bool, POD
>   structs consisting of those and pointers to such structs

Does this version handle real-world JSON?

I keep getting problems with vibe and JSON because web browsers will automatically make a "1" into a 1, which then causes exceptions in vibe.

Does yours do lossless conversions automatically?
|
October 15, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Idan Arye | On Wed, 14 Oct 2015 07:55:18 +0000, Idan Arye <GenericNPC@gmail.com> wrote:
> On Wednesday, 14 October 2015 at 07:35:49 UTC, Marco Leise wrote:
> > auto json = parseTrustedJSON(`{ "coordinates": [ { "x": 1, "y": 2, "z": 3 }, … ] }`);
>
> I assume parseTrustedJSON is not validating? Did you use it in the benchmark? And were the competitors non-validating as well?

That is correct. For the benchmark, parseJSONFile was used though, which validates UTF-8 and JSON in the used portions. That probably renders your third question superfluous. I wouldn't know anyway, but I am inclined to think they all validate the entire JSON and some may skip UTF-8 validation, which is a low-cost operation in this ASCII file anyway.

-- Marco
|
October 15, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Rory McGuire | On Wed, 14 Oct 2015 10:22:37 +0200, Rory McGuire via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> Does this version handle real world JSON?
>
> I've keep getting problems with vibe and JSON because web browsers will automatically make a "1" into a 1 which then causes exceptions in vibe.
>
> Does yours do lossless conversions automatically?

No, I don't read numbers as strings. Could the client JavaScript be fixed? I fail to see why the conversion would happen automatically when the code could explicitly check for strings before doing math with the value "1". What am I missing?

-- Marco
|
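The "1" versus 1 mismatch discussed above can also be tolerated on the reading side. The following is a minimal sketch using Phobos' std.json, not fast.json or vibe.d, and the helper name `readLenientLong` is invented for this example. It accepts either a JSON integer or a string holding an integer.

```d
import std.conv : to;
import std.json : JSONType, JSONValue, parseJSON;

// Hypothetical helper: read an integer whether the client sent 1 or "1".
long readLenientLong(JSONValue v)
{
    switch (v.type)
    {
        case JSONType.integer:  return v.integer;
        case JSONType.uinteger: return v.uinteger.to!long; // throws on overflow
        case JSONType.string:   return v.str.to!long;      // throws ConvException on non-numeric text
        default:
            throw new Exception("expected an integer or a numeric string");
    }
}
```

The explicit switch keeps the leniency visible at the call site, so a value such as "abc" still fails loudly instead of silently becoming 0.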
October 15, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Per Nordlöw | On Wed, 14 Oct 2015 08:19:52 +0000, Per Nordlöw <per.nordlow@gmail.com> wrote:
> On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
> > https://github.com/kostya/benchmarks#json
>
> I can't find fast.json here. Where is it?

»»» D Gdc    Fast    0.34    226.7 «««
    C++      Rapid   0.79    687.1

Granted, if he had written "D fast.json" it would have been easier to identify.

-- Marco
|
October 15, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Marco Leise | On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
> fast: 0.34s, 226.7Mb (GDC)
> RapidJSON: 0.79s, 687.1Mb (GCC)
>
> (* Timings from my computer, Haswell CPU, Linux amd64.)
Where's the code?
|
October 15, 2015 Re: Fastest JSON parser in the world is a D project | ||||
---|---|---|---|---|
| ||||
Posted in reply to Gary Willoughby | On Thu, Oct 15, 2015 at 10:08, Gary Willoughby via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Wednesday, 14 October 2015 at 07:01:49 UTC, Marco Leise wrote:
>> fast: 0.34s, 226.7Mb (GDC)
>> RapidJSON: 0.79s, 687.1Mb (GCC)
>>
>> (* Timings from my computer, Haswell CPU, Linux amd64.)
>
> Where's the code?

code.dlang.org
|