RFC: std.json successor (page 4) - D Programming Language Discussion Forum

August 22, 2014

Re: RFC: std.json successor

Posted by Marc Schütz in reply to Sönke Ludwig

On Friday, 22 August 2014 at 17:35:20 UTC, Sönke Ludwig wrote:
>> ... why not use exactly the same convention then? => `parse!JSONValue`
>>
>> Would be nice to have a "pluggable" API where you just need to specify
>> the type in a factory method to choose the input format. Then there
>> could be `parse!BSON`, `parse!YAML`, with the same style as
>> `parse!(int[])`.
>>
>> I know this sound a bit like bike-shedding, but the API shouldn't stand
>> by itself, but fit into the "big picture", especially as there will
>> probably be other parsers (you already named the module std._data_.json).
>
> That would be nice, but then it should also work together with std.conv, which basically is exactly this pluggable API. Just like this it would result in an ambiguity error if both std.data.json and std.conv are imported at the same time.
>
> Is there a way to make std.conv work properly with JSONValue? I guess the only theoretical way would be to put something in JSONValue, but that would result in a slightly ugly cyclic dependency between parser.d and value.d.

The easiest and cleanest way would be to add a function in std.data.json:

    auto parse(Target, Source)(Source input)
        if(is(Target == JSONValue))
    {
        return ...;
    }

The various overloads of `std.conv.parse` already have mutually exclusive template constraints, so they will not collide with our function.
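
A minimal sketch of how such a constrained overload could sit next to `std.conv.parse`. `JSONValue` here is a hypothetical stand-in struct, not the proposed type, and the explicit `alias` is only an assumption about how one would try this within a single file; with two separately imported modules the overload sets would be combined by the compiler.

```d
static import std.conv;

// Hypothetical stand-in for the proposed JSONValue type.
struct JSONValue
{
    string raw;
}

// Pull std.conv's overloads into this module's overload set.
alias parse = std.conv.parse;

// Only participates in overload resolution when Target is JSONValue.
JSONValue parse(Target, Source)(Source input)
    if (is(Target == JSONValue))
{
    return JSONValue(input); // a real parser would go here
}

unittest
{
    string digits = "42";
    assert(parse!int(digits) == 42);       // resolved by std.conv.parse
    auto v = parse!JSONValue(`{"a": 1}`);  // resolved by the overload above
    assert(v.raw == `{"a": 1}`);
}
```

Because the constraint is an exact type match, none of the `std.conv.parse` constraints should be satisfiable for the same `Target`, which is what keeps the call unambiguous.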

August 22, 2014

Re: RFC: std.json successor

Posted by Marc Schütz in reply to Sönke Ludwig

On Friday, 22 August 2014 at 17:45:03 UTC, Sönke Ludwig wrote:
> On 22.08.2014 19:27, "Marc Schütz" <schuetzm@gmx.net> wrote:
>> On Friday, 22 August 2014 at 16:56:26 UTC, Sönke Ludwig wrote:
>>> On 22.08.2014 18:31, Christian Manning wrote:
>>>> It would be nice to have integers treated separately to doubles. I know
>>>> it makes the number parsing simpler to just treat everything as double,
>>>> but still, it could be annoying when you expect an integer type.
>>>
>>> That's how I've done it for vibe.data.json, too. For the new
>>> implementation, I've just used the number parsing routine from
>>> Andrei's std.jgrandson module. Does anybody have reservations about
>>> representing integers as "long" instead?
>>
>> It should automatically fall back to double on overflow. Maybe even use
>> BigInt if applicable?
>
> I guess BigInt + exponent would be the only lossless way to represent any JSON number. That could then be converted to any desired smaller type as required.
>
> But checking for overflow during number parsing would definitely have an impact on parsing speed, as well as using a BigInt of course, so the question is how we want set up the trade off here (or if there is another way that is overhead-free).

Since the functions will be templated anyway, they should take a flags parameter. These options, and possible future extensions, can then be selected by the user.

August 22, 2014

Re: RFC: std.json successor

Posted by Sönke Ludwig in reply to Marc Schütz

On 22.08.2014 19:57, "Marc Schütz" <schuetzm@gmx.net> wrote:
> On Friday, 22 August 2014 at 17:35:20 UTC, Sönke Ludwig wrote:
>>> ... why not use exactly the same convention then? => `parse!JSONValue`
>>>
>>> Would be nice to have a "pluggable" API where you just need to specify
>>> the type in a factory method to choose the input format. Then there
>>> could be `parse!BSON`, `parse!YAML`, with the same style as
>>> `parse!(int[])`.
>>>
>>> I know this sound a bit like bike-shedding, but the API shouldn't stand
>>> by itself, but fit into the "big picture", especially as there will
>>> probably be other parsers (you already named the module
>>> std._data_.json).
>>
>> That would be nice, but then it should also work together with
>> std.conv, which basically is exactly this pluggable API. Just like
>> this it would result in an ambiguity error if both std.data.json and
>> std.conv are imported at the same time.
>>
>> Is there a way to make std.conv work properly with JSONValue? I guess
>> the only theoretical way would be to put something in JSONValue, but
>> that would result in a slightly ugly cyclic dependency between
>> parser.d and value.d.
>
> The easiest and cleanest way would be to add a function in std.data.json:
>
>      auto parse(Target, Source)(Source input)
>          if(is(Target == JSONValue))
>      {
>          return ...;
>      }
>
> The various overloads of `std.conv.parse` already have mutually
> exclusive template constraints, they will not collide with our function.

Okay, for parse that may work, but what about to!()?

August 22, 2014

Re: RFC: std.json successor

Posted by Walter Bright in reply to Sönke Ludwig

On 8/21/2014 3:35 PM, Sönke Ludwig wrote:
> Destroy away! ;)

Thanks for taking this on! This is valuable work. On to destruction!

I'm looking at:

http://s-ludwig.github.io/std_data_json/stdx/data/json/lexer/lexJSON.html

I anticipate this will be used a LOT, and in applications that demand very high speed. With that in mind:


1. There's no mention of what will happen if it is passed malformed JSON strings. I presume an exception is thrown. Exceptions are both slow and consume GC memory. I suggest an alternative would be to emit an "Error" token instead; this would be much like how the UTF decoding algorithms emit a "replacement char" for invalid UTF sequences.

2. Strings containing escape sequences presumably consume GC memory when they are decoded. This will be a problem for high performance code. I suggest either leaving them undecoded in the token stream and letting higher level code decide what to do about them, or providing a hook that the user can override with their own allocation scheme.


If we don't make it possible to use std.json without invoking the GC, I believe the module will fail in the long term.
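
A rough sketch of the "Error token" idea from point 1. Everything here (`Token`, `nextToken`, the set of kinds) is hypothetical and handles punctuation only; it is not the lexJSON API, just an illustration of signalling malformed input without throwing or allocating.

```d
struct Token
{
    enum Kind { error, objectStart, objectEnd, colon, comma, end }
    Kind kind;
}

// Hypothetical single-step lexer: consumes one token from the front of
// `input` and reports malformed input as Kind.error instead of throwing.
Token nextToken(ref const(char)[] input) @safe pure nothrow @nogc
{
    // Skip JSON whitespace.
    while (input.length > 0 &&
           (input[0] == ' ' || input[0] == '\t' ||
            input[0] == '\n' || input[0] == '\r'))
        input = input[1 .. $];

    if (input.length == 0)
        return Token(Token.Kind.end);

    immutable c = input[0];
    input = input[1 .. $];
    switch (c)
    {
        case '{': return Token(Token.Kind.objectStart);
        case '}': return Token(Token.Kind.objectEnd);
        case ':': return Token(Token.Kind.colon);
        case ',': return Token(Token.Kind.comma);
        default:  return Token(Token.Kind.error); // no exception, no GC
    }
}

unittest
{
    const(char)[] src = "{ ? }";
    assert(nextToken(src).kind == Token.Kind.objectStart);
    assert(nextToken(src).kind == Token.Kind.error);
    assert(nextToken(src).kind == Token.Kind.objectEnd);
    assert(nextToken(src).kind == Token.Kind.end);
}
```

The `nothrow @nogc` attributes make the compiler check the property asked for above; the caller decides whether an error token is fatal.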

August 22, 2014

Re: RFC: std.json successor

Posted by Sönke Ludwig in reply to Marc Schütz

On 22.08.2014 20:01, "Marc Schütz" <schuetzm@gmx.net> wrote:
> On Friday, 22 August 2014 at 17:45:03 UTC, Sönke Ludwig wrote:
>> On 22.08.2014 19:27, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>> On Friday, 22 August 2014 at 16:56:26 UTC, Sönke Ludwig wrote:
>>>> On 22.08.2014 18:31, Christian Manning wrote:
>>>>> It would be nice to have integers treated separately to doubles. I know
>>>>> it makes the number parsing simpler to just treat everything as double,
>>>>> but still, it could be annoying when you expect an integer type.
>>>>
>>>> That's how I've done it for vibe.data.json, too. For the new
>>>> implementation, I've just used the number parsing routine from
>>>> Andrei's std.jgrandson module. Does anybody have reservations about
>>>> representing integers as "long" instead?
>>>
>>> It should automatically fall back to double on overflow. Maybe even use
>>> BigInt if applicable?
>>
>> I guess BigInt + exponent would be the only lossless way to represent
>> any JSON number. That could then be converted to any desired smaller
>> type as required.
>>
>> But checking for overflow during number parsing would definitely have
>> an impact on parsing speed, as well as using a BigInt of course, so
>> the question is how we want set up the trade off here (or if there is
>> another way that is overhead-free).
>
> As the functions will be templatized anyway, it should include a flags
> parameter. These and possible future extensions can then be selected by
> the user.

I'm actually in the process of converting the "track_location" parameter to a flags enum and adding support for an error token, so this would fit right in.
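
As an illustration of what a compile-time flags parameter could look like, here is a small sketch. The enum name, its members and the `Token` layout are assumptions made for the example, not the actual design.

```d
// Hypothetical option flags; names are illustrative only.
enum LexOptions
{
    none          = 0,
    trackLocation = 1 << 0,
    useLong       = 1 << 1,
}

struct Token(LexOptions options)
{
    // Numbers are stored as long only when the caller opted in.
    static if ((options & LexOptions.useLong) != 0)
        long number;
    else
        double number;

    // Location fields exist only when tracking was requested.
    static if ((options & LexOptions.trackLocation) != 0)
    {
        size_t line;
        size_t column;
    }
}

unittest
{
    alias Plain = Token!(LexOptions.none);
    alias Rich  = Token!(LexOptions.useLong | LexOptions.trackLocation);

    static assert(is(typeof(Plain.number) == double));
    static assert(is(typeof(Rich.number) == long));
    static assert(!__traits(hasMember, Plain, "line"));
    static assert(__traits(hasMember, Rich, "line"));
}
```

Since the flags are template arguments, options that are not selected cost nothing at run time, and new flags can be added later without changing existing call sites.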

August 22, 2014

Re: RFC: std.json successor

Posted by Marc Schütz in reply to Sönke Ludwig

On Friday, 22 August 2014 at 18:08:34 UTC, Sönke Ludwig wrote:
> On 22.08.2014 19:57, "Marc Schütz" <schuetzm@gmx.net> wrote:
>> The easiest and cleanest way would be to add a function in std.data.json:
>>
>>     auto parse(Target, Source)(Source input)
>>         if(is(Target == JSONValue))
>>     {
>>         return ...;
>>     }
>>
>> The various overloads of `std.conv.parse` already have mutually
>> exclusive template constraints, they will not collide with our function.
>
> Okay, for parse that may work, but what about to!()?

What's the problem with to!()?

August 22, 2014

Re: RFC: std.json successor

Posted by Andrej Mitrovic in reply to Sönke Ludwig

On 8/22/14, Sönke Ludwig <digitalmars-d@puremagic.com> wrote:
> Docs: http://s-ludwig.github.io/std_data_json/

This confused me for a solid minute:

    // Lex a JSON string into a lazy range of tokens
    auto tokens = lexJSON(`{"name": "Peter", "age": 42}`);

    with (JSONToken.Kind) {
        assert(tokens.map!(t => t.kind).equal(
            [objectStart, string, colon, string, comma,
            string, colon, number, objectEnd]));
    }

Generally I'd avoid using de-facto reserved names as enum member names
(e.g. string).

August 22, 2014

Re: RFC: std.json successor

Posted by Sönke Ludwig in reply to Andrej Mitrovic

On 22.08.2014 21:15, Andrej Mitrovic via Digitalmars-d wrote:
> On 8/22/14, Sönke Ludwig <digitalmars-d@puremagic.com> wrote:
>> Docs: http://s-ludwig.github.io/std_data_json/
>
> This confused me for a solid minute:
>
> // Lex a JSON string into a lazy range of tokens
> auto tokens = lexJSON(`{"name": "Peter", "age": 42}`);
>
> with (JSONToken.Kind) {
>      assert(tokens.map!(t => t.kind).equal(
>          [objectStart, string, colon, string, comma,
>          string, colon, number, objectEnd]));
> }
>
> Generally I'd avoid using de-facto reserved names as enum member names
> (e.g. string).
>

Hmmm, but it *is* a string. Isn't the problem rather the use of `with` in this case? Maybe the example should just use `with (JSONToken)` and then `Kind.string`?
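
A small sketch of that alternative, using a cut-down, hypothetical `JSONToken` (the real type has more kinds and fields). With `with (JSONToken)`, the member has to be written as `Kind.string`, so a bare `string` keeps its usual meaning.

```d
// Hypothetical, reduced token type for illustration only.
struct JSONToken
{
    enum Kind { objectStart, string, colon, number, objectEnd }
    Kind kind;
}

unittest
{
    auto token = JSONToken(JSONToken.Kind.string);

    with (JSONToken)
    {
        // The enum member is reached through Kind, so it reads unambiguously.
        assert(token.kind == Kind.string);

        // A bare `string` still refers to the built-in alias here.
        string name = "Peter";
        assert(name.length == 5);
    }
}
```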

August 22, 2014

Re: RFC: std.json successor

Posted by Christian Manning in reply to Sönke Ludwig

On Friday, 22 August 2014 at 17:45:03 UTC, Sönke Ludwig wrote:
> On 22.08.2014 19:27, "Marc Schütz" <schuetzm@gmx.net> wrote:
>> On Friday, 22 August 2014 at 16:56:26 UTC, Sönke Ludwig wrote:
>>> On 22.08.2014 18:31, Christian Manning wrote:
>>>> It would be nice to have integers treated separately to doubles. I know
>>>> it makes the number parsing simpler to just treat everything as double,
>>>> but still, it could be annoying when you expect an integer type.
>>>
>>> That's how I've done it for vibe.data.json, too. For the new
>>> implementation, I've just used the number parsing routine from
>>> Andrei's std.jgrandson module. Does anybody have reservations about
>>> representing integers as "long" instead?
>>
>> It should automatically fall back to double on overflow. Maybe even use
>> BigInt if applicable?
>
> I guess BigInt + exponent would be the only lossless way to represent any JSON number. That could then be converted to any desired smaller type as required.
>
> But checking for overflow during number parsing would definitely have an impact on parsing speed, as well as using a BigInt of course, so the question is how we want set up the trade off here (or if there is another way that is overhead-free).

You could check for a decimal point and for a 0 at the front (excluding a possible minus sign); either would indicate a double. That relies on the reasonable assumption that anything else will fit in a long.

August 22, 2014

Re: RFC: std.json successor

Posted by Sönke Ludwig in reply to Christian Manning

On 22.08.2014 21:48, Christian Manning wrote:
> On Friday, 22 August 2014 at 17:45:03 UTC, Sönke Ludwig wrote:
>> On 22.08.2014 19:27, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>> On Friday, 22 August 2014 at 16:56:26 UTC, Sönke Ludwig wrote:
>>>> On 22.08.2014 18:31, Christian Manning wrote:
>>>>> It would be nice to have integers treated separately to doubles. I know
>>>>> it makes the number parsing simpler to just treat everything as double,
>>>>> but still, it could be annoying when you expect an integer type.
>>>>
>>>> That's how I've done it for vibe.data.json, too. For the new
>>>> implementation, I've just used the number parsing routine from
>>>> Andrei's std.jgrandson module. Does anybody have reservations about
>>>> representing integers as "long" instead?
>>>
>>> It should automatically fall back to double on overflow. Maybe even use
>>> BigInt if applicable?
>>
>> I guess BigInt + exponent would be the only lossless way to represent
>> any JSON number. That could then be converted to any desired smaller
>> type as required.
>>
>> But checking for overflow during number parsing would definitely have
>> an impact on parsing speed, as well as using a BigInt of course, so
>> the question is how we want set up the trade off here (or if there is
>> another way that is overhead-free).
>
> You could check for a decimal point and a 0 at the front (excluding
> possible - sign), either would indicate a double, making the reasonable
> assumption that anything else will fit in a long.

Yes, checking for "no decimal point + no exponent" would detect integers without overhead, but that wouldn't solve the proposed automatic long->double overflow, which is what I meant. My current idea is to default to double and optionally support any of long, BigInt and "Decimal" (BigInt+exponent), where automatic overflow handling is only supported for long->BigInt.
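
To make the trade-off concrete, here is a sketch of the long->double fallback discussed above. `Number` and `parseNumber` are hypothetical names, the input is assumed to be an already validated JSON number literal, and the checked arithmetic from `core.checkedint` is exactly the per-digit overhead in question.

```d
import core.checkedint : adds, muls;
import std.conv : to;

// Hypothetical result type: either an integer or a floating-point value.
struct Number
{
    bool isInteger;
    long integer;
    double floating;
}

// Assumes `text` is a syntactically valid JSON number.
Number parseNumber(const(char)[] text)
{
    // Cheap classification: no decimal point and no exponent means integer.
    bool integral = true;
    foreach (char c; text)
    {
        if (c == '.' || c == 'e' || c == 'E')
        {
            integral = false;
            break;
        }
    }

    if (integral)
    {
        immutable negative = text.length > 0 && text[0] == '-';
        bool overflow = false; // sticky: once set it stays set
        long value = 0;
        foreach (char c; text[(negative ? 1 : 0) .. $])
        {
            value = muls(value, 10L, overflow);
            value = adds(value, cast(long)(c - '0'), overflow);
        }
        if (!overflow)
            return Number(true, negative ? -value : value, 0.0);
        // Overflow: fall through and re-read the literal as a double.
    }
    return Number(false, 0, text.to!double);
}

unittest
{
    assert(parseNumber("42").isInteger);
    assert(parseNumber("-7").integer == -7);
    assert(!parseNumber("1.5").isInteger);
    assert(!parseNumber("99999999999999999999").isInteger);
}
```

The classification loop alone is what works "without overhead"; the `muls`/`adds` calls are what the automatic fallback adds on top. In this simplified version `long.min` also takes the double path.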

Copyright © 1999-2021 by the D Language Foundation