August 23, 2014
On 22.08.2014 20:08, Walter Bright wrote:
> (...)
> 2. The escape sequenced strings presumably consume GC memory. This will
> be a problem for high performance code. I suggest either leaving them
> undecoded in the token stream, and letting higher level code decide what
> to do about them, or provide a hook that the user can override with his
> own allocation scheme.
>
> If we don't make it possible to use std.json without invoking the GC, I
> believe the module will fail in the long term.

I've now added two new types that abstract away how strings and numbers are represented in memory. For string literals, this means that for the input types "string" and "immutable(ubyte)[]" they are always stored as slices of the input buffer. JSONValue has a .rawValue property to access them, as well as an "alias this"ed .value property that transparently unescapes.

At that place it would also be easy to provide a method that takes an arbitrary output range to unescape without allocations.
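
A minimal sketch of what such an output-range-based unescape could look like. The names `unescapeTo` and `FixedSink` are hypothetical and not part of the proposed module, and `\uXXXX` sequences are left out for brevity:

```d
import std.range.primitives : put;

/// Hypothetical helper: writes the unescaped form of a raw JSON string body
/// (without the surrounding quotes) to any output range of char.
void unescapeTo(Sink)(const(char)[] raw, ref Sink sink)
{
    size_t i = 0;
    while (i < raw.length)
    {
        immutable char c = raw[i++];
        if (c != '\\' || i == raw.length)
        {
            put(sink, c); // ordinary character (or a dangling backslash)
            continue;
        }
        switch (raw[i++])
        {
            case 'n': put(sink, '\n'); break;
            case 't': put(sink, '\t'); break;
            case 'r': put(sink, '\r'); break;
            case 'b': put(sink, '\b'); break;
            case 'f': put(sink, '\f'); break;
            default:  put(sink, raw[i - 1]); break; // \" \\ \/ map to themselves
        }
    }
}

/// Hypothetical fixed-size sink, used to show that no GC allocation is needed.
struct FixedSink
{
    char[64] buf;
    size_t len;
    void put(char c) { buf[len++] = c; }
}

void main()
{
    string input = `{"msg":"line1\nline2 \"quoted\""}`;
    // The raw value is just a slice of the input buffer, still escaped.
    auto raw = input[8 .. $ - 2];
    assert(raw == `line1\nline2 \"quoted\"`);

    FixedSink sink;
    unescapeTo(raw, sink);
    assert(sink.buf[0 .. sink.len] == "line1\nline2 \"quoted\"");
}
```

Because the sink is a template parameter, the caller decides where the unescaped characters go: a stack buffer, an Appender, or a custom allocator.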

Documentation and code are both updated (also added a note about exception behavior).
August 23, 2014
On 22.08.2014 21:00, "Marc Schütz" <schuetzm@gmx.net> wrote:
> On Friday, 22 August 2014 at 18:08:34 UTC, Sönke Ludwig wrote:
>> On 22.08.2014 19:57, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>> The easiest and cleanest way would be to add a function in
>>> std.data.json:
>>>
>>>     auto parse(Target, Source)(Source input)
>>>         if(is(Target == JSONValue))
>>>     {
>>>         return ...;
>>>     }
>>>
>>> The various overloads of `std.conv.parse` already have mutually
>>> exclusive template constraints, they will not collide with our function.
>>
>> Okay, for parse that may work, but what about to!()?
>
> What's the problem with to!()?

to!() definitely doesn't have a template constraint that excludes JSONValue. Instead, it will convert any struct type that doesn't define toString() to a D-like representation.
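
To illustrate the default behavior with a hypothetical struct (`Point` is made up for this example and has nothing to do with the JSON module):

```d
import std.conv : to;

// Hypothetical type that does not define toString().
struct Point
{
    int x;
    int y;
}

void main()
{
    // to!string falls back to a D-like representation of the struct.
    assert(Point(1, 2).to!string == "Point(1, 2)");
}
```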
August 23, 2014
On Saturday, 23 August 2014 at 16:49:23 UTC, Sönke Ludwig wrote:
> On 22.08.2014 21:00, "Marc Schütz" <schuetzm@gmx.net> wrote:
>> On Friday, 22 August 2014 at 18:08:34 UTC, Sönke Ludwig wrote:
>>> On 22.08.2014 19:57, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>>> The easiest and cleanest way would be to add a function in
>>>> std.data.json:
>>>>
>>>>    auto parse(Target, Source)(Source input)
>>>>        if(is(Target == JSONValue))
>>>>    {
>>>>        return ...;
>>>>    }
>>>>
>>>> The various overloads of `std.conv.parse` already have mutually
>>>> exclusive template constraints, they will not collide with our function.
>>>
>>> Okay, for parse that may work, but what about to!()?
>>
>> What's the problem with to!()?
>
> to!() definitely doesn't have a template constraint that excludes JSONValue. Instead, it will convert any struct type that doesn't define toString() to a D-like representation.

For converting a JSONValue to a different type, JSONValue can implement `opCast`, which is the regular interface that std.conv.to uses if it's available.

For converting something _to_ a JSONValue, std.conv.to will simply create an instance of it by calling the constructor.
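
A rough sketch of both directions, using a hypothetical `Value` struct as a stand-in for JSONValue (the real type in the proposal looks different; this only shows how std.conv.to picks up a constructor and `opCast`):

```d
import std.conv : to;

// Hypothetical stand-in for JSONValue.
struct Value
{
    string str;
    long num;

    this(string s) { str = s; }
    this(long n) { num = n; }

    // Used by std.conv.to when converting a Value to long.
    T opCast(T)() const if (is(T == long)) { return num; }
}

void main()
{
    // Converting *to* Value: std.conv.to calls the matching constructor,
    // so the string is wrapped, not parsed.
    auto wrapped = "mystring".to!Value;
    assert(wrapped.str == "mystring");

    // Converting *from* Value: std.conv.to goes through opCast.
    assert(Value(42L).to!long == 42);
}
```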
August 23, 2014
On 23.08.2014 19:25, "Marc Schütz" <schuetzm@gmx.net> wrote:
> On Saturday, 23 August 2014 at 16:49:23 UTC, Sönke Ludwig wrote:
>> On 22.08.2014 21:00, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>> On Friday, 22 August 2014 at 18:08:34 UTC, Sönke Ludwig wrote:
>>>> On 22.08.2014 19:57, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>>>> The easiest and cleanest way would be to add a function in
>>>>> std.data.json:
>>>>>
>>>>>    auto parse(Target, Source)(Source input)
>>>>>        if(is(Target == JSONValue))
>>>>>    {
>>>>>        return ...;
>>>>>    }
>>>>>
>>>>> The various overloads of `std.conv.parse` already have mutually
>>>>> exclusive template constraints, they will not collide with our
>>>>> function.
>>>>
>>>> Okay, for parse that may work, but what about to!()?
>>>
>>> What's the problem with to!()?
>>
>> to!() definitely doesn't have a template constraint that excludes
>> JSONValue. Instead, it will convert any struct type that doesn't
>> define toString() to a D-like representation.
>
> For converting a JSONValue to a different type, JSONValue can implement
> `opCast`, which is the regular interface that std.conv.to uses if it's
> available.
>
> For converting something _to_ a JSONValue, std.conv.to will simply
> create an instance of it by calling the constructor.

That would just introduce the aforementioned dependency cycle between JSONValue, the parser and the lexer. Possible, but not particularly pretty. Also, using the JSONValue constructor to parse an input string would contradict the intuitive behavior of just storing the string value.
August 23, 2014
On 8/23/2014 9:36 AM, Sönke Ludwig wrote:
> input types "string" and "immutable(ubyte)[]"

Why the immutable(ubyte)[] ?
August 23, 2014
On 23.08.2014 19:38, Walter Bright wrote:
> On 8/23/2014 9:36 AM, Sönke Ludwig wrote:
>> input types "string" and "immutable(ubyte)[]"
>
> Why the immutable(ubyte)[] ?

I basically adopted that from Andrei's module. The idea is to allow processing data with arbitrary character encodings. However, the output will always be Unicode, and JSON is defined to be encoded as Unicode, too, so that could probably be dropped...
August 23, 2014
On 8/23/2014 10:42 AM, Sönke Ludwig wrote:
> On 23.08.2014 19:38, Walter Bright wrote:
>> On 8/23/2014 9:36 AM, Sönke Ludwig wrote:
>>> input types "string" and "immutable(ubyte)[]"
>>
>> Why the immutable(ubyte)[] ?
>
> I've adopted that basically from Andrei's module. The idea is to allow
> processing data with arbitrary character encoding. However, the output will
> always be Unicode and JSON is defined to be encoded as Unicode, too, so that
> could probably be dropped...

I feel that non-UTF encodings should be handled by adapter algorithms, not embedded into the JSON lexer, so yes, I'd drop that.
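
As a sketch of what such an adapter could look like, here is a lazy Latin-1 to UTF-8 range built from existing Phobos pieces. The function name `latin1ToUtf8` is hypothetical; the lexer would then only ever see UTF-8 code units:

```d
import std.algorithm.iteration : map;
import std.array : appender;
import std.utf : byChar;

/// Hypothetical adapter: lazily re-encodes ISO-8859-1 bytes as UTF-8 code units.
/// Every Latin-1 byte value equals its Unicode code point, so widening each
/// byte to dchar and letting byChar encode it is sufficient.
auto latin1ToUtf8(const(ubyte)[] bytes)
{
    return bytes.map!(b => cast(dchar) b).byChar;
}

void main()
{
    // The JSON string "é" encoded as Latin-1 (0xE9 is é).
    immutable(ubyte)[] latin1 = [0x22, 0xE9, 0x22];

    auto utf8 = appender!string();
    foreach (char c; latin1ToUtf8(latin1))
        utf8.put(c);

    assert(utf8.data == "\"\u00E9\"");
}
```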
August 23, 2014
On Saturday, 23 August 2014 at 17:32:01 UTC, Sönke Ludwig wrote:
> On 23.08.2014 19:25, "Marc Schütz" <schuetzm@gmx.net> wrote:
>> On Saturday, 23 August 2014 at 16:49:23 UTC, Sönke Ludwig wrote:
>>> On 22.08.2014 21:00, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>>> On Friday, 22 August 2014 at 18:08:34 UTC, Sönke Ludwig wrote:
>>>>> On 22.08.2014 19:57, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>>>>> The easiest and cleanest way would be to add a function in
>>>>>> std.data.json:
>>>>>>
>>>>>>   auto parse(Target, Source)(Source input)
>>>>>>       if(is(Target == JSONValue))
>>>>>>   {
>>>>>>       return ...;
>>>>>>   }
>>>>>>
>>>>>> The various overloads of `std.conv.parse` already have mutually
>>>>>> exclusive template constraints, they will not collide with our
>>>>>> function.
>>>>>
>>>>> Okay, for parse that may work, but what about to!()?
>>>>
>>>> What's the problem with to!()?
>>>
>>> to!() definitely doesn't have a template constraint that excludes
>>> JSONValue. Instead, it will convert any struct type that doesn't
>>> define toString() to a D-like representation.
>>
>> For converting a JSONValue to a different type, JSONValue can implement
>> `opCast`, which is the regular interface that std.conv.to uses if it's
>> available.
>>
>> For converting something _to_ a JSONValue, std.conv.to will simply
>> create an instance of it by calling the constructor.
>
> That would just introduce the said dependency cycle between JSONValue, the parser and the lexer. Possible, but not particularly pretty. Also, using the JSONValue constructor to parse an input string would contradict the intuitive behavior to just store the string value.

That's what I expect it to do anyway. For parsing, there are already other functions. "mystring".to!JSONValue should just wrap "mystring".
August 23, 2014
On 23.08.2014 20:31, "Marc Schütz" <schuetzm@gmx.net> wrote:
> On Saturday, 23 August 2014 at 17:32:01 UTC, Sönke Ludwig wrote:
>> On 23.08.2014 19:25, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>> On Saturday, 23 August 2014 at 16:49:23 UTC, Sönke Ludwig wrote:
>>>> On 22.08.2014 21:00, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>>>> On Friday, 22 August 2014 at 18:08:34 UTC, Sönke Ludwig wrote:
>>>>>> On 22.08.2014 19:57, "Marc Schütz" <schuetzm@gmx.net> wrote:
>>>>>>> The easiest and cleanest way would be to add a function in
>>>>>>> std.data.json:
>>>>>>>
>>>>>>>   auto parse(Target, Source)(Source input)
>>>>>>>       if(is(Target == JSONValue))
>>>>>>>   {
>>>>>>>       return ...;
>>>>>>>   }
>>>>>>>
>>>>>>> The various overloads of `std.conv.parse` already have mutually
>>>>>>> exclusive template constraints, they will not collide with our
>>>>>>> function.
>>>>>>
>>>>>> Okay, for parse that may work, but what about to!()?
>>>>>
>>>>> What's the problem with to!()?
>>>>
>>>> to!() definitely doesn't have a template constraint that excludes
>>>> JSONValue. Instead, it will convert any struct type that doesn't
>>>> define toString() to a D-like representation.
>>>
>>> For converting a JSONValue to a different type, JSONValue can implement
>>> `opCast`, which is the regular interface that std.conv.to uses if it's
>>> available.
>>>
>>> For converting something _to_ a JSONValue, std.conv.to will simply
>>> create an instance of it by calling the constructor.
>>
>> That would just introduce the said dependency cycle between JSONValue,
>> the parser and the lexer. Possible, but not particularly pretty. Also,
>> using the JSONValue constructor to parse an input string would
>> contradict the intuitive behavior to just store the string value.
>
> That's what I expect it to do anyway. For parsing, there are already
> other functions. "mystring".to!JSONValue should just wrap "mystring".

Probably, but then to!() is inconsistent with parse!(). Usually the two behave the same apart from how the tail of the input string is handled.
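
For reference, this is how the two differ for an existing target type such as int: to!() requires the whole input to be consumed, while parse!() takes what it can and advances the input past it.

```d
import std.conv : ConvException, parse, to;
import std.exception : assertThrown;

void main()
{
    // to!() must consume the whole string.
    assert("42".to!int == 42);
    assertThrown!ConvException("42abc".to!int);

    // parse!() stops at the first character it cannot use and
    // leaves the tail in the (by-ref) input.
    string input = "42abc";
    assert(parse!int(input) == 42);
    assert(input == "abc");
}
```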
August 23, 2014
On 8/23/2014 10:46 AM, Walter Bright via Digitalmars-d wrote:
> On 8/23/2014 10:42 AM, Sönke Ludwig wrote:
>> On 23.08.2014 19:38, Walter Bright wrote:
>>> On 8/23/2014 9:36 AM, Sönke Ludwig wrote:
>>>> input types "string" and "immutable(ubyte)[]"
>>>
>>> Why the immutable(ubyte)[] ?
>>
>> I've adopted that basically from Andrei's module. The idea is to allow
>> processing data with arbitrary character encoding. However, the output will
>> always be Unicode and JSON is defined to be encoded as Unicode, too, so that
>> could probably be dropped...
>
> I feel that non-UTF encodings should be handled by adapter algorithms,
> not embedded into the JSON lexer, so yes, I'd drop that.

For performance purposes, determining the encoding during lexing is useful. You can avoid any conversion costs when you know that the original string is ASCII, UTF-8, or something else. The cost during lexing is essentially zero. The cost of storing that state might be a concern, or it might be free in otherwise unused padding space. The avoidable cost of re-scanning strings is non-trivial.
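
A minimal sketch of the idea, assuming a hypothetical `StringToken` record and `lexString` function (neither is from the module under review). The lexer has to look at every byte anyway to find the closing quote, so OR-ing the bytes together to detect non-ASCII content adds almost nothing to the loop:

```d
/// Hypothetical token record: a slice of the input plus flags gathered while scanning.
struct StringToken
{
    const(char)[] raw; // body between the quotes, still escaped
    bool asciiOnly;    // true if no byte >= 0x80 was seen
    bool hasEscapes;   // true if a backslash was seen
}

/// Scans a string literal; `input` must start at the opening quote.
StringToken lexString(const(char)[] input)
{
    assert(input.length > 0 && input[0] == '"');
    uint seenBits = 0;
    bool escapes = false;
    size_t i = 1;
    while (i < input.length && input[i] != '"')
    {
        immutable char c = input[i];
        seenBits |= c; // accumulate bits instead of branching per byte
        if (c == '\\')
        {
            escapes = true;
            if (i + 1 < input.length)
                seenBits |= input[++i]; // skip the escaped character
        }
        ++i;
    }
    return StringToken(input[1 .. i], (seenBits & 0x80) == 0, escapes);
}

void main()
{
    auto a = lexString(`"plain ascii", 1`);
    assert(a.raw == "plain ascii" && a.asciiOnly && !a.hasEscapes);

    auto b = lexString("\"caf\u00E9 \\\"x\\\"\"");
    assert(b.raw == "caf\u00E9 \\\"x\\\"" && !b.asciiOnly && b.hasEscapes);
}
```

Higher-level code can then skip unescaping when `hasEscapes` is false and skip any UTF validation or transcoding when `asciiOnly` is true, without touching the bytes a second time.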

My past experience with this was in an HTTP parser, where there's even more complex logic than in JSON parsing, but the concepts still apply.