std.data.json formal review (page 3) - D Programming Language Discussion Forum

July 29, 2015

Re: std.data.json formal review

Posted by Rikki Cattermole
in reply to Etienne Cimon

On 29/07/2015 4:23 a.m., Etienne Cimon wrote:
> On Tuesday, 28 July 2015 at 15:55:04 UTC, Brad Anderson wrote:
>> On Tuesday, 28 July 2015 at 15:07:46 UTC, Rikki Cattermole wrote:
>>> On 29/07/2015 2:07 a.m., Atila Neves wrote:
>>>> Start of the two week process, folks.
>>>>
>>>> Code: https://github.com/s-ludwig/std_data_json
>>>> Docs: http://s-ludwig.github.io/std_data_json/
>>>>
>>>> Atila
>>>
>>> Right now, my view is no.
>>
>> Just a reminder that this is the review thread, not the vote thread
>> (in case anyone reading got confused).
>>
>>> Unless there is some sort of proof that it will work with allocators.
>>>
>>> I have used the code from vibe.d days so its not an issue of how well
>>> it works nor nit picky. Just can I pass it an allocator (optionally)
>>> and have it use that for all memory usage?
>>>
>>> After all, I really would rather be able to deallocate all memory
>>> allocated during a request then you know, rely on the GC.
>>
>> That's a good point. This is the perfect opportunity to hammer out how
>> allocators are going to be integrated into other parts of Phobos.
>
>  From what I see from std.allocator, there's no Allocator interface? I
> think this would require changing the type to `struct
> JSONValue(Allocator)`, unless we see an actual interface implemented in
> phobos.

There is one. IAllocator.
I use it throughout std.experimental.image. Unfortunately the site is down at the moment, so I can't link the docs *grumbles*.
Btw even if an allocator is a struct, there is a type to wrap it up in a class.
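
As a rough illustration of the wrapping mentioned above, the sketch below uses `allocatorObject` from `std.experimental.allocator` to put the struct-based `Mallocator` behind the runtime allocator interface. The exact return type has changed between Phobos releases, so the example relies on `auto`; treat it as a sketch rather than the code used in std.experimental.image.

```d
import std.experimental.allocator : allocatorObject, dispose, makeArray;
import std.experimental.allocator.mallocator : Mallocator;

unittest
{
    // Wrap the stateless, struct-based Mallocator in a runtime allocator object.
    auto alloc = allocatorObject(Mallocator.instance);

    // Allocate an array of four ints, each initialized to 7.
    int[] buf = alloc.makeArray!int(4, 7);
    assert(buf == [7, 7, 7, 7]);

    // Hand the memory back explicitly instead of waiting for the GC.
    alloc.dispose(buf);
}
```

Code that accepts such an object does not need to be templated on the allocator type, which is the alternative to the `struct JSONValue(Allocator)` design raised in the quoted post.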

July 29, 2015

Re: std.data.json formal review

Posted by Rikki Cattermole
in reply to Mathias Lang

On 29/07/2015 4:25 a.m., Mathias Lang via Digitalmars-d wrote:
> 2015-07-28 17:55 GMT+02:00 Brad Anderson via Digitalmars-d
> <digitalmars-d@puremagic.com>:
>
>
>         Unless there is some sort of proof that it will work with
>         allocators.
>
>         I have used the code from vibe.d days so its not an issue of how
>         well it works nor nit picky. Just can I pass it an allocator
>         (optionally) and have it use that for all memory usage?
>
>         After all, I really would rather be able to deallocate all
>         memory allocated during a request then you know, rely on the GC.
>
>
>     That's a good point. This is the perfect opportunity to hammer out
>     how allocators are going to be integrated into other parts of Phobos.
>
>
> Allocator is definitely a separate issue. It's a moving target, it's not
> yet part of a release, and consequently barely field-tested. We will
> find bugs, we might find design mistakes, we might head in a direction
> which will turn out to be an anti-pattern (just like `opDispatch` for
> JSONValue ;) )
> It's not to say the quality of the module isn't good - that would mean
> our release process is broken -, but making a module inclusion to
> experimental dependent on another module in experimental will not
> improve the quality of the reviewed module.

Right now we just need a plan, and we're all good for std.data.json.
It doesn't need to be implemented right now, but I'd rather we had a plan going forward for adding allocators to it than, you know, find out a year down the track that it would need a whole rewrite.
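
To make the per-request use case concrete, here is a minimal sketch of the pattern being asked for, assuming the `Region` building block from `std.experimental.allocator`. It is independent of std.data.json; the buffer sizes are arbitrary placeholders.

```d
import std.experimental.allocator : makeArray;
import std.experimental.allocator.building_blocks.region : Region;
import std.experimental.allocator.mallocator : Mallocator;

unittest
{
    // One arena per request: a single 64 KiB block obtained from malloc.
    auto arena = Region!Mallocator(64 * 1024);

    // Everything needed while handling the request comes out of the arena.
    char[] text = arena.makeArray!char(100);
    int[] numbers = arena.makeArray!int(10);
    assert(text.length == 100 && numbers.length == 10);

    // At the end of the request, release everything in one step, no GC involved.
    arena.deallocateAll();
}
```

A JSON API that takes an allocator (optionally) would let the parser's own allocations land in the same arena.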

July 29, 2015

Re: std.data.json formal review

Posted by Rikki Cattermole
in reply to Sönke Ludwig

On 29/07/2015 4:41 a.m., Sönke Ludwig wrote:
> On 28.07.2015 at 17:07, Rikki Cattermole wrote:
>> On 29/07/2015 2:07 a.m., Atila Neves wrote:
>>> Start of the two week process, folks.
>>>
>>> Code: https://github.com/s-ludwig/std_data_json
>>> Docs: http://s-ludwig.github.io/std_data_json/
>>>
>>> Atila
>>
>> Right now, my view is no.
>> Unless there is some sort of proof that it will work with allocators.
>>
>> I have used the code from vibe.d days so its not an issue of how well it
>> works nor nit picky. Just can I pass it an allocator (optionally) and
>> have it use that for all memory usage?
>>
>> After all, I really would rather be able to deallocate all memory
>> allocated during a request then you know, rely on the GC.
>
> If you pass a string or byte array as input, then there will be no
> allocations at all (the interface is @nogc).
>
> For other cases it supports custom allocation through an appender
> factory [1][2], since there is no standard allocator interface, yet. But
> since that's the only place where memory is allocated (apart from lower
> level code, such as BigInt), as soon as Appender supports custom
> allocators, or you write your own appender, the JSON parser will, too.
>
> Only if you use the DOM parser, there will be some inevitable GC
> allocations, because the DOM representation uses dynamic and associative
> arrays.
>
> 1:
> https://github.com/s-ludwig/std_data_json/blob/aac6d846d596750623fd5c546343f4f9d19447fa/source/stdx/data/json/lexer.d#L66
>
> 2:
> https://github.com/s-ludwig/std_data_json/blob/aac6d846d596750623fd5c546343f4f9d19447fa/source/stdx/data/json/parser.d#L286

It was after 3am when I took my initial look, but I did see the appender usage. I'm OK with this.
The DOM parser, on the other hand... ugh, this is where we do need IAllocator to be used. Although by the sounds of it, we would need a map collection that supports allocators before it can be done.

July 29, 2015

Re: std.data.json formal review

Posted by Rikki Cattermole
in reply to Sönke Ludwig

On 29/07/2015 4:43 a.m., Sönke Ludwig wrote:
> On 28.07.2015 at 17:07, Rikki Cattermole wrote:
>> I have used the code from vibe.d days so its not an issue of how well it
>> works nor nit picky.
>
> You should still have a closer look, as it isn't very similar to the
> vibe.d code at all, but a rather radical evolution.

Again, it was after 3am when I first looked. I'll take a closer look and create a new thread off this post about anything I find.

July 29, 2015

Re: std.data.json formal review

Posted by Walter Bright
in reply to H. S. Teoh

On 7/28/2015 5:15 PM, H. S. Teoh via Digitalmars-d wrote:
>> Probably simply returning an InputRange of JSON values.
> But how would you capture the nesting substructures?

A JSON value is a tagged union of the various types.


> ??!  Surely you have heard of the non-allocating overload of toString?
> 	void toString(scope void delegate(const(char)[]) dg);

Not range friendly.
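
For readers unfamiliar with the overload being discussed, a minimal sketch follows. `Pair` is a hypothetical type invented for the illustration. The sink-based `toString` avoids building an intermediate string, but the output is pushed into a callback rather than exposed as a lazy range that can be piped through range algorithms, which is presumably what "not range friendly" refers to.

```d
import std.conv : text;
import std.format : formattedWrite;

// Hypothetical type used only for this illustration.
struct Pair
{
    string key;
    int value;

    // Non-allocating overload: pieces of output are pushed into `sink`.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink.formattedWrite("%s=%s", key, value);
    }
}

unittest
{
    // std.conv and std.format pick up the sink overload automatically.
    assert(text(Pair("a", 1)) == "a=1");
}
```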

July 29, 2015

Re: std.data.json formal review

Posted by Walter Bright
in reply to Walter Bright

On 7/28/2015 3:55 PM, Walter Bright wrote:
>> OTOH, some people might want the option of parser-driven data processing
>> instead (e.g. the JSON data is very large and we don't want to store the
>> whole thing in memory at once).
>
> That is a good point.

So it appears that JSON can be in one of 3 useful states:

1. a range of characters (rc)
2. a range of nodes (rn)
3. a container of JSON values (values)

What's necessary is simply the ability to convert between these states:

(names are just for illustration)

   rn = rc.toNodes();
   values = rn.toValues();
   rn = values.toNodes();
   rc = rn.toChars();

So, if I wanted to simply pretty print a JSON string s:

   s.toNodes.toChars();

I.e. it's all composable.

July 29, 2015

Re: std.data.json formal review

Posted by H. S. Teoh
in reply to Walter Bright

On Tue, Jul 28, 2015 at 10:43:20PM -0700, Walter Bright via Digitalmars-d wrote:
> On 7/28/2015 3:55 PM, Walter Bright wrote:
> >>OTOH, some people might want the option of parser-driven data processing instead (e.g. the JSON data is very large and we don't want to store the whole thing in memory at once).
> >
> >That is a good point.
> 
> So it appears that JSON can be in one of 3 useful states:
> 
> 1. a range of characters (rc)
> 2. a range of nodes (rn)
> 3. a container of JSON values (values)
[...]

How does a linear range of nodes convey a nested structure?


T

-- 
Let's call it an accidental feature. -- Larry Wall
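
One common answer to the question above, sketched here under assumptions: pull parsers typically emit explicit start and end nodes for objects and arrays, so the nesting can be recovered from a flat sequence by counting. The `Kind` and `Node` names are hypothetical and chosen for the illustration only; they are not taken from the library under review.

```d
// Hypothetical node kinds for a flat, linear JSON node stream.
enum Kind { objectStart, objectEnd, arrayStart, arrayEnd, key, literal }

struct Node
{
    Kind kind;
    string text; // key name or literal text; empty for structural nodes
}

/// Returns the nesting depth of every node in a flat node sequence.
int[] depths(const Node[] nodes)
{
    int[] result;
    int depth = 0;
    foreach (node; nodes)
    {
        final switch (node.kind)
        {
            case Kind.objectStart, Kind.arrayStart:
                result ~= depth++; // record, then enter the container
                break;
            case Kind.objectEnd, Kind.arrayEnd:
                result ~= --depth; // leave the container, then record
                break;
            case Kind.key, Kind.literal:
                result ~= depth;
                break;
        }
    }
    return result;
}

unittest
{
    // Flat node sequence for {"a": [1, 2]}
    auto nodes = [
        Node(Kind.objectStart), Node(Kind.key, "a"),
        Node(Kind.arrayStart), Node(Kind.literal, "1"), Node(Kind.literal, "2"),
        Node(Kind.arrayEnd), Node(Kind.objectEnd),
    ];
    assert(depths(nodes) == [0, 1, 1, 2, 2, 1, 0]);
}
```

The stream stays linear, so it can be processed lazily; a consumer that wants a tree rebuilds it by pushing on each start node and popping on each end node.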

July 29, 2015

Re: std.data.json formal review

Posted by Andrea Fontana
in reply to Atila Neves

On Tuesday, 28 July 2015 at 14:07:19 UTC, Atila Neves wrote:
> Start of the two week process, folks.
>
> Code: https://github.com/s-ludwig/std_data_json
> Docs: http://s-ludwig.github.io/std_data_json/
>
> Atila

Why not provide a shortcut like:

jv.opt("/this/is/a/path") ?

I use it in my json/bson binding.

Anyway, opt(...).isNull returns true if that sub-object doesn't exist.
How can I check instead whether that sub-object exists and is actually null?

Something like:  { "a" : { "b" : null} } ?

It would be nice to have a way to get a default value if the path doesn't exist.
In my library, which behaves differently, I write:

Object is: { address : { number: 15 } }

// as!xxx tries to get a value of that type; if it can't, it tries to convert it using .to!xxx; if that fails too, it returns the default

// Converted as string
assert(obj["/address/number"].as!string == "15");

// This doesn't exist
assert(obj["/address/asdasd"].as!int == int.init);

// A default value is specified		
assert(obj["/address/asdasd"].as!int(50) == 50);

// A default value is specified (but value exists)
assert(obj["/address/number"].as!int(50) == 15);

// This doesn't exist
assert(!obj["address"]["number"]["this"].exists);

My library also has get!xxx (which throws an exception if the value is not of type xxx) and to!xxx (which throws an exception if the value can't be converted to xxx).

Another feature:
// This field doesn't exist; return the default value
auto tmpField = obj["/address/asdasd"].as!int(50);
assert(tmpField.error == true);   // Value is defaulted ...
assert(tmpField.exists == false); // ... because it doesn't exist
assert(tmpField == 50);

// This field exists, but can't be converted to int. Return default value.
tmpField = obj["/tags/0"].as!int(50);
assert(tmpField.error == true);   // Value is defaulted ...
assert(tmpField.exists == true);  // ... but a field is actually here
assert(tmpField == 50);
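
The "missing versus explicitly null" distinction and the default-value lookup can be sketched with Phobos's existing `std.json` module (not the library under review, and not the binding described in this post). The `lookup` and `getOr` helpers are hypothetical names introduced for the illustration.

```d
import std.algorithm.iteration : splitter;
import std.json : JSONType, JSONValue, parseJSON;

/// Hypothetical helper: returns a pointer to the node at `path`,
/// or null if any path component is missing.
const(JSONValue)* lookup(ref const JSONValue root, string path)
{
    const(JSONValue)* cur = &root;
    foreach (part; path.splitter('/'))
    {
        if (part.length == 0)
            continue; // skip the empty component before the leading '/'
        if (cur.type != JSONType.object)
            return null;
        cur = part in *cur;
        if (cur is null)
            return null;
    }
    return cur;
}

/// Hypothetical helper: integer at `path`, or `fallback` if missing or not an integer.
long getOr(ref const JSONValue root, string path, long fallback)
{
    auto p = lookup(root, path);
    return (p !is null && p.type == JSONType.integer) ? p.integer : fallback;
}

unittest
{
    auto doc = parseJSON(`{"a": {"b": null}, "address": {"number": 15}}`);

    // Present but null: non-null pointer to a node of type null_.
    assert(lookup(doc, "/a/b") !is null);
    assert(lookup(doc, "/a/b").type == JSONType.null_);

    // Missing: null pointer.
    assert(lookup(doc, "/a/c") is null);

    assert(getOr(doc, "/address/number", 50) == 15);
    assert(getOr(doc, "/address/asdasd", 50) == 50);
}
```

Returning a pointer keeps "absent" and "present but null" as two separate outcomes, which a single `isNull` check cannot express.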

July 29, 2015

Re: std.data.json formal review

Posted by Sönke Ludwig
in reply to Walter Bright

On 29.07.2015 at 00:29, Walter Bright wrote:
> On 7/28/2015 7:07 AM, Atila Neves wrote:
>> Start of the two week process, folks.
>
> Thank you very much, Sönke, for taking this on. Thank you, Atila, for
> taking on the thankless job of being review manager.
>
> Just looking at the documentation only, some general notes:
>
> 1. Not sure that 'JSON' needs to be embedded in the public names.
> 'parseJSONStream' should just be 'parseStream', etc. Name
> disambiguation, if needed, should be ably taken care of by a number of D
> features for that purpose. Additionally, I presume that the stdx.data
> package implies a number of different formats. These formats should all
> use the same names with as similar as possible APIs - this won't work
> too well if JSON is embedded in the APIs.

This is actually one of my pet peeves. Having a *readable* API that tells the reader immediately what happens is IMO one of the most important aspects (far more important than an API that allows quick typing). A number of times I've seen D code that omits part of what it actually does in its name and the result was that it was constantly necessary to scroll up to see where a particular name might come from. So I have a strong preference to keep "JSON", because it's an integral part of the semantics.

>
> 2. JSON is a trivial format, http://json.org/. But I count 6 files and
> 30 names in the public API.

The whole thing provides a stream parser with high level helpers to make it convenient to use, a DOM module, a separate lexer and a generator module that operates in various different modes (maybe two additional modes still to come!). Every single function provides real and frequently useful benefits. So if anything, there are still some little things missing.

All in all, even if JSON may be a simple format, the source code is already almost 5k LOC (including unit tests, of course). Apart from maintainability, the modules have mainly been separated to minimize the amount of code that needs to be dragged in for a particular piece of functionality (not only other JSON modules, but also different parts of Phobos).

>
> 3. Stepping back a bit, when I think of parsing JSON data, I think:
>
>      auto ast = inputrange.toJSON();
>
> where toJSON() accepts an input range and produces a container, the ast.
> The ast is just a JSON value. Then, I can just query the ast to see what
> kind of value it is (using overloading), and walk it as necessary.

We can drop the "Value" part of the name of course, if we expect that function to be used a lot, but there is still the parseJSONStream function, which is arguably no less important. BTW, you have only mentioned the DOM part so far, but for any code where performance is a priority, the stream-based pull parser is basically the way to go. This would also be the natural entry point for any serialization library.

And my prediction is, if we do it right, that working with JSON will in most cases simply mean "S s = deserializeJSON(json_input);", where S is a D struct that gets populated with the deserialized JSON data. Where that doesn't fit, performance oriented code would use the pull parser. So the DOM part of the system, which is the only thing the current JSON module has, will only be left as a niche functionality.

> To create output:
>
>      auto r = ast.toChars();  // r is an InputRange of characters
>      writeln(r);

Do we have an InputRange version of the various number-to-string conversions? It would be quite inconvenient to reinvent those (double, long, BigInt) in the JSON package. Of course, using to!string internally would be an option, but it would obviously destroy all @nogc opportunities and performance benefits.
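
For integral values, at least, current Phobos provides `std.conv.toChars`, which returns the digits as a range of characters instead of an allocated string. The sketch below shows only that function; it does not cover `double` or `BigInt`, which are the harder cases raised above.

```d
import std.algorithm.comparison : equal;
import std.conv : toChars;

unittest
{
    // toChars yields the characters as a range; no string is built.
    assert(toChars(1234).equal("1234"));
    assert(toChars(-42L).equal("-42"));

    // The radix is a compile-time parameter.
    assert(toChars!16(255u).equal("ff"));
}
```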

>
> So, we'll need:
>      toJSON
>      toChars
>      JSONException
>
> The possible JSON values are:
>      string
>      number
>      object (associative arrays)
>      array
>      true
>      false
>      null
>
> Since these are D builtin types, they can actually be a simple union of
> D builtin types.

The idea is to have JSONValue be a simple alias to Algebraic!(...), just that there are currently still some workarounds for DMD < 2.067.0 on top, which means that JSONValue is a struct that "alias this" inherits from Algebraic for the time being. Those workarounds will be removed when the code is actually put into Phobos.

But a simple union would obviously not be enough, it still needs a type tag of some form and needs to provide a @safe interface on top of it. Algebraic is the only thing that comes close right now, but I'd really prefer to have a fully statically typed version of Algebraic that uses an enum as the type tag instead of working with delegates/typeinfo.
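
A minimal sketch of the idea in the paragraph above, under stated assumptions: a union with an enum type tag and a checked interface on top. `Value`, its `Kind` enum, and `describe` are hypothetical names for the illustration; this is neither `Algebraic` nor the `JSONValue` type under review, and it covers only a few of the JSON kinds.

```d
// Hypothetical enum-tagged union; not the JSONValue from the library.
struct Value
{
    enum Kind { null_, boolean, number, text }

    private Kind kind_ = Kind.null_;
    private union
    {
        bool boolean_;
        double number_;
        string text_;
    }

    // The constructors are the only places that set tag and payload together.
    this(bool v) @trusted { kind_ = Kind.boolean; boolean_ = v; }
    this(double v) @trusted { kind_ = Kind.number; number_ = v; }
    this(string v) @trusted { kind_ = Kind.text; text_ = v; }

    Kind kind() const @safe { return kind_; }

    // Accessors check the tag before touching the overlapping storage.
    double number() const @trusted
    {
        assert(kind_ == Kind.number, "Value does not hold a number");
        return number_;
    }

    string text() const @trusted
    {
        assert(kind_ == Kind.text, "Value does not hold a string");
        return text_;
    }
}

// `final switch` makes the compiler reject a missing case.
string describe(Value v) @safe
{
    final switch (v.kind)
    {
        case Value.Kind.null_:   return "null";
        case Value.Kind.boolean: return "boolean";
        case Value.Kind.number:  return "number";
        case Value.Kind.text:    return "string";
    }
}

unittest
{
    assert(Value.init.kind == Value.Kind.null_);
    assert(Value(3.5).number == 3.5);
    assert(Value("hi").text == "hi");
    assert(describe(Value(true)) == "boolean");
}
```

The enum tag is what allows the exhaustive `final switch`, which a typeinfo-based tag cannot offer at compile time.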

>
> There is a decision needed about whether toJSON() allocates data or
> returns slices into its inputrange. This can be 'static if' tested by:
> if inputrange can return immutable slices.

The test is currently "is(T == string) || is (T == immutable(ubyte)[])", but slicing is done in those cases and the non-DOM parser interface is even @nogc as long as exceptions are disabled.
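
The compile-time test quoted above can be sketched as follows. `canSlice` and `firstBytes` are hypothetical names, and the example is restricted to arrays for brevity, whereas the real parser accepts general input ranges.

```d
import std.traits : isDynamicArray;

// Same condition as quoted in the post: only immutable input is safe to slice.
enum canSlice(T) = is(T == string) || is(T == immutable(ubyte)[]);

/// Hypothetical helper: returns the first `n` elements, slicing when the
/// input is immutable and copying otherwise.
auto firstBytes(T)(T input, size_t n)
if (isDynamicArray!T)
{
    static if (canSlice!T)
        return input[0 .. n];      // no allocation: the result aliases the input
    else
        return input[0 .. n].idup; // GC-allocated immutable copy
}

unittest
{
    string s = "hello";
    auto a = firstBytes(s, 2);
    assert(a == "he" && a.ptr is s.ptr);   // sliced

    char[] m = "hello".dup;
    auto b = firstBytes(m, 2);
    assert(b == "he" && b.ptr !is m.ptr);  // copied
    static assert(is(typeof(b) == string));
}
```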

> toChars() can take a compile
> time argument to determine if it is 'pretty' or not.

As long as JSON DOM values are stored in a generic Algebraic (which is a huge win in terms of interoperability!), toChars won't suffice as a name. It would have to be toJSON(Chars) (as it basically is now). I gave the "pretty" version a separate name simply because it's more convenient to use and pretty printing will probably be by far the most frequently used option when converting to a string.

July 29, 2015

Re: std.data.json formal review

Posted by Sönke Ludwig
in reply to H. S. Teoh

On 29.07.2015 at 00:37, H. S. Teoh via Digitalmars-d wrote:
> On Tue, Jul 28, 2015 at 03:29:02PM -0700, Walter Bright via Digitalmars-d wrote:
> [...]
>> 3. Stepping back a bit, when I think of parsing JSON data, I think:
>>
>>      auto ast = inputrange.toJSON();
>>
>> where toJSON() accepts an input range and produces a container, the
>> ast. The ast is just a JSON value. Then, I can just query the ast to
>> see what kind of value it is (using overloading), and walk it as
>> necessary.
>
> +1. The API should be as simple as possible.

http://s-ludwig.github.io/std_data_json/stdx/data/json/parser/toJSONValue.html

>
> Ideally, I'd say hook it up to std.conv.to for maximum flexibility. Then
> you can just use to() to convert between a JSON container and the value
> that it represents (assuming the types are compatible).

We could maybe do that if we keep the current JSONValue as a struct wrapper around Algebraic. But I guess that this would create an ambiguity between JSONValue("...") parsing a JSON string and JSONValue("...") being constructed as a JSON string value. Or does to! hook into something other than the constructor?

>
> OTOH, some people might want the option of parser-driven data processing
> instead (e.g. the JSON data is very large and we don't want to store the
> whole thing in memory at once). I'm not sure what a good API for that
> would be, though.

See http://s-ludwig.github.io/std_data_json/stdx/data/json/parser/parseJSONStream.html
and the various UFCS "read" and "skip" functions in http://s-ludwig.github.io/std_data_json/stdx/data/json/parser.html
