Thread overview
Combining JSON arrays into a single JSON array -- better way than this?
Aug 01, 2022
ikelaiah
Aug 01, 2022
frame
Aug 01, 2022
ikelaiah
Aug 01, 2022
frame
Aug 01, 2022
ikelaiah
Aug 01, 2022
Christian Köstlin
Aug 01, 2022
ikelaiah
Aug 01, 2022
Salih Dincer
Aug 01, 2022
ikelaiah
Aug 01, 2022
kdevel
August 01, 2022

Hi,

I've written a CLI tool to merge the JSON files in the current folder (each containing a JSON array) into a single JSON file.

My algorithm:

  1. Create a string to store the output JSON array as a string,
  2. read each file,
  3. read each object in the JSON array from the input file,
  4. append the string representation of each JSON object to the output string,
  5. parse the resulting string as JSON,
  6. save the string representation of the result of step 5.

Question

While this works, is there a more concise way than this?
Thank you for your time.

module combinejsonv2;

import std.file;
import std.stdio;
import std.json;
import std.array;
import std.algorithm.searching;

void main()
{
    // can't I use this for JSON array?
    JSONValue jj;

    // create a string opening [ for JSON array
    string stringResult = "[";

    foreach (string filename; dirEntries(".", "*.json", SpanMode.shallow))
    {
        // if filename contains 'output' in it, ignore
        if(canFind(filename, "output")) {
            std.stdio.writeln("ignoring: " ~ filename);
            continue;
        }

        // show status to console
        std.stdio.writeln("processing: " ~ filename);

        // read JSON file as string
        string content = std.file.readText(filename);
        // parse as JSON
        JSONValue j = parseJSON(content);

        foreach (JSONValue jsonObject; j.array) {
            // Combine objects from files into a single list of JSON object
            // std.stdio.writeln(jsonObject.toPrettyString);
            stringResult ~= jsonObject.toString;
            stringResult ~= ",";
        }
    }

    // create closing ] for the JSON array
    stringResult ~= "]";

    // parse the string as a JSON object
    JSONValue jsonResult = parseJSON(stringResult);

    // write to file
    std.file.write("output-combined.json", jsonResult.toPrettyString);
}

August 01, 2022

On Monday, 1 August 2022 at 04:24:41 UTC, ikelaiah wrote:

> Hi,
>
> I've written a CLI tool to merge the JSON files in the current folder (each containing a JSON array) into a single JSON file.
>
> My algorithm:
>
>   1. Create a string to store the output JSON array as a string,
>   2. read each file,
>   3. read each object in the JSON array from the input file,
>   4. append the string representation of each JSON object to the output string,
>   5. parse the resulting string as JSON,
>   6. save the string representation of the result of step 5.

If the JSON files are already parsed, why do you stringify and reparse them? The assignment operator of JSONValue lets you use it as an associative array (JSON object) or as a plain array (JSON array). Each member of such an array has the same type: JSONValue. So to merge one array into another, you can simply iterate over its members and append each one to the target array. To initialize a JSONValue as an empty array, just use jsonResult.array = [];

A plain array should also be mergeable via jsonResult.array ~= j.array (if you are not sure that j really is an array, you need to check its type first).
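A minimal sketch of that suggestion, using in-memory strings instead of files so it can be run on its own. The helper name mergeArrays is made up for illustration; only parseJSON, JSONValue, JSONType and the .array property come from std.json.

```d
import std.json;

// Hypothetical helper: merges already-parsed documents, skipping non-arrays.
JSONValue mergeArrays(JSONValue[] inputs)
{
    JSONValue jsonResult;
    jsonResult.array = [];               // start from an empty JSON array
    foreach (j; inputs)
    {
        if (j.type == JSONType.array)    // check the type before touching .array
            jsonResult.array ~= j.array; // append all members at once
    }
    return jsonResult;
}

void main()
{
    auto merged = mergeArrays([
        parseJSON(`[1, 2]`),
        parseJSON(`{"not": "an array"}`), // skipped by the type check
        parseJSON(`[{"x": 3}, "four"]`),
    ]);
    assert(merged.array.length == 4);
    assert(merged[0].integer == 1);
    assert(merged[3].str == "four");
}
```

No string round trip is involved: the members stay JSONValue objects the whole time.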

August 01, 2022

On 2022-08-01 06:24, ikelaiah wrote:

> [...]
> While this works, is there a more concise way than this?
> [...]

An arguably shorter solution (that drops some of your logging) could be:

import std;

void main() {
    dirEntries(".", "*.json", SpanMode.shallow)
        .filter!(f => !f.name.canFind("output"))
        .map!(readText)
        .map!(parseJSON)
        .fold!((result, json) { result ~= json.array; return result; })
        .toPrettyString
        .reverseArgs!(std.file.write)("output-combined.json");
}

not sure if you are looking for this style though.

kind regards,
Christian
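One thing to keep in mind with this style: as far as I know, fold without an explicit seed uses the first element as the starting value and fails on an empty range, so a folder without matching files would be an error. Below is a sketch of the same fold with an explicit empty-array seed. It works on in-memory strings so it runs without any files, and the helper name mergeDocs is made up for illustration.

```d
import std.algorithm : fold, map;
import std.json;

// Hypothetical helper: parses each document and folds the arrays together.
// Every document is assumed to be a JSON array; .array throws otherwise.
JSONValue mergeDocs(string[] docs)
{
    JSONValue seed;
    seed.array = [];                     // explicit seed: an empty JSON array

    return docs
        .map!(s => parseJSON(s))
        .fold!((result, json) { result ~= json.array; return result; })(seed);
}

void main()
{
    auto merged = mergeDocs([`[1, 2]`, `[3]`, `[{"k": "v"}]`]);
    assert(merged.array.length == 4);

    // With a seed, an empty input yields an empty array instead of an error.
    auto empty = mergeDocs([]);
    assert(empty.type == JSONType.array);
    assert(empty.array.length == 0);
}
```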

August 01, 2022

On Monday, 1 August 2022 at 07:35:34 UTC, Christian Köstlin wrote:

> An arguably shorter solution (that drops some of your logging) could be:
> [...]

Hi Christian,

So we can do that in D?!
Thanks for sharing this amazing approach.

-ikelaiah

August 01, 2022

On Monday, 1 August 2022 at 05:52:36 UTC, frame wrote:

> If the JSON files are already parsed, why do you stringify and reparse them?

Because of ...

  1. mental block and
  2. I didn't know jsonResult.array = [];

Many thanks for pointing this out. I tried the following, which didn't work, hence my earlier convoluted approach:

  • JSONValue[] jsonResult;
  • JSONValue jsonResult = [];

> A plain array should be also mergeable via jsonResult.array ~= j.array (if j really is an array, you need to check the type first)

Based on your suggestion, the snippet is now shorter.

module combinejsonv3;

import std.file;
import std.stdio;
import std.json;
import std.array;
import std.algorithm.searching;

void main()
{
    // merged JSON to be stored here
    JSONValue jsonResult;
    jsonResult.array = [];

    foreach (string filename; dirEntries(".", "*.json", SpanMode.shallow))
    {
        // if filename contains 'output' in it, ignore
        if(canFind(filename, "output")) {
            std.stdio.writeln("ignoring: " ~ filename);
            continue;
        }

        // read JSON file as string
        string content = std.file.readText(filename);

        // parse as JSON
        JSONValue j = parseJSON(content);

        // if JSONType is array, merge
        if(j.type == JSONType.array) {
            // show status to console
            std.stdio.writeln("processing JSON array from: " ~ filename);
            jsonResult.array ~= j.array;
        }
    }

    // write to file
    std.file.write("output-combined.json", jsonResult.toPrettyString);
}

Thank you!

-ikel

August 01, 2022

On Monday, 1 August 2022 at 09:01:35 UTC, ikelaiah wrote:

> Based on your suggestion, the snippet is now shorter.

Your string attempt wasn't that bad, though, because loading everything into memory when that isn't necessary wastes memory if you have large files to process. I would just process each file and write its content to the output file directly, but it always depends on the intended purpose.
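A rough sketch of what writing directly could look like, assuming the output is any output range that accepts strings. The helper name mergeInto and the appender used as a stand-in for the output file are illustrative choices, not something from the posts above. Each input is still parsed as a whole, but only one input is held in memory at a time.

```d
import std.array : appender;
import std.json;

// Hypothetical helper: writes the members of each JSON array document
// straight to `sink`, without building a combined JSONValue first.
void mergeInto(Sink)(ref Sink sink, string[] documents)
{
    bool first = true;
    sink.put("[");
    foreach (doc; documents)
    {
        auto j = parseJSON(doc);
        if (j.type != JSONType.array)
            continue;                    // skip anything that is not an array
        foreach (element; j.array)
        {
            if (!first)
                sink.put(",");           // separator only between members
            sink.put(element.toString);
            first = false;
        }
    }
    sink.put("]");
}

void main()
{
    auto sink = appender!string;         // stand-in for the output file
    mergeInto(sink, [`[1, 2]`, `[]`, `[{"a": true}]`]);

    // The streamed text is itself a valid JSON array with three members.
    auto check = parseJSON(sink.data);
    assert(check.type == JSONType.array);
    assert(check.array.length == 3);
}
```

Writing the separator before every member except the first also avoids a trailing comma in the output.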

August 01, 2022

On Monday, 1 August 2022 at 08:48:28 UTC, ikelaiah wrote:

> On Monday, 1 August 2022 at 07:35:34 UTC, Christian Köstlin wrote:
>
>> An arguably shorter solution (that drops some of your logging) could be:
>> [...]
>
> So we can do that in D?!
> Thanks for sharing this amazing approach.

To print the steps, use a selective import: import std.stdio : writeln; That way it does not conflict with std.file.

SDB@79
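A small sketch of the selective-import idea. Both std.stdio and std.file define a function named write, so importing both modules wholesale makes an unqualified write ambiguous; importing only the names that are needed avoids that. The helper roundTrip and the scratch file name are made up for the demonstration.

```d
import std.file : readText, remove, tempDir, write;
import std.path : buildPath;
import std.stdio : writeln;

// Hypothetical helper: writes text to a scratch file and reads it back.
string roundTrip(string text)
{
    auto path = buildPath(tempDir, "selective-import-demo.json");
    write(path, text);            // only std.file.write is in scope
    scope (exit) remove(path);    // clean up the scratch file afterwards
    writeln("wrote: ", path);     // only writeln was taken from std.stdio
    return readText(path);
}

void main()
{
    assert(roundTrip(`[1, 2, 3]`) == `[1, 2, 3]`);
}
```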

August 01, 2022

On Monday, 1 August 2022 at 07:35:34 UTC, Christian Köstlin wrote:
[...]

> An arguably shorter solution (that drops some of your logging) could be:
>
>     [...]
>         .map!(parseJSON)
>     [...]

Is there an implementation which does not interpret the objects in the array? For example:

Case 1

[{"A":"A","A":"B"}] -> [
    {
        "A": "B"
    }
]

Case 2

[1.0000000000000001] -> [
    1.0
]

Case 3

[99999999999999999999] -> std.conv.ConvOverflowException@...Overflow in integral conversion

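One possible direction, sketched here under a strong assumption: every input is already a well-formed JSON document whose top-level value is an array. In that case the arrays can be spliced as text, so the members are never parsed and come out byte for byte as they went in. The helper names innerText and spliceArrays are made up, and nothing in this sketch validates the JSON.

```d
import std.array : join;
import std.exception : enforce;
import std.string : strip;

// Hypothetical helper: returns the text between the outer [ and ] of a
// document that is assumed to be a top-level JSON array. No parsing is done.
string innerText(string doc)
{
    auto s = doc.strip;
    enforce(s.length >= 2 && s[0] == '[' && s[$ - 1] == ']',
            "expected a top-level JSON array");
    return s[1 .. $ - 1].strip;
}

// Hypothetical helper: splices several array documents into one array text.
string spliceArrays(string[] docs)
{
    string[] parts;
    foreach (doc; docs)
    {
        auto inner = innerText(doc);
        if (inner.length)                // skip empty arrays: no stray commas
            parts ~= inner;
    }
    return "[" ~ parts.join(",") ~ "]";
}

void main()
{
    auto merged = spliceArrays([
        `[{"A":"A","A":"B"}]`,
        `[1.0000000000000001]`,
        `[]`,
        ` [99999999999999999999] `,
    ]);
    // Duplicate keys, float digits and the large integer are left untouched.
    assert(merged == `[{"A":"A","A":"B"},1.0000000000000001,99999999999999999999]`);
}
```

The trade-off is that malformed input passes through unnoticed, so this only fits when the inputs are trusted or validated elsewhere.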
August 01, 2022

On Monday, 1 August 2022 at 16:11:52 UTC, frame wrote:

> ... because loading all in memory while not necessary is wasted memory if you have large files to process.

Noted with thanks. Actually, some JSON input files are pretty large (>1 MB).

> I would just process each file and write it to the file directly but it always depends on the intended purpose.

Noted as well. Thank you, this will come in handy in the future.

-ikel

August 01, 2022

On Monday, 1 August 2022 at 17:02:10 UTC, Salih Dincer wrote:

> To print the steps, use a selective import: import std.stdio : writeln; That way it does not conflict with std.file.
>
> SDB@79

Will do and thank you for this.