Thread overview
csvReader: how to read only selected columns while the class Layout has an extra field?
Oct 02, 2022  mw
Oct 02, 2022  mw
Oct 02, 2022  rassoc
Oct 02, 2022  mw
Oct 02, 2022  rassoc
Oct 03, 2022  jmh530
Oct 03, 2022  Salih Dincer
Oct 03, 2022  mw
October 02, 2022

Hi,

I'm following the example on

https://dlang.org/phobos/std_csv.html

    class Layout
    {
        int value;
        double other;
        string name;
        // int extra_field;  // un-comment to see the error
    }

    void main()
    {
        import std.csv;
        import std.stdio: write, writeln, writef, writefln;
        import std.algorithm.comparison : equal;
        string text = "a,b,c\nHello,65,2.5\nWorld,123,7.5";

        auto records =
            text.csvReader!Layout(["b","c","a"]);  // Read only these columns
        foreach (r; records) writeln(r.name);
    }

This works fine so far, but if I un-comment the extra_field line, I get a runtime error:

core.exception.ArrayIndexError@/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d(1209): index [3] is out of bounds for array of length 3
----------------
??:? _d_arraybounds_indexp [0x5565b4b974d1]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:1209 pure @safe void std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader.prime() [0x5565b4b73ed2]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:1154 pure @safe void std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader.popFront() [0x5565b4b73c80]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:1069 pure ref @safe std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader.__ctor(immutable(char)[], immutable(char)[][], dchar, dchar, bool) [0x5565b4b73ae8]
/dlang/dmd/linux/bin64/../../src/phobos/std/csv.d:366 pure @safe std.csv.CsvReader!(onlineapp.Layout, 1, immutable(char)[], dchar, immutable(char)[][]).CsvReader std.csv.csvReader!(onlineapp.Layout, 1, immutable(char)[], immutable(char)[][], char).csvReader(immutable(char)[], immutable(char)[][], char, char, bool) [0x5565b4b735f3]
./onlineapp.d:18 _Dmain [0x5565b4b72ca4]

I'm just wondering how to work around this?

Thanks.

October 02, 2022
        text.csvReader!Layout(["b","c","a"]);  // Read only these columns

The intention is very clear: only read the selected columns from the csv, and for any other fields of class Layout, just ignore (with the default D .init value).

October 02, 2022
On 10/2/22 21:48, mw via Digitalmars-d-learn wrote:
> ```
>          text.csvReader!Layout(["b","c","a"]);  // Read only these columns
> ```
> 
> The intention is very clear: only read the selected columns from the csv, and for any other fields of class Layout, just ignore (with the default D .init value).
> 

Here's why it's not currently working:

"An optional header can be provided. The first record will be read in as the header. If Contents is a struct then the header provided is expected to correspond to the fields in the struct."

"expected to correspond" means that the number of fields in the content struct can't exceed the header element count as you can see in the actual code [1]:

```
foreach (ti, ToType; Fields!(Contents))
{
    if (indices[ti] == colIndex) // indices.length depends on passed in colHeaders.length
    ...
}
```

The current index exception is bad; this needs an assert in the constructor with a nicer error message.
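
As an illustration of what such a check could look like from user code, here is a minimal sketch. The names `checkedCsvReader`, `ThreeFields`, and `FourFields` are hypothetical and not part of Phobos. The wrapper only assumes what is described above: the failure occurs when the content type has more fields than the header has entries.

```d
import std.csv : csvReader;
import std.exception : enforce;
import std.format : format;
import std.traits : Fields;

/// Hypothetical wrapper (not part of Phobos): validates the header length
/// before handing the input over to std.csv.csvReader.
auto checkedCsvReader(Contents)(string input, string[] header)
{
    enum fieldCount = Fields!Contents.length;
    enforce(fieldCount <= header.length,
        format("%s has %s fields but only %s column headers were given",
               Contents.stringof, fieldCount, header.length));
    return csvReader!Contents(input, header);
}

// Illustrative types: one matches the header, one has an extra field.
struct ThreeFields { int value; double other; string name; }
struct FourFields  { int value; double other; string name; int extra_field; }

void main()
{
    import std.exception : assertThrown;
    import std.stdio : writeln;

    string text = "a,b,c\nHello,65,2.5\nWorld,123,7.5";

    // Field count matches the header: records are read as usual.
    foreach (r; checkedCsvReader!ThreeFields(text, ["b", "c", "a"]))
        writeln(r.name);

    // One field too many: a descriptive Exception instead of an index error.
    assertThrown(checkedCsvReader!FourFields(text, ["b", "c", "a"]));
}
```

This does not make the extra field work; it only replaces the out-of-bounds error with a message that names the mismatch.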

But say, I'm curious, what's the purpose of adding an optional/useless contents field? What's the use-case here?

[1] https://github.com/dlang/phobos/blob/8e8aaae5080ccc2e0a2202cbe9778dca96496a95/std/csv.d#L1209
October 02, 2022
On Sunday, 2 October 2022 at 21:03:40 UTC, rassoc wrote:

> But say, I'm curious, what's the purpose of adding an optional/useless contents field? What's the use-case here?


We have a class/struct for a data record. Some of its fields need to be saved to and loaded from CSV files, while other helper fields are used for various computation tasks (e.g. caching some intermediate computation results) and do not need to be saved to or loaded from the CSV files.


A CSV library should consider all the use cases, and allow users to ignore certain fields.
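
Until the library supports this, one possible workaround is to read into a small struct that holds only the CSV-backed fields and then copy it into the full class. A minimal sketch, where `CsvRow` and `loadLayouts` are hypothetical names introduced for illustration:

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.csv : csvReader;

// Hypothetical split: CsvRow holds exactly the fields stored in the file.
struct CsvRow
{
    int value;
    double other;
    string name;
}

class Layout
{
    int value;
    double other;
    string name;
    int extra_field; // helper field, never read from the CSV

    // Copy the CSV-backed fields; extra_field is left untouched.
    this(CsvRow row)
    {
        value = row.value;
        other = row.other;
        name = row.name;
    }
}

// Parse with the three-field struct, then build the full objects.
Layout[] loadLayouts(string text)
{
    return text.csvReader!CsvRow(["b", "c", "a"])
               .map!(row => new Layout(row))
               .array;
}

void main()
{
    import std.stdio : writeln;

    foreach (l; loadLayouts("a,b,c\nHello,65,2.5\nWorld,123,7.5"))
        writeln(l.name, " ", l.value, " ", l.other, " ", l.extra_field);
}
```

The helper field keeps its `.init` value, at the cost of listing the CSV-backed fields twice.
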
October 02, 2022
On 10/2/22 23:18, mw via Digitalmars-d-learn wrote:
> A CSV library should consider all the use cases, and allow users to ignore certain fields.

Filed issue: https://issues.dlang.org/show_bug.cgi?id=23383

Let's see what others have to say.
October 03, 2022
On Sunday, 2 October 2022 at 21:18:43 UTC, mw wrote:
> [snipping]
>
> A CSV library should consider all the use cases, and allow users to ignore certain fields.

In R, you have to set `colClasses` to `"NULL"` for each column you want to skip. In other words, the user has to know the number of columns in the CSV file in order to be able to skip any of them.
https://stackoverflow.com/questions/29332713/how-to-skip-column-when-reading-csv-file
October 03, 2022

On Sunday, 2 October 2022 at 19:48:52 UTC, mw wrote:

>     text.csvReader!Layout(["b","c","a"]);  // Read only these columns
>
> The intention is very clear: only read the selected columns from the csv, and for any other fields of class Layout, just ignore (with the default D .init value).

Why don't you do it like this? For example, you can try the following:

```d
import std.csv, std.math.algebraic : abs;

void main()
{
    string str = "a,b,c\nHello,65,63.63\n➊➋➂❹,123,3673.562";
    struct Layout
    {
        int value;
        double other;
        string name;
    }

    // Header order maps columns b, c, a onto value, other, name.
    auto records = csvReader!Layout(str, ["b","c","a"]);

    // Expected records, in file order.
    Layout[2] ans;
    ans[0].name = "Hello";
    ans[0].value = 65;
    ans[0].other = 63.63;
    ans[1].name = "➊➋➂❹";
    ans[1].value = 123;
    ans[1].other = 3673.562;

    int count;
    foreach (record; records)
    {
        assert(ans[count].name == record.name);
        assert(ans[count].value == record.value);
        assert(abs(ans[count].other - record.other) < 0.00001);
        count++;
    }
    assert(count == ans.length);
}
```

SDB@79

October 03, 2022

On Monday, 3 October 2022 at 18:02:51 UTC, Salih Dincer wrote:

> On Sunday, 2 October 2022 at 19:48:52 UTC, mw wrote:
>
> >     text.csvReader!Layout(["b","c","a"]);  // Read only these columns
> >
> > The intention is very clear: only read the selected columns from the csv, and for any other fields of class Layout, just ignore (with the default D .init value).
>
> Why don't you do it like this? For example, you can try the following:
>
> ```d
> import std.csv, std.math.algebraic : abs;
>
>      string str = "a,b,c\nHello,65,63.63\n➊➋➂❹,123,3673.562";
>      struct Layout
>      {
>          int value;
>          double other;
>          string name;
>      }
> ```

You didn't get my question. Please add:

    int extra_field;  // un-comment to see the error

to the struct, then you will see the error.