Thread overview
nested csv into tsv
Mar 18, 2012
bioinfornatics
Mar 18, 2012
Jesse Phillips
Mar 18, 2012
bioinfornatics
Mar 18, 2012
Ali Çehreli
Mar 18, 2012
bioinfornatics
Mar 18, 2012
Ali Çehreli
March 18, 2012
Hello, I have this data:
________________________________
data1	data2	data3a;data3b;data3c
cata1	cata2	cata3a;cata3b;cata3c
tata1	tata2	tata3a;tata3b;tata3c
________________________________

Fields are separated by tabs, but the third field contains data separated by semicolons.

I tried:
________________________________
import std.csv;
import std.string;
import std.stdio;

struct Data{
    public:
        string field1;
        string field2;

    @property void field3( string field ){
        _field3 = field.split(";");
    }
    @property string[] field3(  ){
        return _field3;
    }

    private:
        string[] _field3;
}

void main(){
    Data[] result;
    File f = File( "data.csv", "r" );
    foreach( char[] line; f.byLine() ){
        result ~= csvReader!Data(line, '\t').front;
    }
}
________________________________


This builds fine but does not work at runtime:

________________________________
$ ./test_csv
std.csv.CSVException@/usr/include/d/std/csv.d(1047): Can't parse string:
"[" is missing
std.conv.ConvException@/usr/include/d/std/conv.d(2714): Can't parse
string: "[" is missing
std.conv.ConvException@/usr/include/d/std/conv.d(1597): Unexpected 'd'
when converting from type string to type string[]
________________________________



March 18, 2012
On Sunday, 18 March 2012 at 14:45:42 UTC, bioinfornatics wrote:
> ________________________________
> $ ./test_csv
> std.csv.CSVException@/usr/include/d/std/csv.d(1047): Can't parse string:
> "[" is missing
> std.conv.ConvException@/usr/include/d/std/conv.d(2714): Can't parse
> string: "[" is missing
> std.conv.ConvException@/usr/include/d/std/conv.d(1597): Unexpected 'd'
> when converting from type string to type string[]
> ________________________________

I'm going to hazard a guess that you have confused std.conv.to by using two different types for field3:

   @property void field3( string field ){
       _field3 = field.split(";");
   }
   @property string[] field3(  ){
       return _field3;
   }

The first says it is a string, the second a string[]. I assume that std.conv.to sees field3 as a string[] and is trying to convert a string to it. In that case it expects the string to be formatted as an array literal, e.g. ["this is an","array","of string"].
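
To illustrate that guess, here is a minimal sketch of how std.conv.to treats the two kinds of input. The helper name convertsToStringArray is made up for this example and is not part of Phobos; it only reports whether the conversion throws.

```d
import std.conv : ConvException, to;
import std.stdio : writeln;

// Hypothetical helper: true if `text` can be converted to string[]
// by std.conv.to, false if the conversion throws a ConvException.
bool convertsToStringArray(string text)
{
    try
    {
        auto parsed = text.to!(string[]);
        return true;
    }
    catch (ConvException e)
    {
        return false;
    }
}

void main()
{
    // Array-literal syntax, which is what the conversion looks for.
    writeln(convertsToStringArray(`["data3a", "data3b", "data3c"]`));

    // The raw column from the file starts with 'd', not '['.
    writeln(convertsToStringArray("data3a;data3b;data3c"));
}
```

The second input is the same text that triggers the "Unexpected 'd'" message in the trace above.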
March 18, 2012
On Sunday, 18 March 2012 at 16:53 +0100, Jesse Phillips wrote:
> On Sunday, 18 March 2012 at 14:45:42 UTC, bioinfornatics wrote:
> > ________________________________
> > $ ./test_csv
> > std.csv.CSVException@/usr/include/d/std/csv.d(1047): Can't parse string:
> > "[" is missing
> > std.conv.ConvException@/usr/include/d/std/conv.d(2714): Can't parse
> > string: "[" is missing
> > std.conv.ConvException@/usr/include/d/std/conv.d(1597): Unexpected 'd'
> > when converting from type string to type string[]
> > ________________________________
> 
> I'm going to hazard a guess that you have confused std.conv.to by using two different types for field3:
> 
>     @property void field3( string field ){
>         _field3 = field.split(";");
>     }
>     @property string[] field3(  ){
>         return _field3;
>     }
> 
> The first says it is a string, the second a string[]. I assume that std.conv.to sees field3 as a string[] and is trying to convert a string to it. In that case it expects the string to be formatted as an array literal, e.g. ["this is an","array","of string"].

If I do this:
________________________________
import std.csv;
import std.string;
import std.stdio;

struct Data{
    public:
        string field1;
        string field2;

    @property void field3( string field ){
        _field3 = field.split(";");
    }

    @property string field3( ){
        string result;
        foreach( item; _field3 )
            result ~= " %s;".format( item );
        return result;
    }

    @property attributes(){
        return  _field3;
    }

    private:
        string[] _field3;
}

void main(){
    Data[] result;
    File f = File( "data.csv", "r" );
    foreach( char[] line; f.byLine() ){
        result ~= csvReader!Data(line, '\t').front;
    }
}
________________________________


Same result

March 18, 2012
On 03/18/2012 07:45 AM, bioinfornatics wrote:
> Hello, I have this data:
> ________________________________
> data1	data2	data3a;data3b;data3c
> cata1	cata2	cata3a;cata3b;cata3c
> tata1	tata2	tata3a;tata3b;tata3c
> ________________________________
>
> Fields are separated by tabs, but the third field contains data separated by
> semicolons.
>
> I tried:
> ________________________________
> import std.csv;
> import std.string;
> import std.stdio;
>
> struct Data{
>      public:
>          string field1;
>          string field2;
>
>      @property void field3( string field ){
>          _field3 = field.split(";");
>      }
>      @property string[] field3(  ){
>          return _field3;
>      }

Besides the confusion that Jesse Phillips has pointed out, csvReader cannot decide to treat those two property functions as if they represent a member of Data.

>
>      private:
>          string[] _field3;

Data still has three members: field1, field2, and _field3.

The problem is, although the format clearly states that there are three strings that are delimited by '\t', the third field of the struct is not a string.

> }
>
> void main(){
>      Data[] result;
>      File f = File( "data.csv", "r" );
>      foreach( char[] line; f.byLine() ){
>          result ~= csvReader!Data(line, '\t').front;
>      }
> }

So the solution is that _field3 must be a string:

import std.csv;
import std.string;
import std.stdio;

struct Data{
    public:
        string field1;
        string field2;

    private:
        string _field3;
}

void main(){
    Data[] result;
    File f = File( "data.csv", "r" );
    foreach( char[] line; f.byLine() ){
        result ~= csvReader!Data(line, '\t').front;
    }

    writeln(result);
}

You must provide the properties on top of that:

import std.csv;
import std.string;
import std.stdio;

struct Data{
    public:
        string field1;
        string field2;

    void field3( string[] field ) @property {
        _field3 = field.join();
    }

    string[] field3(  ) @property {
        return _field3.split(";");
    }

    string toString() {
        return format("%s,%s,%s", field1, field2, field3);
    }

    private:
        string _field3;
}

void main(){
    Data[] result;
    File f = File( "data.csv", "r" );
    foreach( char[] line; f.byLine() ){
        result ~= csvReader!Data(line, '\t').front;
    }

    writeln(result);
}

Note that to avoid confusing the readers, the property functions both use string[], not string. (I've also put @property at the end of the function signature, which I started to favor recently.)

The optimizations can come after that. The following calls split() only when necessary:

import std.csv;
import std.string;
import std.stdio;

struct Data{
    public:
        string field1;
        string field2;

    void field3( string[] field ) @property {
        _field3 = field;
        _raw_field3 = null;
    }

    string[] field3(  ) @property {
        if (_raw_field3 !is null) {
            _field3 = _raw_field3.split(";");
        }
        return _field3;
    }

    string toString() {
        return format("%s,%s,%s", field1, field2, field3);
    }

    private:
        string _raw_field3;
        string[] _field3;
}

void main(){
    Data[] result;
    File f = File( "data.csv", "r" );
    foreach( char[] line; f.byLine() ){
        result ~= csvReader!Data(line, '\t').front;
    }

    writeln(result);
}
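
For comparison, here is a sketch of the same idea without any properties: every column stays a plain string in the record type, and the third column is split only after csvReader has finished with the record. The names Row and nestedFields are invented for this example, and the sample text is an inline copy of the first two data lines rather than a file.

```d
import std.array : split;
import std.csv : csvReader;
import std.stdio : writeln;

// Hypothetical record type: three plain strings, matching the three
// tab-separated columns that the input actually contains.
struct Row
{
    string field1;
    string field2;
    string field3; // still "a;b;c" after the CSV parse
}

// Hypothetical helper: parses tab-separated text and returns the
// third column of every record, split on ';'.
string[][] nestedFields(string text)
{
    string[][] result;
    foreach (row; csvReader!Row(text, '\t'))
    {
        // The nested split happens after the CSV parse, not during it.
        result ~= row.field3.split(";");
    }
    return result;
}

void main()
{
    string text = "data1\tdata2\tdata3a;data3b;data3c\n"
                ~ "cata1\tcata2\tcata3a;cata3b;cata3c";
    writeln(nestedFields(text));
}
```

This trades the lazy property for an explicit second step, which may be easier to follow when the nested list is always needed.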

Ali

March 18, 2012
On Sunday, 18 March 2012 at 09:53 -0700, Ali Çehreli wrote:

Very interesting, big thanks for this code snippet.

March 18, 2012
Bug fix release: :)

On 03/18/2012 10:13 AM, bioinfornatics wrote:
> On Sunday, 18 March 2012 at 09:53 -0700, Ali Çehreli wrote:

>>       void field3( string[] field ) @property {
>>           _field3 = field.join();

I think that should have been field.join(";"). (But join() is not used in the final version of the program anyway.)
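
A small sketch of the difference, using only std.array's join and split on the sample values from the thread:

```d
import std.array : join, split;

void main()
{
    string[] parts = ["data3a", "data3b", "data3c"];

    // Without a separator the elements are glued together,
    // so the original boundaries are lost.
    assert(parts.join() == "data3adata3bdata3c");
    assert(parts.join().split(";") == ["data3adata3bdata3c"]);

    // With the separator, join and split undo each other for this input.
    assert(parts.join(";") == "data3a;data3b;data3c");
    assert(parts.join(";").split(";") == parts);
}
```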

>>       string[] field3(  ) @property {
>>           if (_raw_field3 !is null) {
>>               _field3 = _raw_field3.split(";");

This line must be added so that split() is not called every time:

                 _raw_field3 = null;

>>           }
>>           return _field3;
>>       }
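
Putting that line in place, the struct could look roughly like the sketch below. The field names follow the earlier post. The main function is only an illustration: it fills the struct with a positional struct literal instead of csvReader, which works here because everything lives in one module.

```d
import std.array : split;

// Sketch of the struct with the missing reset added.
struct Data
{
    string field1;
    string field2;

    // Setter: store the list directly and drop any pending raw text.
    void field3(string[] field) @property
    {
        _field3 = field;
        _raw_field3 = null;
    }

    // Getter: split the raw text on first use, then reuse the cached list.
    string[] field3() @property
    {
        if (_raw_field3 !is null)
        {
            _field3 = _raw_field3.split(";");
            _raw_field3 = null; // the added line: nothing is left to split
        }
        return _field3;
    }

    private string _raw_field3;
    private string[] _field3;
}

void main()
{
    // Fields are initialized in declaration order:
    // field1, field2, _raw_field3.
    auto d = Data("data1", "data2", "data3a;data3b;data3c");

    assert(d.field3 == ["data3a", "data3b", "data3c"]);

    // Both calls return the same cached array, so split() ran only once.
    assert(d.field3 is d.field3);
}
```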

Ali