February 29, 2012
about std.csv and derived format
Dear,

I would like to parse this file:
http://genome.ucsc.edu/goldenPath/help/ItemRGBDemo.txt

struct Bed{
    string    chrom;        // 0
    size_t    chromStart;   // 1
    size_t    chromEnd;     // 2
    string    name;         // 3
    size_t    score;        // 4
    char      strand;       // 5
    size_t    thickStart;   // 6
    size_t    thickEnd;     // 7
    size_t[3] itemRgb;      // 8
    size_t    blockCount;   // 9
    size_t    blockSizes;   // 10
    size_t    blockStarts;  // 11
}

In addition, fields 3 to 11 are optional, so a line can contain:
* fields 0 - 3
* fields 0 - 4
* fields 0 - 5
... up to fields 0 - 11
February 29, 2012
Re: about std.csv and derived format
On Wednesday, 29 February 2012 at 12:42 +0100, bioinfornatics wrote:
> I would like to parse this file:
> http://genome.ucsc.edu/goldenPath/help/ItemRGBDemo.txt
> [...]


Lines 0 to 2 of ItemRGBDemo.txt are metadata, so they have to be parsed
by hand:

browser position chr7:127471196-127495720
browser hide all
track name="ItemRGBDemo" description="Item RGB demonstration" visibility=2 itemRgb="On"

My problems are:
- I need to parse the data in CSV format
- how to handle the optional fields
February 29, 2012
Re: about std.csv and derived format
On Wednesday, 29 February 2012 at 11:51:29 UTC, bioinfornatics 
wrote:
> On Wednesday, 29 February 2012 at 12:42 +0100, bioinfornatics 
> wrote:
>> Dear,
>> 
>> I would like to parse this file:
>> http://genome.ucsc.edu/goldenPath/help/ItemRGBDemo.txt

> My problems are:
> - I need to parse the data in CSV format
> - how to handle the optional fields

The data looks tab-delimited, so the separator is a tab. 
There are no optional fields in CSV, but you can disable 
exceptions.

auto records = csvReader!(Bed,Malformed.ignore)(str,'\t');
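
To make the separator argument concrete, here is a minimal sketch that reads tab-separated rows into a struct with csvReader. The BedRow struct, the describe helper and the sample rows are assumptions made up for this illustration, and every row has all four columns, so it does not show what happens with short rows or with Malformed.ignore.

```d
import std.csv : csvReader;
import std.format : format;
import std.stdio : writeln;

// Hypothetical four-column record; the field names follow the first
// BED columns from the question, the struct itself is illustrative.
struct BedRow
{
    string chrom;
    size_t chromStart;
    size_t chromEnd;
    string name;
}

// Hypothetical helper that formats one record for display.
string describe(BedRow r)
{
    return format("%s:%s-%s %s", r.chrom, r.chromStart, r.chromEnd, r.name);
}

void main()
{
    // Two sample rows, tab-separated, with every column present.
    string text = "chr7\t127471196\t127472363\tPos1\n"
                ~ "chr7\t127472363\t127473530\tPos2";

    // The second argument replaces the default ',' delimiter with a tab.
    foreach (row; csvReader!BedRow(text, '\t'))
        writeln(describe(row));
}
```

The range is lazy: each BedRow is only parsed when the loop reaches it.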
March 01, 2012
Re: about std.csv and derived format
On Wednesday, 29 February 2012 at 13:23 +0100, Jesse Phillips wrote:
> The data looks tab-delimited, so the separator is a tab. 
> There are no optional fields in CSV, but you can disable 
> exceptions.
> 
> auto records = csvReader!(Bed,Malformed.ignore)(str,'\t');

Thanks, Jesse.

How can I convert the input range returned by csvReader to Bed?
csvReader returns a type that changes with the record type, so if I use
a template function the type is never the same and I can't hard-code a
copy to the Bed type.
For example, if I use BedData3 or BedData4:

-------------------------
struct BedData3{
   string    chrom;        // 0
   size_t    chromStart;   // 1
   size_t    chromEnd;     // 2
   string    name;         // 3
}

struct BedData4{
   string    chrom;        // 0
   size_t    chromStart;   // 1
   size_t    chromEnd;     // 2
   string    name;         // 3
   size_t    score;        // 4
}
------------------------

I tried to use ReturnType, but failed.

Paste: https://gist.github.com/1946288

At line 294, bedReader takes one of BedData3 to BedData11.
Then, at line 338, how do I get an array of records and store it in the
Bed struct (line 192)?


Thanks a lot.
March 01, 2012
Re: about std.csv and derived format
On Thursday, 1 March 2012 at 01:52 +0100, bioinfornatics wrote:
> How can I convert the input range returned by csvReader to Bed?
> [...]
> Paste: https://gist.github.com/1946288

It's OK, I found a way. It may not be efficient, but it works:
https://gist.github.com/1946669

A minor bug remains in parsing the track line; it will be fixed
tomorrow. Time for bed.


Big thanks to all.
March 01, 2012
Re: about std.csv and derived format
On Thursday, 1 March 2012 at 02:07:44 UTC, bioinfornatics wrote:

> It's OK, I found a way. It may not be efficient, but it 
> works:
> https://gist.github.com/1946669
>
> A minor bug remains in parsing the track line; it will be 
> fixed tomorrow.

You can edit a gist instead of creating a new one.

This seems like a very fragile implementation, and hard to 
follow. My quick untested code:

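import std.array : join;
import std.csv : csvReader, Malformed;
import std.file : readText;
import std.range : drop;
import std.string : splitLines;

auto str = readText(filePath);

// Ignore the first three lines (the browser and track metadata).
str = str.splitLines().drop(3).join("\n");

auto bedInstances = 
csvReader!(BedData11,Malformed.ignore)(str,'\t');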

But if you must keep the separate structs, I don't have any 
better suggestions.
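
For comparison, here is a rough sketch of one way to keep a single struct while still accepting lines with fewer columns. It sidesteps csvReader and splits each line by hand; it is not the approach from the gist. The names BedLine, parseBedLine, parseBed and fieldCount are invented for this example, and only two of the optional columns are handled, to keep it short.

```d
import std.algorithm.searching : startsWith;
import std.array : split;
import std.conv : to;
import std.exception : enforce;
import std.string : splitLines;

// Hypothetical record: the three mandatory BED columns plus two of the
// optional ones. Absent optional columns keep their default values.
struct BedLine
{
    string chrom;
    size_t chromStart;
    size_t chromEnd;
    string name;        // "" when the column is absent
    size_t score;       // 0 when the column is absent
    size_t fieldCount;  // number of columns actually present on the line
}

BedLine parseBedLine(string line)
{
    auto cols = line.split("\t");
    enforce(cols.length >= 3, "a BED line needs at least 3 columns");

    BedLine r;
    r.fieldCount = cols.length;
    r.chrom      = cols[0];
    r.chromStart = cols[1].to!size_t;
    r.chromEnd   = cols[2].to!size_t;
    // Fill the optional columns only when they are present.
    if (cols.length > 3) r.name  = cols[3];
    if (cols.length > 4) r.score = cols[4].to!size_t;
    return r;
}

BedLine[] parseBed(string text)
{
    BedLine[] rows;
    foreach (line; text.splitLines())
    {
        // Skip blank lines and the "browser"/"track" metadata lines.
        if (line.length == 0 || line.startsWith("browser")
                || line.startsWith("track"))
            continue;
        rows ~= parseBedLine(line);
    }
    return rows;
}

void main()
{
    auto rows = parseBed("browser hide all\nchr7\t10\t20\nchr7\t20\t30\tPos2\t5");
    assert(rows.length == 2);
    assert(rows[1].name == "Pos2");
}
```

The fieldCount member is there so that callers can tell a missing score apart from a real score of 0.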
March 01, 2012
Re: about std.csv and derived format
On Thursday, 1 March 2012 at 04:36 +0100, Jesse Phillips wrote:
> [...]
> auto bedInstances = 
> csvReader!(BedData11,Malformed.ignore)(str,'\t');

And how do I convert the bedInstances input range to BedData11[]?

Should I add a constructor to BedData11 and use std.algorithm.map?
map!"BedData11(a.field1, a.field2...)"(bedInstances);
March 01, 2012
Re: about std.csv and derived format
On Thursday, 1 March 2012 at 10:09:55 UTC, bioinfornatics wrote:

> and how convert bedInstances input array to BedData11[] ?

std.array.array()
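
A small sketch of what that looks like in practice, assuming a hypothetical three-field Rec struct and a readAll helper (neither is from the gist): array() walks the lazy range that csvReader returns and copies each record into a newly allocated array, so no constructor or map call is needed.

```d
import std.array : array;
import std.csv : csvReader;

// Hypothetical record type used only for this illustration.
struct Rec
{
    string chrom;
    size_t chromStart;
    size_t chromEnd;
}

// Hypothetical helper: eagerly converts the csvReader range to Rec[].
Rec[] readAll(string text)
{
    return csvReader!Rec(text, '\t').array;
}

void main()
{
    auto recs = readAll("chr7\t1\t5\nchr7\t5\t9");
    assert(recs.length == 2);
    assert(recs[1].chromStart == 5);
}
```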