Thread overview
about std.csv and derived format
Feb 29, 2012
bioinfornatics
Feb 29, 2012
bioinfornatics
Feb 29, 2012
Jesse Phillips
Mar 01, 2012
bioinfornatics
Mar 01, 2012
bioinfornatics
Mar 01, 2012
Jesse Phillips
Mar 01, 2012
bioinfornatics
Mar 01, 2012
Jesse Phillips
February 29, 2012
Dear,

I would like to parse this file: http://genome.ucsc.edu/goldenPath/help/ItemRGBDemo.txt

struct Bed{
    string    chrom;        // 0
    size_t    chromStart;   // 1
    size_t    chromEnd;     // 2
    string    name;         // 3
    size_t    score;        // 4
    char      strand;       // 5
    size_t    thickStart;   // 6
    size_t    thickEnd;     // 7
    size_t[3] itemRgb;      // 8
    size_t    blockCount;   // 9
    size_t    blockSizes;   // 10
    size_t    blockStarts;  // 11
}

In addition, fields 3 to 11 are optional, so a line can contain:
* fields 0 - 3
* fields 0 - 4
* fields 0 - 5
... up to fields 0 - 11

February 29, 2012
On Wednesday, 29 February 2012 at 12:42 +0100, bioinfornatics wrote:
> I would like to parse this file: http://genome.ucsc.edu/goldenPath/help/ItemRGBDemo.txt

Lines 0 to 2 of ItemRGBDemo.txt are metadata, so they have to be parsed by hand:

browser position chr7:127471196-127495720
browser hide all
track name="ItemRGBDemo" description="Item RGB demonstration" visibility=2 itemRgb="On"

My problems are:
- I need to parse the data as CSV
- I do not know how to handle the optional fields

February 29, 2012
On Wednesday, 29 February 2012 at 11:51:29 UTC, bioinfornatics wrote:
> My problems are:
> - I need to parse the data as CSV
> - I do not know how to handle the optional fields

It looks like the data is tab-delimited, so the separator is a tab. CSV has no notion of optional fields, but you can disable the exceptions thrown for malformed input.

auto records = csvReader!(Bed,Malformed.ignore)(str,'\t');
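The call above can be tried on in-memory text. The following is only a minimal sketch: `BedLine`, `readNames` and the sample rows are made up for illustration and cover just the first five BED columns. The thread does not say what the fields missing from a short row end up containing, so the sketch only reads a column that is present in every row.

```d
import std.csv : csvReader, Malformed;
import std.stdio : writeln;

// Hypothetical record type: only the first five BED columns.
struct BedLine
{
    string chrom;       // 0
    size_t chromStart;  // 1
    size_t chromEnd;    // 2
    string name;        // 3
    size_t score;       // 4
}

// Collects the name column of every record in tab-separated `text`.
string[] readNames(string text)
{
    string[] names;
    // '\t' replaces the default ',' separator; Malformed.ignore asks
    // std.csv not to throw on malformed input such as short rows.
    foreach (record; csvReader!(BedLine, Malformed.ignore)(text, '\t'))
        names ~= record.name;
    return names;
}

void main()
{
    // Made-up sample: the second row leaves out the score column.
    string text = "chr7\t127471196\t127472363\tPos1\t0\n"
                ~ "chr7\t127472363\t127473530\tPos2\n";
    writeln(readNames(text));
}
```

Before relying on any optional column, it is worth checking what value it holds for a row that omits it.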
March 01, 2012
On Wednesday, 29 February 2012 at 13:23 +0100, Jesse Phillips wrote:
> It looks like the data is tab-delimited, so the separator is a tab. CSV has no notion of optional fields, but you can disable exceptions.
> 
> auto records = csvReader!(Bed,Malformed.ignore)(str,'\t');

Thanks, Jesse.

How can I convert the input range that csvReader returns into Bed?
The type csvReader returns changes with its template arguments, so if I
use a template function the type is never the same and I cannot
hard-code a copy into the Bed type.
For example, if I use BedData3 or BedData4:

-------------------------
struct BedData3{
    string    chrom;        // 0
    size_t    chromStart;   // 1
    size_t    chromEnd;     // 2
    string    name;         // 3
}

struct BedData4{
    string    chrom;        // 0
    size_t    chromStart;   // 1
    size_t    chromEnd;     // 2
    string    name;         // 3
    size_t    score;        // 4
}
------------------------

I tried to work with ReturnType, but I failed.

Paste: https://gist.github.com/1946288

At line 294, bedReader takes a type from BedData3 to BedData11.
Then, at line 338, how do I get an array of records and store it in the
Bed struct (line 192)?

Thanks a lot

March 01, 2012
On Thursday, 1 March 2012 at 01:52 +0100, bioinfornatics wrote:
> How can I convert the input range that csvReader returns into Bed?
> 
> Paste: https://gist.github.com/1946288

It is OK, I have found a way. It may not be efficient, but it works: https://gist.github.com/1946669

A minor bug remains in parsing the track line; it will be fixed tomorrow. Time for bed.

Big thanks to all

March 01, 2012
On Thursday, 1 March 2012 at 02:07:44 UTC, bioinfornatics wrote:

> It is OK, I have found a way. It may not be efficient, but it works:
> https://gist.github.com/1946669
>
> A minor bug remains in parsing the track line; it will be fixed
> tomorrow.

You can edit a gist instead of creating a new one.

This seems like a very fragile implementation, and hard to follow. My quick untested code:

import std.array, std.csv, std.file, std.range, std.string;

auto str = readText(filePath);

// Skip the first three (metadata) lines.
str = str.splitLines().drop(3).join("\n");

auto bedInstances = csvReader!(BedData11,Malformed.ignore)(str,'\t');

But if you must keep the separate structs, I don't have any better suggestions.
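
The intent of the snippet above, dropping the metadata lines before the text reaches csvReader, can be sketched on its own. This is only an illustration: `skipLines` is a hypothetical helper name, and the sample text is made up. It relies on `std.string.lineSplitter`, `std.range.drop` and `std.array.join`.

```d
import std.array : join;
import std.range : drop;
import std.stdio : writeln;
import std.string : lineSplitter;

// Drops the first `count` lines of `text` and re-joins the rest with '\n'.
// Hypothetical helper, not part of Phobos.
string skipLines(string text, size_t count)
{
    return text.lineSplitter().drop(count).join("\n");
}

void main()
{
    // Made-up input: two metadata lines followed by one data line.
    string text = "browser hide all\n"
                ~ "track name=\"demo\"\n"
                ~ "chr7\t127471196\t127472363\n";
    writeln(skipLines(text, 2));
}
```

`drop` never fails when the input has fewer lines than requested; it simply yields an empty range, so a truncated file produces an empty string instead of an error.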
March 01, 2012
On Thursday, 1 March 2012 at 04:36 +0100, Jesse Phillips wrote:
> auto bedInstances = csvReader!(BedData11,Malformed.ignore)(str,'\t');
> 
> But if you must keep the separate structs, I don't have any better suggestions.

And how do I convert the bedInstances input range to BedData11[]?

Should I add a constructor to BedData11 and use std.algorithm.map?
map!"BedData11(a.field1, a.field2...)"(bedInstances);

March 01, 2012
On Thursday, 1 March 2012 at 10:09:55 UTC, bioinfornatics wrote:

> And how do I convert the bedInstances input range to BedData11[]?

std.array.array()
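
A short sketch of that suggestion follows, using the four-field BedData3 layout quoted earlier in the thread. The `readAll` function and the sample rows are assumptions made for illustration only.

```d
import std.array : array;
import std.csv : csvReader, Malformed;
import std.stdio : writeln;

// Same layout as the BedData3 struct quoted earlier in the thread.
struct BedData3
{
    string chrom;       // 0
    size_t chromStart;  // 1
    size_t chromEnd;    // 2
    string name;        // 3
}

// Eagerly copies every record of the lazy csvReader range into an array.
// Hypothetical helper name.
BedData3[] readAll(string text)
{
    return csvReader!(BedData3, Malformed.ignore)(text, '\t').array;
}

void main()
{
    // Made-up rows in the shape of the BED data lines.
    auto beds = readAll("chr7\t127471196\t127472363\tPos1\n"
                      ~ "chr7\t127472363\t127473530\tPos2\n");
    writeln(beds.length, " records, first name: ", beds[0].name);
}
```

Because csvReader already yields values of the requested struct type, no constructor or map step should be needed for this conversion.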