February 29, 2012 about std.csv and derived format

Dear,

I would like to parse this file: http://genome.ucsc.edu/goldenPath/help/ItemRGBDemo.txt

struct Bed{
    string chrom; // 0
    size_t chromStart; // 1
    size_t chromEnd; // 2
    string name; // 3
    size_t score; // 4
    char strand; // 5
    size_t thickStart; // 6
    size_t thickEnd; // 7
    size_t[3] itemRgb; // 8
    size_t blockCount; // 9
    size_t blockSizes; // 10
    size_t blockStarts; // 11
}

In addition, fields 3 to 11 are optional. So a record can have:
* field 0 - 3
* field 0 - 4
* field 0 - 5
... to 0 - 12

February 29, 2012 Re: about std.csv and derived format

Posted in reply to bioinfornatics

Lines 0 to 2 of ItemRGBDemo.txt are metadata, so they have to be parsed by hand:

browser position chr7:127471196-127495720
browser hide all
track name="ItemRGBDemo" description="Item RGB demonstration" visibility=2 itemRgb="On"

My problem is:
- I need to parse the data in CSV format
- how to deal with the optional fields

February 29, 2012 Re: about std.csv and derived format

Posted in reply to bioinfornatics

On Wednesday, 29 February 2012 at 11:51:29 UTC, bioinfornatics wrote:
> My problem is:
> - I need to parse the data in CSV format
> - how to deal with the optional fields

It looks like the data is tab-delimited, so the separator is a tab. There are no optional fields in CSV, but you can disable exceptions:

auto records = csvReader!(Bed,Malformed.ignore)(str,'\t');
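
As a rough illustration of the tab-separator advice, the sketch below reads tab-delimited rows into a struct by passing '\t' as the separator and iterates the records lazily. `Bed5` and `totalLength` are hypothetical names that do not appear in the thread, and every row is assumed to contain all five columns. How `Malformed.ignore` treats rows with fewer columns is not demonstrated here and should be checked against the std.csv documentation.

```d
import std.csv : csvReader;

// Hypothetical five-column record; the Bed struct in the thread has more fields.
struct Bed5
{
    string chrom;       // 0
    size_t chromStart;  // 1
    size_t chromEnd;    // 2
    string name;        // 3
    size_t score;       // 4
}

/// Sums the feature lengths (chromEnd - chromStart) over tab-separated rows.
size_t totalLength(string data)
{
    size_t total = 0;
    // '\t' replaces the default ',' separator; records are produced one at a time.
    foreach (rec; csvReader!Bed5(data, '\t'))
        total += rec.chromEnd - rec.chromStart;
    return total;
}
```

Each record arrives already converted to `Bed5`, so the loop body can use the typed fields directly.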

March 01, 2012 Re: about std.csv and derived format

Posted in reply to Jesse Phillips

On Wednesday, 29 February 2012 at 13:23 +0100, Jesse Phillips wrote:
> It looks like the data is tab-delimited, so the separator is a tab. There are no optional fields in CSV, but you can disable exceptions:
>
> auto records = csvReader!(Bed,Malformed.ignore)(str,'\t');

Thanks, Jesse.

How can I convert the input range's element type to Bed? csvReader returns a type that changes with the struct it is given, so if I use a template function the type is never the same and I can't hard-code a copy into the Bed type. For example, if I use BedData3 or BedData4:

-------------------------
struct BedData3{
    string chrom; // 0
    size_t chromStart; // 1
    size_t chromEnd; // 2
    string name; // 3
}

struct BedData4{
    string chrom; // 0
    size_t chromStart; // 1
    size_t chromEnd; // 2
    string name; // 3
    size_t score; // 4
}
------------------------

I tried to work with ReturnType, but I failed.

Paste: https://gist.github.com/1946288

At line 294, bedReader takes one of BedData3 to BedData11. Then, at line 338, how do I get an array of records and store this array in the struct Bed at line 192?

Thanks a lot
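
One possible way around the changing record type, sketched under assumptions: make the record struct a template parameter, so that a single function returns `Record[]` for whichever struct is requested. `readRecords` is a hypothetical helper, not code from the gist, and the input is assumed to be header-free, tab-separated text whose rows match the struct's fields.

```d
import std.array : array;
import std.csv : csvReader;

// The two record layouts quoted in the post.
struct BedData3
{
    string chrom;       // 0
    size_t chromStart;  // 1
    size_t chromEnd;    // 2
    string name;        // 3
}

struct BedData4
{
    string chrom;       // 0
    size_t chromStart;  // 1
    size_t chromEnd;    // 2
    string name;        // 3
    size_t score;       // 4
}

/// Hypothetical helper: parses tab-separated rows into an array of Record.
Record[] readRecords(Record)(string data)
    if (is(Record == struct))
{
    // The element type of the range is Record, so array() yields Record[].
    return csvReader!Record(data, '\t').array;
}
```

The caller chooses the layout at the call site, for example `readRecords!BedData4(text)`, and the return type follows from that choice, so ReturnType is not needed.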

March 01, 2012 Re: about std.csv and derived format

Posted in reply to bioinfornatics

It's OK, I have found a way. It may not be efficient, but it works:
https://gist.github.com/1946669

A minor bug remains in parsing the track line; it will be fixed tomorrow. Time for bed.

Big thanks to all

March 01, 2012 Re: about std.csv and derived format

Posted in reply to bioinfornatics

On Thursday, 1 March 2012 at 02:07:44 UTC, bioinfornatics wrote:
> It's OK, I have found a way. It may not be efficient, but it works:
> https://gist.github.com/1946669
>
> A minor bug remains in parsing the track line; it will be fixed tomorrow. Time for bed.
>
> Big thanks to all

You can edit a gist instead of creating a new one.

This seems like a very fragile implementation, and hard to follow. My quick untested code:

auto str = readText(filePath);

// Ignoring first three lines.
str = array(str.until(newline).until(newline).until(newline));

auto bedInstances = csvReader!(BedData11,Malformed.ignore)(str,'\t');

But if you must keep the separate structs, I don't have any better suggestions.
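
The quoted code is marked as untested, and `until` yields the text before the first match, so the chained calls would most likely keep the first line rather than skip three. A minimal sketch of one alternative, assuming the metadata always occupies exactly the leading lines: split the text into lines, drop the first few, and join the rest. `dropLeadingLines` is a hypothetical helper name.

```d
import std.array : join;
import std.range : drop;
import std.string : splitLines;

/// Hypothetical helper: returns `text` without its first `count` lines.
string dropLeadingLines(string text, size_t count = 3)
{
    // splitLines removes the line terminators; join puts "\n" back between lines.
    return text.splitLines().drop(count).join("\n");
}
```

The result can then be handed to csvReader in place of the raw file contents. If the number of metadata lines varies between files, filtering on the `browser` and `track` prefixes may be more robust than a fixed count.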

March 01, 2012 Re: about std.csv and derived format

Posted in reply to Jesse Phillips

On Thursday, 1 March 2012 at 04:36 +0100, Jesse Phillips wrote:
> auto bedInstances = csvReader!(BedData11,Malformed.ignore)(str,'\t');
>
> But if you must keep the separate structs, I don't have any better suggestions.

And how do I convert the bedInstances input range to BedData11[]?
Add a constructor to BedData11 and use std.algorithm.map?

map!"BedData11(a.field1, a.field2...)"(bedInstances);

March 01, 2012 Re: about std.csv and derived format

Posted in reply to bioinfornatics

On Thursday, 1 March 2012 at 10:09:55 UTC, bioinfornatics wrote:
> And how do I convert the bedInstances input range to BedData11[]?

std.array.array()
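
To tie the two ideas together, here is a hedged sketch that maps each parsed record onto a wider struct and then collects the result with `std.array.array`. The `Bed` struct below is a hypothetical, shortened stand-in for the one in the thread, `toBeds` is an invented name, and the '.' default for `strand` is an assumption made for this example only.

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.csv : csvReader;

// Narrow record layout as quoted earlier in the thread.
struct BedData4
{
    string chrom;       // 0
    size_t chromStart;  // 1
    size_t chromEnd;    // 2
    string name;        // 3
    size_t score;       // 4
}

// Hypothetical wider record; fields missing from the input keep their defaults.
struct Bed
{
    string chrom;
    size_t chromStart;
    size_t chromEnd;
    string name;
    size_t score;
    char strand = '.';  // assumed placeholder for "not given"
}

/// Parses five-column rows and stores them as Bed values.
Bed[] toBeds(string data)
{
    return csvReader!BedData4(data, '\t')
        // A struct literal may list only the leading fields.
        .map!(r => Bed(r.chrom, r.chromStart, r.chromEnd, r.name, r.score))
        // array() turns the lazy range into a Bed[].
        .array;
}
```

A lambda is used instead of a string predicate because the lambda can refer to `Bed` directly. When no conversion is needed, calling `.array` on the csvReader result alone gives an array of the record struct.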