August 20, 2005 inconsistent behavior of std.string.split | ||||
---|---|---|---|---|
| ||||
According to the documentation:
<spec>
char[][] split(char[] s)
    Split s[] into an array of words, using whitespace as the delimiter.

char[][] split(char[] s, char[] delim)
    Split s[] into an array of words, using delim[] as the delimiter.
</spec>

Intuitively, split(s) should be equivalent to split(s, " \t\f\r\n\v").
But the former function discards empty lines while the latter does not.
The following example demonstrates the difference.

<code>
import std.stdio;
import std.string;
void main(){
writefln(std.string.split("0  3"," ")); //[0,,3]
writefln(std.string.split("0  3")); //[0,3]
writefln(std.string.split("    "," ")); //[,,,,]
writefln(std.string.split("    ")); //[]
}
</code>
|
August 20, 2005 Re: inconsistent behavior of std.string.split | ||||
---|---|---|---|---|
| ||||
Posted in reply to zwang | "zwang" <nehzgnaw@gmail.com> wrote in message news:de7c7e$17au$1@digitaldaemon.com...
> According to the documentation:
> <spec>
> char[][] split(char[] s)
> Split s[] into an array of words, using whitespace as the delimiter.
>
> char[][] split(char[] s, char[] delim)
> Split s[] into an array of words, using delim[] as the delimiter.
> </spec>
>
> Intuitively, split(s) should be equivalent to split(s, " \t\f\r\n\v").
> But the former function discards empty lines while the latter does not.
> The following example demonstrates the difference.
>
> <code>
> import std.stdio;
> import std.string;
> void main(){
> writefln(std.string.split("0  3"," ")); //[0,,3]
> writefln(std.string.split("0  3")); //[0,3]
> writefln(std.string.split("    "," ")); //[,,,,]
> writefln(std.string.split("    ")); //[]
> }
> </code>

Yeah, the one that takes a delimiter string should skip any zero-length strings in between delimiters. The whitespace one will keep skipping characters until it hits a non-whitespace one, but the delimiter one will create a new string after every delimiter, when it should just keep reading delimiters until it hits a non-delimiter sequence.
|
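The two scanning strategies described in this reply can be sketched roughly as follows. This is only an illustration written in present-day D (using string rather than the 2005-era char[]), not the Phobos source; splitOnWhitespace and splitOnDelimiter are hypothetical names chosen for the example.

```d
import std.ascii : isWhite;

// Hypothetical helper: the whitespace strategy. A whole run of whitespace
// is skipped before each word, so empty words are never produced.
string[] splitOnWhitespace(string s)
{
    string[] words;
    size_t i = 0;
    while (i < s.length)
    {
        // Skip every whitespace character in front of the next word.
        while (i < s.length && isWhite(s[i]))
            ++i;
        if (i == s.length)
            break;
        immutable start = i;
        // Consume the word itself.
        while (i < s.length && !isWhite(s[i]))
            ++i;
        words ~= s[start .. i];
    }
    return words;
}

// Hypothetical helper: the delimiter strategy. Every occurrence of delim
// closes a word, so adjacent delimiters produce empty words.
string[] splitOnDelimiter(string s, string delim)
{
    assert(delim.length > 0);
    string[] words;
    size_t start = 0;
    size_t i = 0;
    while (i + delim.length <= s.length)
    {
        if (s[i .. i + delim.length] == delim)
        {
            words ~= s[start .. i];
            i += delim.length;
            start = i;
        }
        else
            ++i;
    }
    // Whatever follows the last delimiter is the final word (possibly empty).
    words ~= s[start .. $];
    return words;
}

void main()
{
    assert(splitOnWhitespace("0  3") == ["0", "3"]);
    assert(splitOnDelimiter("0  3", " ") == ["0", "", "3"]);
    assert(splitOnWhitespace("    ").length == 0);
    assert(splitOnDelimiter("    ", " ").length == 5);
}
```

The asserts mirror the four cases from the original report: the whitespace loop never emits an empty word, while the delimiter loop emits one word per delimiter plus a final one.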
August 20, 2005 Re: inconsistent behavior of std.string.split | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jarrett Billingsley | Jarrett Billingsley wrote:
> "zwang" <nehzgnaw@gmail.com> wrote in message news:de7c7e$17au$1@digitaldaemon.com...
>
>>According to the documentation:
>><spec>
>>char[][] split(char[] s)
>> Split s[] into an array of words, using whitespace as the delimiter.
>>
>>char[][] split(char[] s, char[] delim)
>> Split s[] into an array of words, using delim[] as the delimiter.
>></spec>
>>
>>Intuitively, split(s) should be equivalent to split(s, " \t\f\r\n\v").
>>But the former function discards empty lines while the latter does not.
>>The following example demonstrates the difference.
>>
>><code>
>>import std.stdio;
>>import std.string;
>>void main(){
>>writefln(std.string.split("0  3"," ")); //[0,,3]
>>writefln(std.string.split("0  3")); //[0,3]
>>writefln(std.string.split("    "," ")); //[,,,,]
>>writefln(std.string.split("    ")); //[]
>>}
>></code>
>
>
> Yeah, the one that takes a delimiter string should skip any zero-length strings in-between delimiters. The whitespace one will keep skipping characters until it hits a non-whitespace one, but the delimiter one will create a new string after every delimiter, when it should just keep reading delimiters until it hits a non-delimiter sequence.
>
>
Keeping zero-length strings is sometimes useful, for example, when parsing a CSV or tab-delimited file. A better solution might be two versions of split that handle consecutive delimiters differently. Or another two overloaded split functions for the special case of whitespace delimiters.
|
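One way the "two versions" idea might look is a single wrapper with a flag that decides whether empty fields are kept. This is a sketch assuming present-day Phobos (std.array.split and std.algorithm.iteration.filter), not the 2005 library under discussion; splitFields and its keepEmpty parameter are hypothetical and not part of the standard library.

```d
import std.algorithm.iteration : filter;
import std.array : array, split;

// Hypothetical wrapper: keepEmpty selects between the two behaviors
// discussed in the thread.
string[] splitFields(string s, string delim, bool keepEmpty)
{
    // std.array.split with an explicit separator keeps empty fields.
    auto fields = split(s, delim);
    if (keepEmpty)
        return fields;
    // Drop the zero-length fields produced by consecutive delimiters.
    return fields.filter!(f => f.length > 0).array;
}

void main()
{
    // CSV-style input: the empty second column has to survive.
    assert(splitFields("a,,c", ",", true) == ["a", "", "c"]);
    // Word-style input: consecutive delimiters collapse.
    assert(splitFields("a,,c", ",", false) == ["a", "c"]);
}
```

A flag keeps one entry point; separate function names would serve equally well if the two behaviors should be visible at the call site.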
August 20, 2005 Re: inconsistent behavior of std.string.split | ||||
---|---|---|---|---|
| ||||
Posted in reply to zwang | "zwang" <nehzgnaw@gmail.com> wrote in message news:de7e7l$18se$1@digitaldaemon.com...
> Keeping zero-length strings is sometimes useful, for example, when parsing a CSV or tab-delimited file. A better solution might be two versions of split that handle consecutive delimiters differently. Or another two overloaded split functions for the special case of whitespace delimiters.

Good point.
|
Copyright © 1999-2021 by the D Language Foundation