June 23, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Chad J | On Sat, 23 Jun 2012 19:52:32 +0200, Chad J <chadjoan@__spam.is.bad__gmail.com> wrote:
>
> As an additional note: I could probably do this easily if I had a function like findSplit where the predicate is used /instead/ of a delimiter. So like this:
> auto findSplit(alias pred = "a", R)(R haystack);
> ...
> auto tuple = findSplit!(`a == "\n" || a == "\r\n" || a == "\r"`)(text);
> return tuple[2];
I don't think it can match on ranges, but it's pretty trivial to implement something that would work for your case
import std.array, std.algorithm, std.typecons;

auto newlineSplit(string data) {
    auto rest = data.findAmong("\r\n");
    if(!rest.empty) { // found
        auto pre = data[0..data.length-rest.length];
        string match;
        if(rest.front == '\r' && (rest.length > 1 && rest[1] == '\n')) { // \r\n
            match = rest[0..2];
            rest = rest[2..$];
        } else { // \r or \n
            match = rest[0..1];
            rest = rest[1..$];
        }
        return tuple(pre, match, rest);
    } else {
        return tuple(data, "", "");
    }
}
unittest {
    auto text = "1\n2\r\n3\r4";
    auto res = text.newlineSplit();
    assert(res[0] == "1");
    assert(res[1] == "\n");
    assert(res[2] == "2\r\n3\r4");

    res = res[2].newlineSplit();
    assert(res[0] == "2");
    assert(res[1] == "\r\n");
    assert(res[2] == "3\r4");

    res = res[2].newlineSplit();
    assert(res[0] == "3");
    assert(res[1] == "\r");
    assert(res[2] == "4");

    res = res[2].newlineSplit();
    assert(res[0] == "4");
    assert(res[1] == "");
    assert(res[2] == "");
}
|
June 23, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to simendsjo | On 06/23/2012 02:17 PM, simendsjo wrote:
> On Sat, 23 Jun 2012 19:52:32 +0200, Chad J
> <chadjoan@__spam.is.bad__gmail.com> wrote:
>
>>
>> As an additional note: I could probably do this easily if I had a
>> function like findSplit where the predicate is used /instead/ of a
>> delimiter. So like this:
>> auto findSplit(alias pred = "a", R)(R haystack);
>> ...
>> auto tuple = findSplit!(`a == "\n" || a == "\r\n" || a == "\r"`)(text);
>> return tuple[2];
>
> I don't think it can match on ranges, but it's pretty trivial to
> implement something that would work for your case
>
> import std.array, std.algorithm, std.typecons;
>
> auto newlineSplit(string data) {
> auto rest = data.findAmong("\r\n");
> if(!rest.empty) { // found
> auto pre = data[0..data.length-rest.length];
> string match;
> if(rest.front == '\r' && (rest.length > 1 && rest[1] == '\n')) { // \r\n
> match = rest[0..2];
> rest = rest[2..$];
> } else { // \r or \n
> match = rest[0..1];
> rest = rest[1..$];
> }
> return tuple(pre, match, rest);
> } else {
> return tuple(data, "", "");
> }
> }
> unittest {
> auto text = "1\n2\r\n3\r4";
> auto res = text.newlineSplit();
> assert(res[0] == "1");
> assert(res[1] == "\n");
> assert(res[2] == "2\r\n3\r4");
>
> res = res[2].newlineSplit();
> assert(res[0] == "2");
> assert(res[1] == "\r\n");
> assert(res[2] == "3\r4");
>
> res = res[2].newlineSplit();
> assert(res[0] == "3");
> assert(res[1] == "\r");
> assert(res[2] == "4");
>
> res = res[2].newlineSplit();
> assert(res[0] == "4");
> assert(res[1] == "");
> assert(res[2] == "");
> }
Hey, thanks for doing all of that. I didn't expect you to write all of that.
Once I've established that the issue isn't just a lack of learning on my part, my subsequent objective is filling any missing functionality in phobos. IMO the "take away a single line" thing should be accomplishable with a single concise expression. Then there should be a function in std.string that contains that single expression and wraps it in easy-to-find documentation. This kind of thing is a fairly common operation. Otherwise, I find it odd that there is a function to split up an arbitrary number of lines but no function to split off only one!
Also, any function that works with whitespace should have versions/variants that work with arbitrary delimiters, unless it is impossible to generalize it that way for some reason. If the variants are found in a separate module, then the documentation should reference them.
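A minimal sketch of what such a "split off only one line" helper could look like, assuming a Phobos release that ships std.string.lineSplitter (which is newer than this thread). The name splitOffLine is hypothetical and not a Phobos function; it returns the first line, terminator included, together with the remaining text.

```d
import std.string : KeepTerminator, lineSplitter;
import std.typecons : tuple;

// Hypothetical helper (not in Phobos): split one line off the front.
// Returns tuple(firstLineWithTerminator, remainingText).
auto splitOffLine(string text)
{
    // lineSplitter is lazy, so only the first line is scanned here.
    auto lines = text.lineSplitter!(KeepTerminator.yes)();
    if (lines.empty)
        return tuple("", "");
    auto line = lines.front;
    // For a string input the first line is a slice starting at index 0,
    // so its length tells us where the rest begins.
    return tuple(line, text[line.length .. $]);
}

unittest
{
    auto r = splitOffLine("1\n2\r\n3\r4");
    assert(r[0] == "1\n");
    assert(r[1] == "2\r\n3\r4");
}
```

Keeping the terminator in the first element means the two parts always concatenate back to the original text, which is what makes the slice arithmetic safe.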
|
June 23, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Chad J | On Sat, 23 Jun 2012 20:41:29 +0200, Chad J <chadjoan@__spam.is.bad__gmail.com> wrote:
>
> Hey, thanks for doing all of that. I didn't expect you to write all of that.
np
> Once I've established that the issue isn't just a lack of learning on my part, my subsequent objective is filling any missing functionality in phobos. IMO the "take away a single line" thing should be accomplishable with a single concise expression. Then there should be a function in std.string that contains that single expression and wraps it in easy-to-find documentation. This kind of thing is a fairly common operation. Otherwise, I find it odd that there is a function to split up an arbitrary number of lines but no function to split off only one!
> Also, any function that works with whitespace should have versions/variants that work with arbitrary delimiters. Not unless it is impossible to generalize it that way for some reason. If the variants are found in a separate module, then the documentation should reference them.
The problem here is there isn't a version of findSplit only taking a predicate and not a needle.
If it had an overload just taking a function, you could have solved it by writing:
auto res = myText.findSplit!(a => a.startsWith("\r\n", "\n", "\r"));
|
June 23, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Chad J | On Sat, 23 Jun 2012 20:41:29 +0200, Chad J <chadjoan@__spam.is.bad__gmail.com> wrote:
> IMO the "take away a single line" thing should be accomplishable with a single concise expression
This takes a range to match against, so much like startsWith:
auto findSplitAny(Range, Ranges...)(Range data, Ranges matches) {
    auto rest = data;
    for(; !rest.empty; rest.popFront()) {
        foreach(match; matches) {
            if(rest.startsWith(match)) {
                auto restStart = data.length-rest.length;
                auto pre = data[0..restStart];
                // we'll fetch it from the data instead of using the supplied
                // match to be consistent with findSplit
                auto dataMatch = data[restStart..restStart+match.length];
                auto post = rest[match.length..$];
                return tuple(pre, dataMatch, post);
            }
        }
    }
    return tuple(data, Range.init, Range.init);
}
unittest {
    auto text = "1\n2\r\n3\r4";

    auto res = text.findSplitAny("\r\n", "\n", "\r");
    assert(res[0] == "1");
    assert(res[1] == "\n");
    assert(res[2] == "2\r\n3\r4");

    res = res[2].findSplitAny("\r\n", "\n", "\r");
    assert(res[0] == "2");
    assert(res[1] == "\r\n");
    assert(res[2] == "3\r4");

    res = res[2].findSplitAny("\r\n", "\n", "\r");
    assert(res[0] == "3");
    assert(res[1] == "\r");
    assert(res[2] == "4");

    res = res[2].findSplitAny("\r\n", "\n", "\r");
    assert(res[0] == "4");
    assert(res[1] == "");
    assert(res[2] == "");
}
|
June 23, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to simendsjo | On 06/23/2012 02:53 PM, simendsjo wrote:
> On Sat, 23 Jun 2012 20:41:29 +0200, Chad J
> <chadjoan@__spam.is.bad__gmail.com> wrote:
>
>>
>> Hey, thanks for doing all of that. I didn't expect you to write all of
>> that.
> np
>
>> Once I've established that the issue isn't just a lack of learning on
>> my part, my subsequent objective is filling any missing functionality
>> in phobos. IMO the "take away a single line" thing should be
>> accomplishable with a single concise expression. Then there should be
>> a function in std.string that contains that single expression and
>> wraps it in easy-to-find documentation. This kind of thing is a fairly
>> common operation. Otherwise, I find it odd that there is a function to
>> split up an arbitrary number of lines but no function to split off
>> only one!
>> Also, any function that works with whitespace should have
>> versions/variants that work with arbitrary delimiters. Not unless it
>> is impossible to generalize it that way for some reason. If the
>> variants are found in a separate module, then the documentation should
>> reference them.
>
>
> The problem here is there isn't a version of findSplit only taking a
> predicate and not a needle.
> If it had an overload just taking a function, you could have solved it
> by writing:
> auto res = myText.findSplit!(a => a.startsWith("\r\n", "\n", "\r"));
True, although I'm a bigger fan of the compile-time alias predicate because of its superior inline-ability. ;)
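A rough sketch of how a predicate-only split could take its predicate as a compile-time alias. Both findSplitWhen and newlineLength are hypothetical names, not Phobos functions, and the contract is an assumption made for illustration: the predicate receives the remaining text and returns how many code units match at its start, with 0 meaning no match.

```d
import std.algorithm : startsWith;
import std.typecons : tuple;
import std.utf : stride;

// Hypothetical, not in Phobos. `matchLen` is an alias parameter, so the
// predicate is known at compile time and the call can be inlined.
auto findSplitWhen(alias matchLen)(string haystack)
{
    // Advance one whole code point at a time so slices never start
    // in the middle of a UTF-8 sequence.
    for (size_t i = 0; i < haystack.length; i += stride(haystack, i))
    {
        immutable size_t n = matchLen(haystack[i .. $]);
        if (n > 0)
            return tuple(haystack[0 .. i],
                         haystack[i .. i + n],
                         haystack[i + n .. $]);
    }
    return tuple(haystack, "", "");
}

// Length of the line terminator at the start of `s`, or 0 if there is none.
// "\r\n" is checked first so it is not mistaken for a lone "\r".
size_t newlineLength(string s)
{
    if (s.startsWith("\r\n")) return 2;
    if (s.startsWith("\n") || s.startsWith("\r")) return 1;
    return 0;
}

unittest
{
    auto res = "1\n2\r\n3\r4".findSplitWhen!newlineLength();
    assert(res[0] == "1");
    assert(res[1] == "\n");
    assert(res[2] == "2\r\n3\r4");
}
```

Returning a length rather than a bool is a design choice of this sketch: a plain true/false predicate cannot tell the splitter how wide the match is, which matters when "\r\n" and "\r" are both accepted.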
|
June 23, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to simendsjo | On 06/23/2012 03:41 PM, simendsjo wrote:
> On Sat, 23 Jun 2012 20:41:29 +0200, Chad J
> <chadjoan@__spam.is.bad__gmail.com> wrote:
>
>> IMO the "take away a single line" thing should be accomplishable with
>> a single concise expression
>
> This takes a range to match against, so much like startsWith:
>
> auto findSplitAny(Range, Ranges...)(Range data, Ranges matches) {
> auto rest = data;
> for(; !rest.empty; rest.popFront()) {
> foreach(match; matches) {
> if(rest.startsWith(match)) {
> auto restStart = data.length-rest.length;
> auto pre = data[0..restStart];
> // we'll fetch it from the data instead of using the supplied
> // match to be consistent with findSplit
> auto dataMatch = data[restStart..restStart+match.length];
> auto post = rest[match.length..$];
> return tuple(pre, dataMatch, post);
> }
> }
> }
> return tuple(data, Range.init, Range.init);
> }
> unittest {
> auto text = "1\n2\r\n3\r4";
>
> auto res = text.findSplitAny("\r\n", "\n", "\r");
> assert(res[0] == "1");
> assert(res[1] == "\n");
> assert(res[2] == "2\r\n3\r4");
>
> res = res[2].findSplitAny("\r\n", "\n", "\r");
> assert(res[0] == "2");
> assert(res[1] == "\r\n");
> assert(res[2] == "3\r4");
>
> res = res[2].findSplitAny("\r\n", "\n", "\r");
> assert(res[0] == "3");
> assert(res[1] == "\r");
> assert(res[2] == "4");
>
> res = res[2].findSplitAny("\r\n", "\n", "\r");
> assert(res[0] == "4");
> assert(res[1] == "");
> assert(res[2] == "");
> }
I, for one, would like to see that in phobos...
Although it should probably be called findSplitAmong to be consistent with findAmong ;)
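Under that name, repeated calls are enough to walk a text line by line. A minimal sketch: findSplitAmong here is simply simendsjo's findSplitAny renamed, and eachLine is a hypothetical helper added only to show the loop.

```d
import std.algorithm : startsWith;
import std.range : empty, popFront;
import std.typecons : tuple;

// simendsjo's findSplitAny under the suggested name; not a Phobos function.
auto findSplitAmong(Range, Ranges...)(Range data, Ranges matches)
{
    auto rest = data;
    for (; !rest.empty; rest.popFront())
    {
        foreach (match; matches)
        {
            if (rest.startsWith(match))
            {
                auto restStart = data.length - rest.length;
                return tuple(data[0 .. restStart],
                             data[restStart .. restStart + match.length],
                             rest[match.length .. $]);
            }
        }
    }
    return tuple(data, Range.init, Range.init);
}

// Hypothetical helper: collect every line by splitting one off at a time.
// "\r\n" is listed first so it wins over a lone "\r".
string[] eachLine(string text)
{
    string[] lines;
    auto rest = text;
    while (!rest.empty)
    {
        auto parts = rest.findSplitAmong("\r\n", "\n", "\r");
        lines ~= parts[0];
        rest = parts[2];
    }
    return lines;
}

unittest
{
    assert(eachLine("1\n2\r\n3\r4") == ["1", "2", "3", "4"]);
}
```

Because the loop stops as soon as nothing is left, a trailing terminator does not produce an extra empty line at the end.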
|
June 24, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Chad J | Just found a follow-up post: http://dblog.aldacron.net/2012/06/24/my-only-gripes-about-d/ |
June 24, 2012 Re: phobos and splitting things... but not with whitespace. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Roman D. Boiko | On Sun, 24 Jun 2012 10:02:07 +0200, Roman D. Boiko <rb@d-coding.com> wrote:
> Just found a follow-up post: http://dblog.aldacron.net/2012/06/24/my-only-gripes-about-d/
Just found it myself. RSS for the win :)
I can't say I disagree. You have to read through several modules to find what you need: std.string, std.range, std.array, std.algorithm (and, hopefully soon, std.collection)
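To illustrate the scattering, here is a small sketch that pulls one splitting-related function from each of the four modules; the sample text is made up for the example.

```d
import std.algorithm : equal, findSplit; // generic search-and-split on ranges
import std.array : split;                // eager split on a separator
import std.range : take;                 // lazy "first n elements" adaptor
import std.string : splitLines;          // line-oriented string helper

unittest
{
    auto text = "a,b\nc,d";

    assert(text.splitLines() == ["a,b", "c,d"]);  // std.string
    assert(text.findSplit("\n")[0] == "a,b");     // std.algorithm
    assert("a,b".split(",") == ["a", "b"]);       // std.array
    assert(text.take(3).equal("a,b"));            // std.range
}
```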
|