Splitting a string on multiple tokens
October 10, 2012
Is there an effective way of splitting a string with a set of tokens? Splitter feels rather limited, and multiple passes give you an array of arrays of strings rather than an array of strings. I'm not sure whether I'm missing an obvious application of library methods or whether this is absent.
October 10, 2012
On Wednesday, 10 October 2012 at 00:18:17 UTC, ixid wrote:
> Is there an effective way of splitting a string with a set of tokens? Splitter feels rather limited, and multiple passes give you an array of arrays of strings rather than an array of strings. I'm not sure whether I'm missing an obvious application of library methods or whether this is absent.

You can use std.regex.splitter like this:

import std.regex, std.stdio; // needed for regex, splitter and writeln

auto r = regex(`,| |(--)`);
auto str = "string we,want--to,split";
writeln(splitter(str, r)); // will print ["string", "we", "want", "to", "split"]
October 11, 2012
On Wednesday, 10 October 2012 at 02:21:05 UTC, jerro wrote:
> On Wednesday, 10 October 2012 at 00:18:17 UTC, ixid wrote:
>> Is there an effective way of splitting a string with a set of tokens? Splitter feels rather limited, and multiple passes give you an array of arrays of strings rather than an array of strings. I'm not sure whether I'm missing an obvious application of library methods or whether this is absent.
>
> You can use std.regex.splitter like this:
>
> auto r = regex(`,| |(--)`);
> auto str = "string we,want--to,split";
> writeln(splitter(str, r)); // will print ["string", "we", "want", "to", "split"]

Thank you, though that removes the tokens, and since they vary they would be messy to put back. Is there a way to cut on tokens and keep each token at the end of the piece it terminates? These seem like basic parsing features that are absent.

October 11, 2012
On 11-Oct-12 06:40, ixid wrote:
> On Wednesday, 10 October 2012 at 02:21:05 UTC, jerro wrote:
>> On Wednesday, 10 October 2012 at 00:18:17 UTC, ixid wrote:
>>> Is there an effective way of splitting a string with a set of tokens?
>>> Splitter feels rather limited, and multiple passes give you an array
>>> of arrays of strings rather than an array of strings. I'm not sure if
>>> I'm missing an obvious application of library methods or if this is
>>> absent.
>>
>> You can use std.regex.splitter like this:
>>
>> auto r = regex(`,| |(--)`);
>> auto str = "string we,want--to,split";
>> writeln(splitter(str, r)); // will print ["string", "we", "want", "to",
>> "split"]
>
> Thank you, though that removes the tokens, and since they vary they
> would be messy to put back. Is there a way to cut on tokens and keep
> each token at the end of the piece it terminates? These seem like
> basic parsing features that are absent.
>
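Well, I guess something along these lines:
// The lookbehind (?<=...) matches the empty position right after a token,
// so the split happens there and the token stays with the preceding piece.
auto r = regex(`(?<=,| |(--))`);
auto str = "string we,want--to,split";
writeln(splitter(str, r));
// will print: ["string ", "we,", "want--", "to,", "split"]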

And just in case - splitter doesn't copy anything, it just slices the original array.
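
One way to check the no-copy behaviour for yourself is to compare pointers: if each piece lies inside the memory of the original string, it is a slice rather than a copy. The sketch below assumes that approach; `isSliceOf` is a hypothetical helper written for this example, not something from Phobos.

```d
import std.regex : regex, splitter;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): true if `piece` lies
// entirely within the memory occupied by `whole`.
bool isSliceOf(string piece, string whole)
{
    return piece.ptr >= whole.ptr
        && piece.ptr + piece.length <= whole.ptr + whole.length;
}

void main()
{
    auto str = "string we,want--to,split";
    foreach (piece; splitter(str, regex(`,| |(--)`)))
    {
        // Each piece should point into `str` rather than into a fresh copy.
        writeln(piece, " -> ", isSliceOf(piece, str));
    }
}
```

Because the pieces share memory with `str`, they stay valid only as long as the original string does, which is automatic for garbage-collected strings.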

If you meant to use this in tight loops, as in a compiler, then you really need something handcrafted for optimal speed.
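
As a rough idea of what "handcrafted" could look like, here is a minimal sketch of a single-pass splitter that keeps each token at the end of the piece it terminates and returns slices of the input. The name `splitKeepingTokens` and the longest-match rule are assumptions made for this example; it is not tuned or benchmarked, and a real lexer would likely dispatch on the current character instead of looping over all tokens.

```d
import std.stdio : writeln;

// Hypothetical helper: splits `input` after every occurrence of any token,
// keeping the token attached to the piece before it. Returns slices only.
string[] splitKeepingTokens(string input, string[] tokens)
{
    string[] pieces;
    size_t start = 0; // beginning of the current piece
    size_t i = 0;     // current scan position (in code units)
    while (i < input.length)
    {
        // Find the longest token that matches at position i, if any.
        size_t matched = 0;
        foreach (tok; tokens)
        {
            if (tok.length > matched
                && i + tok.length <= input.length
                && input[i .. i + tok.length] == tok)
            {
                matched = tok.length;
            }
        }
        if (matched > 0)
        {
            i += matched;
            pieces ~= input[start .. i]; // piece ends with the token
            start = i;
        }
        else
        {
            ++i;
        }
    }
    if (start < input.length)
        pieces ~= input[start .. $]; // trailing piece without a token
    return pieces;
}

void main()
{
    auto str = "string we,want--to,split";
    writeln(splitKeepingTokens(str, [",", " ", "--"]));
    // prints ["string ", "we,", "want--", "to,", "split"]
}
```

Only the result array is allocated here; the pieces themselves point into the original string, just as with splitter.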

-- 
Dmitry Olshansky