Thread overview
Using regex to "sort" its matches
Apr 16, 2013
Linden Krouse
Apr 16, 2013
Dmitry Olshansky
Apr 16, 2013
Linden Krouse
April 16, 2013
Is there a way to use the regex library to put matches of different regexes or classes into different slices? For instance, if I had the regular expressions "(?<=#)\w+\b" and "(?<=%)\w+\b", could I use them to match a string at the same time, stop if the first one is found, and keep their results separate (as if the regex were "(?<=#)\w+\b|(?<=%)\w+\b" but gave one slice for "(?<=#)\w+\b" and a different one for "(?<=%)\w+\b")? I'm a little new to regexes, for the record.
April 16, 2013
16-Apr-2013 22:59, Linden Krouse wrote:
> Is there a way to use the regex library to put matches of different
> regexes or classes into different slices? For instance, if I had the
> regular expressions "(?<=#)\w+\b" and "(?<=%)\w+\b", could I use them to
> match a string at the same time, stop if the first one is found, and
> keep their results separate (as if the regex were
> "(?<=#)\w+\b|(?<=%)\w+\b" but gave one slice for "(?<=#)\w+\b" and a
> different one for "(?<=%)\w+\b")? I'm a little new to regexes, for the
> record.

For that particular case you can just as well use this pattern:
"(?<=#|%)\w+\b"

and then check by hand which character comes right before your match:

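import std.range.primitives : back;
import std.regex : match, regex;
// `input` stands for your string. The "g" flag iterates over every match,
// and the raw literal r"..." keeps \w and \b from being read as string escapes.
foreach (m; match(input, regex(r"(?<=#|%)\w+\b", "g"))) {
	if (m.pre.back == '#') {
		// put m.hit in one place
	} else if (m.pre.back == '%') {
		// put m.hit somewhere else
	}
}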
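
For reference, here is a minimal self-contained sketch of the loop above that actually fills two arrays of slices. The helper name sortMatches and its out parameters are made up for illustration; only match, regex, pre, hit and back come from Phobos. It assumes the lookbehind always leaves at least one character in m.pre, which this pattern guarantees.

```d
import std.range.primitives : back;
import std.regex : match, regex;

/// Hypothetical helper: collects the words that follow '#' and the
/// words that follow '%' into two separate arrays of slices.
void sortMatches(string input, out string[] hashWords, out string[] percentWords)
{
    // "g" asks for every match in the input, not just the first one.
    auto re = regex(r"(?<=#|%)\w+\b", "g");
    foreach (m; match(input, re))
    {
        // m.pre is everything before the match, so its last character
        // is the one the lookbehind accepted.
        if (m.pre.back == '#')
            hashWords ~= m.hit;
        else if (m.pre.back == '%')
            percentWords ~= m.hit;
    }
}

unittest
{
    string[] hashWords, percentWords;
    sortMatches("#foo %bar #baz", hashWords, percentWords);
    assert(hashWords == ["foo", "baz"]);
    assert(percentWords == ["bar"]);
}
```

The stored values are slices of the original string (m.hit does not copy), so this stays cheap even for long inputs.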

A more general interface for this is currently under discussion; in essence, it gives you the ability to switch over a bunch of regular expressions.

See:
https://github.com/D-Programming-Language/phobos/pull/1241

The machinery to efficiently combine regular expressions like that will come sometime later but the interface should stay the same.
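
Until then, something close to the "one regex, separate results" behaviour can be sketched with what std.regex already has: an alternation with one named group per alternative, then a check of which group took part in the match. To be clear, this is not the interface from the pull request; the helper name sortByGroup and the group names hash and percent are made up for illustration.

```d
import std.regex : match, regex;

/// Hypothetical helper: one pass over the input, one named group
/// per alternative, results kept in separate arrays.
void sortByGroup(string input, out string[] hashWords, out string[] percentWords)
{
    auto re = regex(r"#(?P<hash>\w+)|%(?P<percent>\w+)", "g");
    foreach (m; match(input, re))
    {
        // A group that did not take part in the match is empty,
        // and \w+ never matches an empty string.
        if (m["hash"].length)
            hashWords ~= m["hash"];
        else if (m["percent"].length)
            percentWords ~= m["percent"];
    }
}

unittest
{
    string[] hashWords, percentWords;
    sortByGroup("#foo %bar #baz", hashWords, percentWords);
    assert(hashWords == ["foo", "baz"]);
    assert(percentWords == ["bar"]);
}
```

Unlike the lookbehind version, the prefix character is consumed as part of the match here, so the word itself has to be read from the group rather than from m.hit.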

-- 
Dmitry Olshansky
April 16, 2013
On Tuesday, 16 April 2013 at 20:00:37 UTC, Dmitry Olshansky wrote:
> A more general interface for this is currently under discussion; in essence, it gives you the ability to switch over a bunch of regular expressions.
>
> See:
> https://github.com/D-Programming-Language/phobos/pull/1241
>
> The machinery to efficiently combine regular expressions like that will come sometime later but the interface should stay the same.

Nice, that will work well for what I'm trying to do once it gets added. Thanks.