Thread overview
Help with Regular Expressions (std.regex)
Mar 03, 2019
Samir
Mar 03, 2019
user1234
Mar 03, 2019
user1234
Mar 04, 2019
Samir
Mar 04, 2019
dwdv
Mar 05, 2019
Samir
March 03, 2019
I am belatedly working my way through the 2018 edition of the Advent of Code[1] programming challenges using D and am stumped on Problem 3[2].  The challenge requires you to parse a set of lines in the format:
#99 @ 652,39: 24x23
#100 @ 61,13: 15x24
#101 @ 31,646: 16x28

I would like to store each number (match) as an element in an array so that I can refer to them by index.  For example, for the first line:

m = [99, 652, 39, 24, 23]
assert(m[0] == 99);
assert(m[1] == 652);
// ...
assert(m[4] == 23);

What is the best way to do this?  (I will worry about converting characters to integers later.)

I have the following solution so far based on reading Dmitry Olshansky's article on std.regex[3] and the std.regex documentation[4]:

import std.stdio;
import std.regex;

void main() {
    auto line    = "#99 @ 652,39: 24x23";
    auto pattern = regex(r"\d+");
    auto m       = matchAll(line, pattern);
    writeln(m);
}

which results in:
[["99"], ["652"], ["39"], ["24"], ["23"]]

But the result doesn't seem to be an indexable array, as changing writeln(m) to writeln(m[0]) yields
Error: no [] operator overload for type RegexMatch!string

Changing the line to writeln(m.front[0]) yields
99

but m.front doesn't allow me to access other elements (e.g. m.front[1]):
requested submatch number 1 is out of range
----------------
??:? _d_assert_msg [0x4dc27a]
??:? inout pure nothrow @trusted inout(immutable(char)[]) std.regex.Captures!(immutable(char)[]).Captures.opIndex!().opIndex(ulong) [0x4d8d57]
??:? _Dmain [0x49ffc8]

I've tried something like
foreach (m; matchAll(line, pattern))
    writeln(m.hit);

which is close but doesn't result in an array.  Do I need to use matchFirst?

Thanks in advance.
Samir

[1] https://adventofcode.com/2018
[2] https://adventofcode.com/2018/day/3
[3] https://dlang.org/articles/regular-expression.html
[4] https://dlang.org/phobos/std_regex.html
March 03, 2019
On Sunday, 3 March 2019 at 18:07:57 UTC, Samir wrote:
> I am belatedly working my way through the 2018 edition of the Advent of Code[1] programming challenges using D and am stumped on Problem 3[2].  The challenge requires you to parse a set of lines in the format:
> #99 @ 652,39: 24x23
> #100 @ 61,13: 15x24
> #101 @ 31,646: 16x28

Hello, something like this should work:

  import std.array: array;
  auto allMatches = matchAll(line, pattern).array;

or (sorry, I don't have the regex API in mind):

  import std.array: array;
  import std.algorithm.iteration : map;
  auto allMatches = matchAll(line, pattern).map(a => a.hit).array;


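What happened with `writeln` is that it iterates the `matchAll` result, which is a lazy input range and not an array. That is why the output looks like an array even though the result cannot be indexed with `[]`. `.array` consumes the range and stores the results in an array.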
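
To make the difference concrete, here is a minimal sketch of the first suggestion. The helper name `allCaptures` is made up for illustration. Each element of the resulting array is a `Captures` object, and because the pattern `\d+` has no parenthesised groups, each one holds only the full match at index 0. That would also explain the "requested submatch number 1 is out of range" failure on `m.front[1]`.

```d
import std.array : array;
import std.regex : matchAll, regex;
import std.stdio : writeln;

// Hypothetical helper: eagerly collect every match of \d+ in `line`.
auto allCaptures(string line)
{
    auto pattern = regex(r"\d+");
    // matchAll is lazy; .array walks it once and stores each Captures.
    return matchAll(line, pattern).array;
}

void main()
{
    auto m = allCaptures("#99 @ 652,39: 24x23");
    writeln(m.length);  // number of matches found in the line
    writeln(m[1].hit);  // full text of the second match
    writeln(m[1][0]);   // index 0 of a Captures is the same full match
}
```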
March 03, 2019
On Sunday, 3 March 2019 at 18:32:14 UTC, user1234 wrote:
> On Sunday, 3 March 2019 at 18:07:57 UTC, Samir wrote:
> or  // sorry i don't have the regex API in mind
>
>   import std.array: array;
>   import std.algorithm.iteration : map;
>   auto allMatches = matchAll(line, pattern).map(a => a.hit).array;

oops forgot the bang

  auto allMatches = matchAll(line, pattern).map!(a => a.hit).array;
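
On the `matchFirst` question from the original post: it can work too, provided the pattern describes the whole line and wraps each number in a capture group. The following is only a sketch under that assumption. The helper name `parseClaim` and the exact pattern are illustrative and not taken from the thread.

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.conv : to;
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: parse one claim line into [id, x, y, width, height].
int[] parseClaim(string line)
{
    auto pattern = regex(r"#(\d+) @ (\d+),(\d+): (\d+)x(\d+)");
    auto c = matchFirst(line, pattern);
    if (c.empty)
        return null; // the line did not match the expected format
    // c[0] is the whole match; c[1] .. c[5] are the five groups.
    return [c[1], c[2], c[3], c[4], c[5]].map!(s => s.to!int).array;
}

void main()
{
    writeln(parseClaim("#99 @ 652,39: 24x23")); // [99, 652, 39, 24, 23]
}
```

With groups in the pattern, indices above 0 are valid on the `Captures` object, unlike with the plain `\d+` pattern.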


March 04, 2019
On Sunday, 3 March 2019 at 19:27:17 UTC, user1234 wrote:
> oops forgot the bang
>
>   auto allMatches = matchAll(line, pattern).map!(a => a.hit).array;

Thanks, user1234!  Looks like `map` is another topic I need to read up on.  I slightly modified your suggestion and went with:

auto allMatches = matchAll(line, pattern).map!(a => to!int(a.hit)).array;

which also takes care of converting the string to int.
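
For anyone finding this later, a self-contained version of that line might look like the sketch below. The imports are spelled out because `to` lives in std.conv, and the wrapper function `numbers` is a hypothetical name used only for this example.

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.conv : to;
import std.regex : matchAll, regex;
import std.stdio : writeln;

// Hypothetical wrapper around the one-liner: every run of digits as an int.
int[] numbers(string line)
{
    auto pattern = regex(r"\d+");
    return matchAll(line, pattern).map!(a => to!int(a.hit)).array;
}

void main()
{
    auto m = numbers("#99 @ 652,39: 24x23");
    assert(m[0] == 99);
    assert(m[1] == 652);
    assert(m[4] == 23);
    writeln(m); // [99, 652, 39, 24, 23]
}
```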

Samir

March 04, 2019
On 3/3/19 7:07 PM, Samir via Digitalmars-d-learn wrote:
> I am belatedly working my way through the 2018 edition of the Advent of Code[1] programming challenges using D and am stumped on Problem 3[2].  The challenge requires you to parse a set of lines in the format:
> #99 @ 652,39: 24x23
> #100 @ 61,13: 15x24
> #101 @ 31,646: 16x28
> 
> I would like to store each number (match) as an element in an array so that I can refer to them by index.

There is also std.file.slurp which makes this quite easy:
slurp!(int, int, int, int, int)("03.input", "#%d @ %d,%d: %dx%d");

You can then later expand the matches in a loop and process the claims:
foreach(id, offX, offY, width, height; ...
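
A rough, self-contained sketch of the slurp approach follows. Since `slurp` reads from a file, the example first writes a few sample lines to a scratch file. The helper name `readClaims` and the scratch file name are assumptions made for illustration only. `slurp` returns an array of tuples, so the fields are accessed by position here instead of guessing at the elided loop above.

```d
import std.file : remove, slurp, tempDir, write;
import std.path : buildPath;
import std.stdio : writeln;

// Hypothetical helper: write `text` to a scratch file and slurp it back
// as an array of (id, x, y, width, height) tuples.
auto readClaims(string text)
{
    auto path = buildPath(tempDir, "claims_example.input"); // assumed name
    write(path, text);
    scope (exit) remove(path); // clean up the scratch file afterwards
    return slurp!(int, int, int, int, int)(path, "#%d @ %d,%d: %dx%d");
}

void main()
{
    auto claims = readClaims("#99 @ 652,39: 24x23\n#100 @ 61,13: 15x24\n");
    foreach (claim; claims)
    {
        // claim[0] is the id; claim[3] and claim[4] are width and height.
        writeln("claim ", claim[0], " covers ", claim[3] * claim[4], " cells");
    }
}
```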
March 05, 2019
On Monday, 4 March 2019 at 18:57:34 UTC, dwdv wrote:
> There is also std.file.slurp which makes this quite easy:
> slurp!(int, int, int, int, int)("03.input", "#%d @ %d,%d: %dx%d");

That's brilliant!  This language just keeps putting a smile on my face every time I learn something new like this!