Find regex in backward direction?
December 19, 2020
We have:
    dstring s = "abc3abc7";

Source:
    https://run.dlang.io/is/PtjN4T

Goal:
    size_t pos = findRegexBackward( r"abc"d );
    assert( pos == 4 );


How can I find a regex match in the backward direction?

December 19, 2020
On Saturday, 19 December 2020 at 12:52:54 UTC, Виталий Фадеев wrote:
> Goal:
>     size_t pos = findRegexBackward( r"abc"d );
>     assert( pos == 4 );


module LastOccurrence;

size_t findRegexBackward_1 (dstring s, dstring pattern)
{
   import std.regex : matchAll;
   auto results = matchAll (s, pattern);
   if (results.empty)
      throw new Exception ("could not match");
   size_t siz;
   foreach (rm; results)
      siz = rm.pre.length;
   return siz;
}

size_t findRegexBackward_2 (dstring s, dstring pattern)
// this does not work with irreversible patterns ...
{
   import std.regex : matchFirst;
   import std.array : array;
   import std.range: retro;
   auto result = matchFirst (s.retro.array, pattern.retro.array);
   if (result.empty)
      throw new Exception ("could not match");
   return result.post.length;
}

unittest {
   import std.exception : assertThrown;
   static foreach (f; [&findRegexBackward_1, &findRegexBackward_2]) {
      assert (f ("abc3abc7", r""d) == 8);
      assert (f ("abc3abc7", r"abc"d) == 4);
      assertThrown (f ("abc3abc7", r"abx"d));
      assert (f ("abababababab", r"ab"d) == 10);
   }
}
December 20, 2020
On Saturday, 19 December 2020 at 23:16:18 UTC, kdevel wrote:
> On Saturday, 19 December 2020 at 12:52:54 UTC, Виталий Фадеев wrote:
>> Goal:
>>     size_t pos = findRegexBackward( r"abc"d );
>>     assert( pos == 4 );
>
>
> module LastOccurrence;
>
> size_t findRegexBackward_1 (dstring s, dstring pattern)
> {
>    import std.regex : matchAll;
>    auto results = matchAll (s, pattern);
>    if (results.empty)
>       throw new Exception ("could not match");
>    size_t siz;
>    foreach (rm; results)
>       siz = rm.pre.length;
>    return siz;
> }
>
> size_t findRegexBackward_2 (dstring s, dstring pattern)
> // this does not work with irreversible patterns ...
> {
>    import std.regex : matchFirst;
>    import std.array : array;
>    import std.range: retro;
>    auto result = matchFirst (s.retro.array, pattern.retro.array);
>    if (result.empty)
>       throw new Exception ("could not match");
>    return result.post.length;
> }
>
> unittest {
>    import std.exception : assertThrown;
>    static foreach (f; [&findRegexBackward_1, &findRegexBackward_2]) {
>       assert (f ("abc3abc7", r""d) == 8);
>       assert (f ("abc3abc7", r"abc"d) == 4);
>       assertThrown (f ("abc3abc7", r"abx"d));
>       assert (f ("abababababab", r"ab"d) == 10);
>    }
> }

Thanks, but it is not perfect.

We can't reverse the pattern, because "ab\w" becomes "w\ba" (it is expected to match "abc", which reversed is "cba").

> size_t findRegexBackward_2 (dstring s, dstring pattern)
> ...
>    assert (f ("abc3abc7", r"ab\w"d) == 4);
> ...
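
To make the failure concrete, here is a minimal sketch. The helper `reverseNaively` is a hypothetical name, not part of the code above; it reverses character by character, which is the effect `retro` has on both the text and the pattern. The reversed pattern splits the escape `\w` into `w` followed by `\b` (a word boundary), so it no longer describes the reversed text.

```d
import std.regex : matchFirst;

/// Hypothetical helper: reverses a dstring one character at a time.
dstring reverseNaively(dstring text)
{
    dstring reversed;
    foreach_reverse (dchar c; text)
        reversed ~= c;
    return reversed;
}

unittest
{
    // The escape \w is torn apart: the reversed pattern reads w, \b, a.
    assert(reverseNaively(r"ab\w"d) == r"w\ba"d);
    // The reversed text contains "cba", but the reversed pattern does not match it.
    assert(matchFirst(reverseNaively("abc3abc7"d), reverseNaively(r"ab\w"d)).empty);
}
```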

Of course, I am using matchAll, but it scans the whole text in the forward direction.

>   size_t findRegexBackward_1 (dstring s, dstring pattern)

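    import std.regex : matchAll;

    /**
     * Returns the start index of the last match of `needle` in `s`,
     * or size_t.max (the value that -1 converts to) if nothing matches.
     * The length of that match is stored in `matchedLength`.
     */
    size_t findRegexBackwardMatchCase( dstring s, dstring needle, out size_t matchedLength )
    {
        auto matches = matchAll( s, needle );
        if ( matches.empty )
        {
            return size_t.max;
        }
        else
        {
            // Walk every match in the forward direction and keep the last one.
            auto last = matches.front;
            foreach ( m; matches )
            {
                last = m;
            }
            matchedLength = last.hit.length;
            return last.pre.length;
        }
    }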

Thanks!
I am looking for the fastest solution.

Maybe something like the "RightToLeft" option of the .NET regex API:

https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regexoptions?view=net-5.0#System_Text_RegularExpressions_RegexOptions_RightToLeft

But how can this be done on Linux? Are the Microsoft regex engine and the regex engines on Linux identical?
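
I am not aware of a right-to-left option in std.regex, so the following is only a sketch of one possible workaround, not an equivalent of RightToLeft. It tries an anchored match at each start position, beginning at the end of the text, and stops at the first hit, so it does not scan the whole text when the last match is near the end. The name `findRegexBackwardScan` and the `^(?:...)` wrapping are assumptions of this sketch. Note two caveats: the result can differ from the matchAll version when matches could overlap (for "a+" on "aaa" this returns 2, not 0), and patterns that look behind the start position (such as `\b`) see only the slice.

```d
import std.regex : matchFirst, regex;

/// Hypothetical helper: returns the rightmost start index at which `pattern`
/// matches, or size_t.max if there is no match.
size_t findRegexBackwardScan(dstring s, dstring pattern, out size_t matchedLength)
{
    // Anchor the pattern so that it can only match at the start of a slice.
    auto re = regex("^(?:"d ~ pattern ~ ")"d);

    // Try start positions s.length, s.length - 1, ..., 0.
    for (size_t i = s.length + 1; i-- > 0; )
    {
        auto m = matchFirst(s[i .. $], re);
        if (!m.empty)
        {
            matchedLength = m.hit.length;
            return i;
        }
    }
    return size_t.max;
}
```

The worst case (no match at all) costs one match attempt per position, so this only pays off when the last match tends to be close to the end of the text.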

December 20, 2020
On Sunday, 20 December 2020 at 04:33:21 UTC, Виталий Фадеев wrote:
> On Saturday, 19 December 2020 at 23:16:18 UTC, kdevel wrote:
>> On Saturday, 19 December 2020 at 12:52:54 UTC, Виталий Фадеев wrote:
> ...

"retro" is possible only for a simple expression such as "abc".
For a complex one such as "ab\w" or "(?P<name>regex)", I think the pattern would have to be parsed into tokens first: [ "a", "b", "\w" ], [ "(", "?", "P", "<name>", "regex", ")" ], ...
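
A minimal sketch of that token idea, under a strong assumption: the pattern consists only of literal characters and two-character backslash escapes such as `\w` or `\d`. Groups, quantifiers, character classes and named captures are not handled and would need a real pattern parser. Both function names are hypothetical.

```d
import std.array : array;
import std.range : retro;
import std.regex : matchFirst;

/// Hypothetical helper: reverses the order of tokens, where a token is either
/// a single character or a backslash together with the character after it.
dstring reverseSimplePattern(dstring pattern)
{
    dstring reversed;
    for (size_t i = 0; i < pattern.length; )
    {
        // Keep an escape such as \w together as one token.
        size_t len = (pattern[i] == '\\' && i + 1 < pattern.length) ? 2 : 1;
        reversed = pattern[i .. i + len] ~ reversed;
        i += len;
    }
    return reversed;
}

/// Hypothetical helper: finds the first match of the token-reversed pattern
/// in the reversed text and converts it back to an index into `s`.
size_t findRegexBackwardReversed(dstring s, dstring pattern)
{
    auto result = matchFirst(s.retro.array, reverseSimplePattern(pattern));
    if (result.empty)
        throw new Exception("could not match");
    // Everything after the hit in the reversed text precedes it in `s`.
    return result.post.length;
}
```

With this, "ab\w" becomes "\wba", which matches "cba" in the reversed text.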

up.