[Issue 5674] New: AssertError in std.regex
March 01, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5674

           Summary: AssertError in std.regex
           Product: D
           Version: D2
          Platform: Other
        OS/Version: Mac OS X
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: doob@me.com


--- Comment #0 from Jacob Carlborg <doob@me.com> 2011-03-01 08:14:43 PST ---
The following code results in an AssertError or a RangeError (I don't know whether the RangeError is expected behavior):

import std.regex;
import std.stdio;

void main ()
{
    auto m = "abc".match(`a(\w)b`);

    writeln(m.hit); // AssertError in regex.d:1795
    writeln(m.captures); // RangeError in regex.d:1719
}

Can't "hit" just return an empty string and "captures" an empty range?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
March 31, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5674


Magnus Lie Hetland <magnus@hetland.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |magnus@hetland.org


--- Comment #1 from Magnus Lie Hetland <magnus@hetland.org> 2011-03-31 07:09:07 PDT ---
I have similar problems with stuff like this:

import std.stdio, std.regex;
void main() {
    foreach (m; match("abc", "a|(x)")) {
        foreach (e; m.captures) {
            writeln(e);
        }
    }
}

Here it prints out "a" and then I get a range violation. Whether or not m.captures[1] exists, iterating over m.captures should be possible?

Also: Checking whether m.captures[1] exists would be highly useful -- to see what has matched. (Doing this by length wouldn't work in general, of course.)

April 06, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5674


Matt Peterson <revcompgeek@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |revcompgeek@gmail.com


--- Comment #2 from Matt Peterson <revcompgeek@gmail.com> 2011-04-06 11:41:18 PDT ---
After some debugging, it looks like Captures is looking for the first unmatched group and stopping there when giving the length of the captures, which I believe is the cause of the assert error.

The second problem is that when a group is unmatched, its startIdx and endIdx are stored as size_t.max, so when Captures.front/opIndex or RegexMatch.hit tries to slice the input with those values, it causes a range violation. Most regex engines handle this by returning null if a group is unmatched.

I'll try to submit a patch soon if I get it working.
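
The sentinel problem described above can be illustrated without std.regex. The following is only a minimal sketch: the Group struct and the sliceOrNull helper are hypothetical names invented for illustration, not the actual std.regex internals or the contents of any patch. It assumes only what the comment states, namely that unmatched bounds are stored as size_t.max.

```d
// Hypothetical stand-in for a capture group's bounds; NOT std.regex code.
struct Group
{
    size_t startIdx = size_t.max; // size_t.max marks "unmatched"
    size_t endIdx = size_t.max;

    bool matched() const { return startIdx != size_t.max; }
}

// Returns the slice for a group, or null if the group did not participate.
string sliceOrNull(string input, Group g)
{
    return g.matched ? input[g.startIdx .. g.endIdx] : null;
}

void main()
{
    string input = "abc";
    auto whole = Group(0, 1); // a group that matched "a"
    Group unmatched;          // both bounds default to size_t.max

    assert(sliceOrNull(input, whole) == "a");
    assert(sliceOrNull(input, unmatched) is null);
    // Slicing directly with the sentinel values, as in
    // input[unmatched.startIdx .. unmatched.endIdx], is out of bounds
    // and fails the bounds check at run time.
}
```

Checking the sentinel before slicing is what turns the range violation into a null result, which is the behavior attributed above to most regex engines.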

April 06, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5674



--- Comment #3 from Matt Peterson <revcompgeek@gmail.com> 2011-04-06 12:50:32 PDT ---
Created an attachment (id=939)
This patch fixes the problems with unmatched groups in a match.

April 20, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=5674


Dmitry Olshansky <dmitry.olsh@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh@gmail.com


--- Comment #4 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2011-04-20 04:04:53 PDT ---
(In reply to comment #3)
> Created an attachment (id=939) [details]
> This patch fixes the problems with unmatched groups in a match.

Actually, I'm working on fixing all of the issues in std.regex; see this pull request: https://github.com/D-Programming-Language/phobos/pull/22

There is a little problem with your patch.
If the match is empty (there are such regexes) or there is no match,
RegexMatch.hit still happily returns "". Maybe it's better to let it hit the assert
on no match, just as before, to enforce checking of empty.

February 24, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=5674


Dmitry Olshansky <dmitry.olsh@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX


--- Comment #5 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-02-24 12:04:19 PST ---
Things got a bit mixed up here, but the initial issue is a clean won't-fix, as it works as designed. One should test RegexMatch for empty just like any other range.

The second issue here was fixed with pull 22 for the previous version of std.regex, and never existed in the new one.
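
For the first issue, testing for empty before touching hit might look like the sketch below. The hitOrNote helper is a hypothetical name introduced here for illustration; match, regex, empty and hit are the std.regex names used in this thread. The pattern from the original report does not match "abc", because after `a` and `\w` consume "ab" the next character is "c" rather than "b".

```d
import std.regex;
import std.stdio;

// Hypothetical helper: returns the matched text, or a note if nothing matched.
string hitOrNote(string input, string pattern)
{
    auto m = match(input, regex(pattern));
    // Check the range for emptiness first; hit is only valid on a match.
    return m.empty ? "no match" : m.hit;
}

void main()
{
    writeln(hitOrNote("abc", `a(\w)b`)); // the input from the original report
    writeln(hitOrNote("abb", `a(\w)b`)); // an input that does match
}
```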
