[Issue 8203] New: Use of std.regex.match() generates "not enough preallocated memory" error
June 06, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8203

           Summary: Use of std.regex.match() generates "not enough
                    preallocated memory" error
           Product: D
           Version: D2
          Platform: x86_64
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: phshaffer@gmail.com


--- Comment #0 from phshaffer@gmail.com 2012-06-06 05:16:26 PDT ---
Created an attachment (id=1110)
File to Compare

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
June 06, 2012



--- Comment #1 from phshaffer@gmail.com 2012-06-06 05:17:42 PDT ---
Created an attachment (id=1111)
File to Compare

June 06, 2012



--- Comment #2 from phshaffer@gmail.com 2012-06-06 05:21:26 PDT ---
Created an attachment (id=1112)
Source File

June 06, 2012



--- Comment #3 from phshaffer@gmail.com 2012-06-06 05:23:23 PDT ---
Created an attachment (id=1113)
Console Screenshot with Error Showing

June 06, 2012



--- Comment #4 from phshaffer@gmail.com 2012-06-06 05:43:36 PDT ---
Dmitry Olshansky recommended I submit this as a bug.

The program is executed as: icomp2 fold.txt fnew.txt

It should search fold.txt for certain text patterns and then check whether all "found" text also appears in fnew.txt. fold.txt and fnew.txt are identical, so all "found" text should appear in fnew.txt as well.

I added some diagnostic loop counters for troubleshooting: writeln(cntOld,"  ",cntNew,"  ",matchOld.hit,"  ",matchNew.hit);

As the screenshot shows, after several iterations it crashes with -> core.exception.AssertError@C:\D\dmd2\windows\bin\..\..\src\phobos\std\regex.d(6050): not enough preallocated memory

June 06, 2012


Dmitry Olshansky <dmitry.olsh@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh@gmail.com


--- Comment #5 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-06-06 06:13:01 PDT ---
(In reply to comment #4)
> Dmitry Olshansky recommended I submit this as a bug.
> 

Yup, 'cause I'm the only one likely to fix it, at least in the near future ;)

> The program is executed as: icomp2 fold.txt fnew.txt
> 
> It should search fold.txt for certain text patterns and then check whether all "found" text also appears in fnew.txt. fold.txt and fnew.txt are identical, so all "found" text should appear in fnew.txt as well.
> 
> I added some diagnostic loop counters for troubleshooting: writeln(cntOld,"  ",cntNew,"  ",matchOld.hit,"  ",matchNew.hit);
> 
> As the screenshot shows, after several iterations it crashes with -> core.exception.AssertError@C:\D\dmd2\windows\bin\..\..\src\phobos\std\regex.d(6050): not enough preallocated memory

Thanks, I'm on it. We'd better get it fixed in 2.060.

June 07, 2012



--- Comment #6 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-06-07 04:35:32 PDT ---
I've studied it a bit, and here are the details:
it only happens when re-running the same match object many times:

foreach(v; match(...)) // no bug
vs
auto m = match(...);
foreach(v; m) // does run out of memory

In your case I see from the comments that you try hard to do eager evaluation:
first find all matches, then work through the two arrays of them. Yet that's not
what the program does. It still performs N*M regex searches, because
auto uniCapturesNew = match(uniFileOld, regex(...));

just starts the engine and finds the 1st match. Then you copy the engine state on each iteration of the nested loop (this copy operation is apparently the bogus part) and run the engine until all matches are found. Next iteration of the loop: another copy.

So in your case I strongly suggest this magic recipe, which works for all
lazy ranges:
auto allMatches = array(match(...));

and work with arrays from then on.
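
To make the recipe concrete, here is a minimal sketch of the eager approach. It is not the reporter's actual program: collectHits, oldText and newText are hypothetical names, and the two strings stand in for the contents of fold.txt and fnew.txt. Each search runs exactly once, and the nested comparison only touches plain string arrays.

```d
import std.algorithm : canFind, map;
import std.array : array;
import std.regex : match, regex;
import std.stdio : writeln;

// Hypothetical helper: run the search once and eagerly copy every
// matched substring out of the lazy match range into an array.
string[] collectHits(string text)
{
    auto re = regex(r"^NAME   = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)", "gm");
    return match(text, re).map!(m => m.hit).array;
}

void main()
{
    // Assumed stand-ins for the two input files.
    string oldText = "\nNAME   = XPAW01_STA:STATION\nNAME   = XPAW01_STA\n";
    string newText = oldText;

    auto oldHits = collectHits(oldText);
    auto newHits = collectHits(newText);

    // Plain array work from here on: no regex engine state is copied.
    foreach (hit; oldHits)
        writeln(hit, newHits.canFind(hit) ? "  found" : "  MISSING");
}
```

Storing only the hit strings is a choice made for this sketch. If the named groups are needed later, array(match(...)) keeps the full captures instead.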


Anyway, the root cause is now clear and I've reduced it to:

import std.regex;
string data = "
NAME   = XPAW01_STA:STATION
NAME   = XPAW01_STA
";
// Main function
void main(){
    auto uniFileOld = data;
    auto uniCapturesNew = match(uniFileOld, regex(r"^NAME   = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)","gm"));

    for(int i=0; i<20; i++)
    { foreach (matchNew; uniCapturesNew) {} }
}
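
For anyone who wants to see the behaviour rather than just the crash, the reduced case could be instrumented roughly like this. It is only a sketch: countPerPass is a hypothetical helper, not something from Phobos or from the attached source. It walks the same stored match object repeatedly and records how many matches each pass sees.

```d
import std.regex : match, regex;
import std.stdio : writeln;

// Hypothetical helper: iterate one stored match object `passes` times
// and record the number of matches seen on each pass.
size_t[] countPerPass(string text, size_t passes)
{
    auto re = regex(r"^NAME   = (?P<comp>[a-zA-Z0-9_]+):*(?P<blk>[a-zA-Z0-9_]*)", "gm");
    auto m = match(text, re);   // starts the engine and finds the 1st match

    size_t[] counts;
    foreach (pass; 0 .. passes)
    {
        size_t n = 0;
        foreach (c; m)          // foreach works on a copy of m
            ++n;
        counts ~= n;
    }
    return counts;
}

void main()
{
    string data = "\nNAME   = XPAW01_STA:STATION\nNAME   = XPAW01_STA\n";
    writeln(countPerPass(data, 20));
}
```

On an affected build I would expect this to hit the same assertion after a few passes. If the copy is handled correctly, every pass should report the same count, since each foreach only consumes its own copy.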

June 07, 2012


Dmitry Olshansky <dmitry.olsh@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull
           Platform|x86_64                      |All
         OS/Version|Windows                     |All


--- Comment #7 from Dmitry Olshansky <dmitry.olsh@gmail.com> 2012-06-07 14:38:19 PDT ---
https://github.com/D-Programming-Language/phobos/pull/623

June 08, 2012



--- Comment #8 from github-bugzilla@puremagic.com 2012-06-08 01:07:21 PDT ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/0c35fcd694481753cebae9803906f6d857fe954f fix Issue 8203

Change RegexMatch objects to follow proper COW semantics

https://github.com/D-Programming-Language/phobos/commit/245782bb6393b4a415c0e1e93b8a05f448e1457f unittest for bug 8203

https://github.com/D-Programming-Language/phobos/commit/f1757b88fa2fda9f5db74493be762c058d3e0111 Merge pull request #623 from blackwhale/nested-regex

fix Issue 8203 std.regex.match() generates "not enough preallocated memory"

June 08, 2012



--- Comment #9 from github-bugzilla@puremagic.com 2012-06-08 01:17:51 PDT ---
Commit pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/065e7a1f78f176b988820b0a54e22d8eb9d59819 Updated changelog for fix to issue 8203.
