December 07, 2003
Recently Achim Schmitt pointed out[1] that performing multiple "match()" calls using the same regular expression on different strings had unexpected results.  This is due to the index of the last match not being reset to 0 when the input string changes (see my previous reply[2] for a longer, possibly more confusing, explanation).

In an attempt to fix this behaviour I edited the "regexp.d" file such that "input" became a class property with a simple getter and setter. This worked, sort of. Whilst it fixed the problem of the match index not being reset when the input string changes, it has also caused a rather strange segmentation fault to arise.
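
To illustrate the idea, here is a minimal sketch of that kind of setter. It is not the code from the attached diff: the "Matcher" class, "m_lastIndex" and "matchNext" are made-up names, the "matching" is just a toy search for the letter 'a', and it is written in present-day D syntax rather than that of DMD 0.76. Only "input" and "m_input" come from the actual change.

```d
import std.stdio;

// Hypothetical stand-in for the matcher state; NOT the real regexp.d code.
class Matcher
{
    private string m_input;     // backing field for the "input" property
    private size_t m_lastIndex; // assumed name: where the next search starts

    // Getter: plain read of the backing field.
    @property string input() { return m_input; }

    // Setter: storing a new string also rewinds the search position,
    // so a match on a new string cannot start from a stale index.
    @property void input(string s)
    {
        m_input = s;
        m_lastIndex = 0;
    }

    // Toy "match": look for the next 'a' at or after m_lastIndex.
    // On success, move m_lastIndex just past the hit and return true.
    bool matchNext()
    {
        foreach (i; m_lastIndex .. m_input.length)
        {
            if (m_input[i] == 'a')
            {
                m_lastIndex = i + 1;
                return true;
            }
        }
        return false;
    }

    size_t lastIndex() { return m_lastIndex; }
}

void main()
{
    auto m = new Matcher;

    m.input = "xxxxa";
    assert(m.matchNext());        // hit at index 4, so lastIndex becomes 5

    m.input = "ab";               // the setter resets lastIndex to 0
    assert(m.lastIndex() == 0);
    assert(m.matchNext());        // found, because the search restarts at 0

    writeln("ok");
}
```

If "input" were a plain field instead, the second search in this toy would start at index 5, past the end of "ab", and find nothing, which is the same kind of stale-index behaviour described above.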

When running the unit tests a segmentation fault occurred when testing the following regular expression:

 > r = new RegExp("<(\\/)?([^<>]+)>", null);
 > result = r.split("a<b>font</b>bar<TAG>hello</TAG>");

To try and get a better idea of what was going on I recompiled with "-debug=regexp".  Instead of giving me an idea of where the fault was, it actually "fixed" it: the unit test no longer caused a seg fault.

This led to a bit of investigation which turned up the following two workarounds:

     1. Change any references to "input" in the "testmatch" function
        /after/ "case REend:" to "m_input".
     2. Insert a function call (any function call) before the return
        statement in "case REend:".

I'm at a bit of a loss as to where to go from here.  There's obviously something wrong, whether it's something I'm doing or otherwise, but I'll be damned if I can figure out what it is.  Has anyone got any ideas?

Attached is a diff against the "regexp.d" file that comes with DMD 0.76 that implements the changes I've made.  You'll notice the following line:

 > void doNothing(){}; doNothing();

Without that line the seg fault described above occurs; with it everything works OK.  Any suggestions anyone can offer would be appreciated.


[1] http://www.digitalmars.com/drn-bin/wwwnews?D/19959
[2] http://www.digitalmars.com/drn-bin/wwwnews?D/19967