[Issue 18600] Regex performance enhancement for repeated matchFirst calls
May 14, 2018
Jon Degenhardt
March 22, 2018
https://issues.dlang.org/show_bug.cgi?id=18600

--- Comment #1 from github-bugzilla@puremagic.com ---
Commit pushed to master at https://github.com/dlang/phobos

https://github.com/dlang/phobos/commit/4318073f40ae3e82ac1847da5e037ab2f091d6fc
Fix issue 18600 Revamp std.regex caching for matchFirst case

--
March 22, 2018
https://issues.dlang.org/show_bug.cgi?id=18600

github-bugzilla@puremagic.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--
May 14, 2018
https://issues.dlang.org/show_bug.cgi?id=18600

--- Comment #2 from Jon Degenhardt <jrdemail2000-dlang@yahoo.com> ---
Phobos PR 6268 was included in LDC 1.10.0-beta1. With this release, the standard benchmark I used for the TSV Utilities improved as follows:

LDC 1.7.0 (before regression):     8.37 seconds
LDC 1.8.0 (after regression):     10.01 seconds
LDC 1.9.0 (first fixes):           9.44 seconds
LDC 1.10.0-beta1 (Phobos PR 6268):  5.85 seconds

First fixes: Phobos PR 5981, DMD PR 7599

The benchmark reads a TSV file line by line and checks individual fields for regex matches. A significant share of the processing time is I/O, so the percentage gain on the regex portion is higher than the overall gain. The overall gain relative to LDC 1.7.0 is 30% (8.37 seconds down to 5.85 seconds).
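
The arithmetic behind these figures can be checked with a short script. This is only a sketch in Python: the timings are copied from the list above, while `io_seconds` is a made-up value chosen purely to illustrate why the gain on the regex portion comes out larger than the overall gain. It is not a measurement from the benchmark.

```python
# Timings in seconds, copied from the benchmark results above.
timings = {
    "LDC 1.7.0": 8.37,
    "LDC 1.8.0": 10.01,
    "LDC 1.9.0": 9.44,
    "LDC 1.10.0-beta1": 5.85,
}


def reduction(before, after):
    """Fractional reduction in run time going from `before` to `after`."""
    return (before - after) / before


baseline = timings["LDC 1.7.0"]
latest = timings["LDC 1.10.0-beta1"]

# Overall gain relative to the pre-regression release.
overall = reduction(baseline, latest)

# ASSUMPTION: a fixed non-regex (I/O) cost per run. This number is
# hypothetical and was not reported in the thread.
io_seconds = 4.0
regex_only = reduction(baseline - io_seconds, latest - io_seconds)

print(f"overall gain vs LDC 1.7.0: {overall:.1%}")
print(f"regex-only gain, assuming {io_seconds} s of I/O: {regex_only:.1%}")
```

Whatever fixed cost is assumed, subtracting it from both timings leaves the absolute saving unchanged while shrinking the denominator, so the regex-only percentage is always larger than the overall one.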

The test was run on macOS on a Mac mini with 16 GB RAM and SSD drives. The file used was
2.7 GB, 14 million lines. Test info can be found here:
https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ComparativeBenchmarks2018.md

Great result!

--