[Issue 13532] std.regex performance (enums; regex vs ctRegex)
September 26, 2014
https://issues.dlang.org/show_bug.cgi?id=13532

hsteoh@quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hsteoh@quickfur.ath.cx

--- Comment #1 from hsteoh@quickfur.ath.cx ---
ctRegex is slower than regular regex?! Whoa. That just sounds completely wrong. What's the cause of this slowdown? I thought the whole point of ctRegex is to outperform runtime regex by making use of compile-time optimization. Whatever happened to that?? If this is the case, we might as well throw ctRegex away.

--
September 26, 2014
https://issues.dlang.org/show_bug.cgi?id=13532

Vladimir Panteleev <thecybershadow@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh@gmail.com

--- Comment #2 from Vladimir Panteleev <thecybershadow@gmail.com> ---
Well, it's slower for this particular case, not necessarily in general. CCing Dmitry.

--
September 27, 2014
https://issues.dlang.org/show_bug.cgi?id=13532

--- Comment #3 from Dmitry Olshansky <dmitry.olsh@gmail.com> ---
(In reply to Vladimir Panteleev from comment #0)
> The first surprise for me was that declaring a regex object (either Regex or StaticRegex) with "enum" was so much slower. It makes sense now that I think about it: creating a struct literal inside a loop will be more expensive than referencing one already residing somewhere in memory. Perhaps it might be worth mentioning in the documentation to avoid using enum with compiled regexes.

It's a common anti-pattern: the same issue arises with array literals and with anything else that takes some time to compute or allocates. A regex function call does both.

It's worth adding a note though, feel free to create a pull. I'm not sure I'll get to it soon.
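
As an illustration of the enum point (a minimal sketch, not code from this report): an `enum` is a manifest constant, so its initializer is substituted at every use site, while ordinary module-level or local data is created once and then referenced. The array case mentioned above is the easiest to observe. The names `enumTable`, `sharedTable` and `countDigitLines` are hypothetical, and the regex part only shows the hoisted form that avoids the cost; it is not a reconstruction of the benchmark.

```d
import std.regex : matchFirst, regex;

// Manifest constant: the array literal is materialized again
// (a fresh GC allocation) at every place the name is used.
enum int[] enumTable = [1, 2, 3];

// Ordinary module-level data: a single copy shared by every use.
immutable int[] sharedTable = [1, 2, 3];

/// Returns true if two separate uses of the enum yield distinct arrays.
bool enumAllocatesPerUse()
{
    auto a = enumTable;
    auto b = enumTable;
    return a.ptr !is b.ptr;
}

/// Returns true if two uses of the module-level array share storage.
bool sharedReusesStorage()
{
    auto a = sharedTable;
    auto b = sharedTable;
    return a.ptr is b.ptr;
}

/// Hypothetical helper: counts the lines that contain a run of digits.
/// The regex is built once, before the loop, instead of once per use.
size_t countDigitLines(string[] lines)
{
    auto re = regex(`[0-9]+`);
    size_t n;
    foreach (line; lines)
        if (!matchFirst(line, re).empty)
            ++n;
    return n;
}
```

Calling `countDigitLines(["abc", "a1", "42"])` builds the pattern once for all three lines; declaring the pattern as an `enum` and using it inside the loop would instead re-create the regex value on each iteration, which is the slowdown discussed here.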

(In reply to Vladimir Panteleev from comment #2)
> Well, it's slower for this particular case, not necessarily in general. CCing Dmitry.

That's right. The problem is the simple backtracking engine used by the CTFE version, which is an unfortunate historical point: I'd pick the other engine of the two if I could go back in time.

(In reply to hsteoh from comment #1)
> ctRegex is slower than regular regex?! Whoa. That just sounds completely wrong. What's the cause of this slowdown? I thought the whole point of ctRegex is to outperform runtime regex by making use of compile-time optimization. Whatever happened to that?? If this is the case, we might as well throw ctRegex away.

I'm fully aware of this. Unfortunately, adding yet another engine (a C-T "robust" engine) is increasingly a maintenance disaster.

Consider also that working on compile-time generated regex is a nightmare of ~5-10 minutes to run all tests and constant out of memory conditions. Duplicating the amount of work done at CTFE is something DMD CAN'T handle at the moment.

Another problem is that regex has accumulated a lot of technical debt and needs a serious amount of refactoring before piling up more stuff. Then, with a modular design (the one I roughly outlined in my talk), we can put more and more components into it. Sadly all of this goes very slooowly.

--
April 06, 2016
https://issues.dlang.org/show_bug.cgi?id=13532

--- Comment #4 from Dmitry Olshansky <dmitry.olsh@gmail.com> ---
(In reply to Vladimir Panteleev from comment #2)
> Well, it's slower for this particular case, not necessarily in general. CCing Dmitry.

Please try with https://github.com/D-Programming-Language/phobos/pull/4147

--
May 09, 2016
https://issues.dlang.org/show_bug.cgi?id=13532

--- Comment #5 from Vladimir Panteleev <thecybershadow@gmail.com> ---
(In reply to Dmitry Olshansky from comment #4)
> (In reply to Vladimir Panteleev from comment #2)
> > Well, it's slower for this particular case, not necessarily in general. CCing Dmitry.
> 
> Please try with https://github.com/D-Programming-Language/phobos/pull/4147

As before (all changes are at most 20% since 2014).

--
July 02, 2017
https://issues.dlang.org/show_bug.cgi?id=13532

--- Comment #6 from Vladimir Panteleev <dlang-bugzilla@thecybershadow.net> ---
2017 timings with LDC 1.2.0 (DMD v2.072.2, LLVM 4.0.0):

regexInline    7 secs, 342 ms, 775 μs, and 9 hnsecs
regexAuto    5 secs, 195 ms, and 526 μs
regexStatic    5 secs, 158 ms, 479 μs, and 2 hnsecs
regexEnum    18 secs, 777 ms, 420 μs, and 7 hnsecs
ctRegexInline    20 secs, 38 ms, and 25 μs
ctRegexAuto    6 secs, 16 ms, 155 μs, and 1 hnsec
ctRegexStatic    5 secs, 921 ms, 572 μs, and 3 hnsecs
ctRegexEnum    20 secs, 422 ms, 889 μs, and 4 hnsecs
reInline    5 secs, 84 ms, 943 μs, and 1 hnsec

--
July 21, 2017
https://issues.dlang.org/show_bug.cgi?id=13532

Vladimir Panteleev <dlang-bugzilla@thecybershadow.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://issues.dlang.org/show_bug.cgi?id=16457

--
September 05, 2017
https://issues.dlang.org/show_bug.cgi?id=13532

--- Comment #7 from Dmitry Olshansky <dmitry.olsh@gmail.com> ---
https://github.com/dlang/phobos/pull/5722

--
September 08, 2017
https://issues.dlang.org/show_bug.cgi?id=13532

Dmitry Olshansky <dmitry.olsh@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|nobody@puremagic.com        |dmitry.olsh@gmail.com

--
October 16, 2017
https://issues.dlang.org/show_bug.cgi?id=13532

--- Comment #8 from github-bugzilla@puremagic.com ---
Commits pushed to master at https://github.com/dlang/phobos

https://github.com/dlang/phobos/commit/a877469f07819fa26cd12248f11fd59cbea6563a
Fix issue 13532 - std.regex performance (enums; regex vs ctRegex)

https://github.com/dlang/phobos/commit/ad489989ec3fac9f65f4bb9d43d2254a0b718dc7
Merge pull request #5722 from DmitryOlshansky/regex-matcher-interfaces

std.regex: major internal redesign, also fixes issue 13532
merged-on-behalf-of: Andrei Alexandrescu <andralex@users.noreply.github.com>

--