Bye bye, fast compilation times
Feb 05, 2018
H. S. Teoh
February 05, 2018
One of my D projects for the past while has been taking unusually long times to compile.  This morning, I finally decided to sit down and figure out exactly why. What I found was rather disturbing:

------
import std.regex;
void main() {
	auto re = regex(``);
}
------

Compile command: time dmd -c test.d

Output:
------
real    0m3.113s
user    0m2.884s
sys     0m0.226s
------

Comment out the call to `regex()`, and I get:

------
real    0m0.285s
user    0m0.262s
sys     0m0.023s
------

Clearly, something is wrong if the mere act of compiling a regex causes a 4-line program to take *3 seconds* to compile, where normally dmd takes less than a second.

Apparently, the offending Phobos PR was merged late last year:

	https://issues.dlang.org/show_bug.cgi?id=18378

This is a serious slap in the face to dmd's reputation for super-fast compilation.  It makes our "fast code, fast" slogan look more and more ironic. :-(

(Note: this particular regression is in *compilation* times; it's not directly related to the *performance* of the regex code itself. The latter department has also suffered a regression; see for example: https://github.com/dlang/phobos/pull/5981.)


T

-- 
Little kids, little troubles.
February 06, 2018
On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
>
> Comment out the call to `regex()`, and I get:
>
> ------
> real    0m0.285s
> user    0m0.262s
> sys     0m0.023s
> ------
>

regex is not the only one I avoid..

how long you think this takes to compile?
(try ldc2 too ..just for laughs ;-)

----
import std.net.isemail;

void main()
{
    auto checkEmail = "someone@somewhere.com".isEmail();
}
----
February 06, 2018
On Tuesday, 6 February 2018 at 04:09:24 UTC, psychoticRabbit wrote:
> how long you think this takes to compile?
> (try ldc2 too ..just for laughs ;-)
>
> ----
> import std.net.isemail;
>
> void main()
> {
>     auto checkEmail = "someone@somewhere.com".isEmail();
> }
> ----

oh.. and for an even bigger laugh... -O -release  (ldc2 took ~10 seconds)


February 05, 2018
On 2/5/18 11:09 PM, psychoticRabbit wrote:
> On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
>>
>> Comment out the call to `regex()`, and I get:
>>
>> ------
>> real    0m0.285s
>> user    0m0.262s
>> sys     0m0.023s
>> ------
>>
> 
> regex is not the only one I avoid..
> 
> how long you think this takes to compile?
> (try ldc2 too ..just for laughs ;-)
> 
> ----
> import std.net.isemail;
> 
> void main()
> {
>      auto checkEmail = "someone@somewhere.com".isEmail();
> }
> ----

I was surprised at this, then I looked at the first line of isEmail:

    static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~
        `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[]));

So it's really still related to regex.

-Steve
February 06, 2018
On 06/02/2018 4:35 AM, Steven Schveighoffer wrote:
> On 2/5/18 11:09 PM, psychoticRabbit wrote:
>> On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
>>>
>>> Comment out the call to `regex()`, and I get:
>>>
>>> ------
>>> real    0m0.285s
>>> user    0m0.262s
>>> sys     0m0.023s
>>> ------
>>>
>>
>> regex is not the only one I avoid..
>>
>> how long you think this takes to compile?
>> (try ldc2 too ..just for laughs ;-)
>>
>> ----
>> import std.net.isemail;
>>
>> void main()
>> {
>>      auto checkEmail = "someone@somewhere.com".isEmail();
>> }
>> ----
> 
> I was surprised at this, then I looked at the first line of isEmail:
> 
>      static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~
>          `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[]));
> 
> So it's really still related to regex.
> 
> -Steve

On that note, we really should remove it. Performance aside, you cannot really trust it.
February 06, 2018
On Tuesday, 6 February 2018 at 04:35:42 UTC, Steven Schveighoffer wrote:
> On 2/5/18 11:09 PM, psychoticRabbit wrote:
>> On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
>>>
>>> Comment out the call to `regex()`, and I get:
>>>
>>> ------
>>> real    0m0.285s
>>> user    0m0.262s
>>> sys     0m0.023s
>>> ------
>>>
>> 
>> regex is not the only one I avoid..
>> 
>> how long you think this takes to compile?
>> (try ldc2 too ..just for laughs ;-)
>> 
>> ----
>> import std.net.isemail;
>> 
>> void main()
>> {
>>      auto checkEmail = "someone@somewhere.com".isEmail();
>> }
>> ----
>
> I was surprised at this, then I looked at the first line of isEmail:
>
>     static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~
>         `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[]));
>
> So it's really still related to regex.


That’s a really bad idea: isEmail is a template, so the burden of freaking slow ctRegex
is paid on a per-instantiation basis. It could be horrible with separate compilation.
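
To illustrate the per-instantiation cost, here is a minimal sketch. The names `expensive` and `check` are hypothetical stand-ins for ctRegex and isEmail, not Phobos code. The static initializer inside the template is evaluated at compile time once for each character type the template is instantiated with:

```d
// Hypothetical stand-in for an expensive compile-time computation
// (ctRegex in the real case).
size_t expensive(size_t n)
{
    size_t sum;
    foreach (i; 0 .. n)
        sum += i;
    return sum;
}

// Hypothetical stand-in for a template such as isEmail.
bool check(Char)(const(Char)[] s)
{
    // A static initializer must be known at compile time, so the compiler
    // runs expensive() via CTFE for every distinct Char.
    static immutable limit = expensive(1_000);
    return s.length < limit;
}

unittest
{
    assert(check("a"));   // instantiates check!char
    assert(check("a"w));  // instantiates check!wchar
    assert(check("a"d));  // instantiates check!dchar
}
```

With three character types in play, the compile-time work is done three times, and with separate compilation it may be repeated in every module that instantiates the template.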

> -Steve


February 06, 2018
On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
> One of my D projects for the past while has been taking unusually long times to compile.  This morning, I finally decided to sit down and figure out exactly why. What I found was rather disturbing:
>
> ------
> import std.regex;
> void main() {
> 	auto re = regex(``);
> }
> ------
>
> Compile command: time dmd -c test.d
>
> Output:
> ------
> real    0m3.113s
> user    0m2.884s
> sys     0m0.226s
> ------
>
> Comment out the call to `regex()`, and I get:
>
> ------
> real    0m0.285s
> user    0m0.262s
> sys     0m0.023s
> ------
>

> Clearly, something is wrong if the mere act of compiling a regex causes a 4-line program to take *3 seconds* to compile,

There is a fuckton of templates involved, plus a couple of tries are built at CTFE.
The regression is curious though, maybe something gets recomputed at CTFE over and over again.

> where normally dmd takes less than a second.

Honestly I’m tired to hell of working with our compiler and its compile time features. When it doesn’t pee itself due to OOM I’m almost happy.

In retrospect I should have just provided a C interface and compiled the whole thing separately. And CTFE could easily be replaced by a small custom JIT compiler, it would also work at run-time(!).

Especially considering that it’s been 6 years and it’s still not practical to use ctRegex.

> The latter department has also suffered a regression; see for example: https://github.com/dlang/phobos/pull/5981.)
>

Yup, Martin seems on top of it, thankfully.

>
> T


February 06, 2018
On 2/6/18 12:35 AM, Dmitry Olshansky wrote:
> On Tuesday, 6 February 2018 at 04:35:42 UTC, Steven Schveighoffer wrote:
>> On 2/5/18 11:09 PM, psychoticRabbit wrote:
>>> On Monday, 5 February 2018 at 21:27:57 UTC, H. S. Teoh wrote:
>>>>
>>>> Comment out the call to `regex()`, and I get:
>>>>
>>>> ------
>>>> real    0m0.285s
>>>> user    0m0.262s
>>>> sys     0m0.023s
>>>> ------
>>>>
>>>
>>> regex is not the only one I avoid..
>>>
>>> how long you think this takes to compile?
>>> (try ldc2 too ..just for laughs ;-)
>>>
>>> ----
>>> import std.net.isemail;
>>>
>>> void main()
>>> {
>>>      auto checkEmail = "someone@somewhere.com".isEmail();
>>> }
>>> ----
>>
>> I was surprised at this, then I looked at the first line of isEmail:
>>
>>     static ipRegex = ctRegex!(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}`~
>>         `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`.to!(const(Char)[]));
>>
>> So it's really still related to regex.
> 
> 
> That’s a really bad idea: isEmail is a template, so the burden of freaking slow ctRegex
> is paid on a per-instantiation basis. It could be horrible with separate compilation.

Obviously it is horrible. On my mac, it took about 2.5 seconds to compile this one line.

I'm not sure how to fix it though... I suppose you could make it 3 overloads, but this defeats a lot of the purpose of having templates in the first place.

-Steve
February 06, 2018
On Tuesday, 6 February 2018 at 05:45:35 UTC, Steven Schveighoffer wrote:
> On 2/6/18 12:35 AM, Dmitry Olshansky wrote:
>> 
>> That’s a really bad idea: isEmail is a template, so the burden of freaking slow ctRegex
>> is paid on a per-instantiation basis. It could be horrible with separate compilation.
>
> Obviously it is horrible. On my mac, it took about 2.5 seconds to compile this one line.
>
> I'm not sure how to fix it though... I suppose you could make

Just use the run-time version, it’s not that much slower. But then again static ipRegex = regex(...) will parse and build regex at CTFE.

Maybe lazy init?

> it 3 overloads, but this defeats a lot of the purpose of having templates in the first place.
>
> -Steve
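
A minimal sketch of the lazy-init idea, assuming a hypothetical accessor named `ipRegex()`; this is not the actual Phobos code. The run-time regex is built on the first call instead of in a static initializer, so the pattern is not parsed at CTFE. Importing std.regex and instantiating its templates still carries a compile-time cost of its own.

```d
import std.regex : Regex, regex, matchFirst;

// Hypothetical accessor: builds the regex on first use, per thread
// (function-level statics are thread-local by default in D).
Regex!char ipRegex()
{
    static Regex!char re;
    static bool ready;
    if (!ready)
    {
        // Plain regex() call inside a function body: evaluated at run time.
        re = regex(`\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}` ~
                   `(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$`);
        ready = true;
    }
    return re;
}

unittest
{
    assert(!matchFirst("192.168.0.1", ipRegex()).empty);
    assert(matchFirst("not an address", ipRegex()).empty);
}
```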


February 06, 2018
On Tuesday, 6 February 2018 at 06:11:55 UTC, Dmitry Olshansky wrote:
> On Tuesday, 6 February 2018 at 05:45:35 UTC, Steven Schveighoffer wrote:
>> On 2/6/18 12:35 AM, Dmitry Olshansky wrote:
>>> 
>>> That’s a really bad idea: isEmail is a template, so the burden of freaking slow ctRegex
>>> is paid on a per-instantiation basis. It could be horrible with separate compilation.
>>
>> Obviously it is horrible. On my mac, it took about 2.5 seconds to compile this one line.
>>
>> I'm not sure how to fix it though... I suppose you could make
>
> Just use the run-time version, it’s not that much slower. But then again static ipRegex = regex(...) will parse and build regex at CTFE.
>
> Maybe lazy init?

FYI I've made a pull request that replaces uses of regexes in std.net.isemail. It turns out they weren't being used for anything indispensable. Import benchmark results were encouraging.

https://github.com/dlang/phobos/pull/6129
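
For illustration only, here is a rough sketch of how a dotted-quad check can be written without std.regex. The helper name `looksLikeIPv4` is hypothetical and this is not the code from the pull request; it only approximates what the IP regex quoted earlier in the thread accepts.

```d
import std.algorithm.iteration : splitter;
import std.algorithm.searching : all;
import std.ascii : isDigit;
import std.conv : to;

// Hypothetical helper: true if s is four dot-separated decimal
// octets, each in the range 0 .. 255.
bool looksLikeIPv4(const(char)[] s)
{
    size_t count;
    foreach (part; s.splitter('.'))
    {
        ++count;
        // Reject extra fields, empty fields and fields longer than 3 digits.
        if (count > 4 || part.length == 0 || part.length > 3)
            return false;
        if (!part.all!isDigit)
            return false;
        // At most 3 digits, so the conversion cannot overflow.
        if (part.to!uint > 255)
            return false;
    }
    return count == 4;
}

unittest
{
    assert(looksLikeIPv4("192.168.0.1"));
    assert(!looksLikeIPv4("256.1.1.1"));
    assert(!looksLikeIPv4("1.2.3"));
    assert(!looksLikeIPv4("a.b.c.d"));
}
```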