What happened to phobos compile time?
August 04, 2020
Hello everyone!

I just tried compiling phobos on my machine to catch up with the latest changes and I noticed an explosion in compile time. On my machine it takes roughly 5 minutes (!!!) to compile, while last year it took somewhere around 15-30 seconds. Does anyone know what has caused this serious performance regression?

Thanks for answers,
RazvanN
August 04, 2020
On Tuesday, 4 August 2020 at 03:54:53 UTC, RazvanN wrote:
> Hello everyone!
>
> I just tried compiling phobos on my machine to catch up with the latest changes and I noticed an explosion in compile time. On my machine it takes roughly 5 minutes (!!!) to compile, while last year it took somewhere around 15-30 seconds. Does anyone know what has caused this serious performance regression?
>
> Thanks for answers,
> RazvanN

Welcome to the wonderful world of DMD inliner, we hope you enjoy your stay.

```
$ make -f posix.mak -j8  88.90s user 0.89s system 99% cpu 1:30.25 total
$ git show HEAD | head -n 5
commit 2f0ea3fdedc2889b63f266de908cb8658ce98ec9
Author: Walter Bright <walter@walterbright.com>
Date:   Tue Jul 21 01:12:35 2020 -0700

    sha: inline critical functions
$ git checkout HEAD^
Previous HEAD position was 2f0ea3fde sha: inline critical functions
HEAD is now at e364edfc8 Merge pull request #7561 from WalterBright/fabs-float
$ make -f posix.mak -j8  9.42s user 0.55s system 98% cpu 10.128 total
```
August 04, 2020
On Tuesday, 4 August 2020 at 04:41:15 UTC, Mathias LANG wrote:
> On Tuesday, 4 August 2020 at 03:54:53 UTC, RazvanN wrote:
>> [...]
>
> Welcome to the wonderful world of DMD inliner, we hope you enjoy your stay.
>
> ```
> $ make -f posix.mak -j8  88.90s user 0.89s system 99% cpu 1:30.25 total
> $ git show HEAD | head -n 5
> commit 2f0ea3fdedc2889b63f266de908cb8658ce98ec9
> Author: Walter Bright <walter@walterbright.com>
> Date:   Tue Jul 21 01:12:35 2020 -0700
>
>     sha: inline critical functions
> $ git checkout HEAD^
> Previous HEAD position was 2f0ea3fde sha: inline critical functions
> HEAD is now at e364edfc8 Merge pull request #7561 from WalterBright/fabs-float
> $ make -f posix.mak -j8  9.42s user 0.55s system 98% cpu 10.128 total
> ```

I'm curious whether there's actually a provable runtime benefit; otherwise the performance regression is unacceptable.
August 04, 2020
On 8/4/20 12:41 AM, Mathias LANG wrote:
> On Tuesday, 4 August 2020 at 03:54:53 UTC, RazvanN wrote:
>> Hello everyone!
>>
>> I just tried compiling phobos on my machine to catch up with the latest changes and I noticed an explosion in compile time. On my machine it takes roughly 5 minutes (!!!) to compile, while last year it took somewhere around 15-30 seconds. Does anyone know what has caused this serious performance regression?
>>
>> Thanks for answers,
>> RazvanN
> 
> Welcome to the wonderful world of DMD inliner, we hope you enjoy your stay.
> 
> ```
> $ make -f posix.mak -j8  88.90s user 0.89s system 99% cpu 1:30.25 total
> $ git show HEAD | head -n 5
> commit 2f0ea3fdedc2889b63f266de908cb8658ce98ec9
> Author: Walter Bright <walter@walterbright.com>
> Date:   Tue Jul 21 01:12:35 2020 -0700
> 
>      sha: inline critical functions
> $ git checkout HEAD^
> Previous HEAD position was 2f0ea3fde sha: inline critical functions
> HEAD is now at e364edfc8 Merge pull request #7561 from WalterBright/fabs-float
> $ make -f posix.mak -j8  9.42s user 0.55s system 98% cpu 10.128 total
> ```

Looking at that change, a few functions were force-inlined. Most of them were trivial.

And I don't think these are ones that are used in a lot of places. Phobos is compiled all-at-once. So you can't explain the slowdown by multiple instances of compilation.

Has anyone profiled to see where the slowdown is? If I remove the pragma(inline) from the two functions T_SHA2_0_15 and T_SHA2_16_79, the compile time comes back to normal.

Looking at uses of those functions I get a total of 80 uses. Considering the compile time goes from 12 seconds on my system to 92 seconds, that's a full second to inline each call. Something doesn't add up, it can't be that bad.

-Steve
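
The per-call estimate above is simple arithmetic on the reported timings. A minimal sketch in D, using only the figures from this post (12 s before, 92 s after, 80 call sites); the helper name `extraSecondsPerCall` is made up for illustration:

```d
import std.stdio : writefln;

/// Average extra compile time attributed to each inlined call site.
/// (Hypothetical helper, not part of Phobos or DMD.)
double extraSecondsPerCall(double before, double after, size_t callSites)
{
    return (after - before) / callSites;
}

void main()
{
    // Figures from the post: 12 s before, 92 s after, 80 uses.
    immutable perCall = extraSecondsPerCall(12.0, 92.0, 80);
    writefln("%.2f s per inlined call", perCall); // (92 - 12) / 80
}
```

This only averages the slowdown over the call sites; it says nothing about where the time is actually spent.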
August 04, 2020
On Tuesday, 4 August 2020 at 05:27:36 UTC, RazvanN wrote:
>
> I'm curious whether there's actually a provable runtime benefit; otherwise the performance regression is unacceptable.

It doesn't even matter whether there's a provable runtime benefit, as no one seriously uses DMD for performance-related tasks. There was a recent internal discussion and everyone on the D dev team (except Walter) agreed that it's smarter to use an optimizer with many more stakeholders than to divert D's small development capacity into a self-maintained optimizer.

For a long time now, Team Phobos hasn't even benchmarked Phobos functions with DMD, only with LDC. In other words: Phobos no longer caters to the shortcomings of DMD's optimizer, and I don't see any reason why it should.

So this PR should have never been merged and should be reverted immediately.

August 04, 2020
On Tuesday, 4 August 2020 at 12:42:04 UTC, Steven Schveighoffer wrote:
> On 8/4/20 12:41 AM, Mathias LANG wrote:
>> On Tuesday, 4 August 2020 at 03:54:53 UTC, RazvanN wrote:
>>> Hello everyone!
>>>
>>> I just tried compiling phobos on my machine to catch up with the latest changes and I noticed an explosion in compile time. On my machine it takes roughly 5 minutes (!!!) to compile, while last year it took somewhere around 15-30 seconds. Does anyone know what has caused this serious performance regression?
>>>
>>> Thanks for answers,
>>> RazvanN
>> 
>> Welcome to the wonderful world of DMD inliner, we hope you enjoy your stay.
>> 
>> ```
>> $ make -f posix.mak -j8  88.90s user 0.89s system 99% cpu 1:30.25 total
>> $ git show HEAD | head -n 5
>> commit 2f0ea3fdedc2889b63f266de908cb8658ce98ec9
>> Author: Walter Bright <walter@walterbright.com>
>> Date:   Tue Jul 21 01:12:35 2020 -0700
>> 
>>      sha: inline critical functions
>> $ git checkout HEAD^
>> Previous HEAD position was 2f0ea3fde sha: inline critical functions
>> HEAD is now at e364edfc8 Merge pull request #7561 from WalterBright/fabs-float
>> $ make -f posix.mak -j8  9.42s user 0.55s system 98% cpu 10.128 total
>> ```
> [...]
>
> Looking at uses of those functions I get a total of 80 uses. Considering the compile time goes from 12 seconds on my system to 92 seconds, that's a full second to inline each call. Something doesn't add up, it can't be that bad.

Hmm, if those are inlined in a few places then that will bloat the code they were inlined into.

Most optimization and code-gen algorithms work on the function as a unit and have a superlinear relationship to the number of statements and expressions in that function body.

I.e. fewer functions with larger bodies can take significantly more time than more functions with smaller bodies.

At least that's the case if optimizations are enabled and you increase the size of a couple of functions by a lot.
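
To make the superlinear argument concrete, here is a toy cost model in D. The quadratic exponent and the statement counts are assumptions picked for illustration, not measurements of DMD; `passCost` and `totalCost` are hypothetical names.

```d
import std.stdio : writefln;

/// Hypothetical cost model: a pass whose work grows with the square of the
/// number of statements in one function body. The exponent is an assumption
/// for illustration, not a property measured in DMD.
ulong passCost(ulong statements)
{
    return statements * statements;
}

/// Total cost of compiling `functions` bodies of `statementsEach` statements.
ulong totalCost(ulong functions, ulong statementsEach)
{
    return functions * passCost(statementsEach);
}

void main()
{
    // The same 8000 statements overall, split two different ways.
    immutable many = totalCost(80, 100); // 80 small functions
    immutable few = totalCost(1, 8000);  // one large body after inlining
    writefln("80 x 100 statements: %s units", many);
    writefln("1 x 8000 statements: %s units", few);
    writefln("ratio: %sx", few / many);
}
```

Under this model the single large body costs 80 times as much as the 80 small ones, even though the total statement count is identical. A real compiler will have a different growth curve, but the direction of the effect is the same.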
August 04, 2020
On Tuesday, 4 August 2020 at 12:55:44 UTC, Seb wrote:

> So this PR should have never been merged and should be reverted immediately.

I assume linker problems also had something to do with it being merged?
I might be wrong though.
August 04, 2020
On 8/4/20 12:41 AM, Mathias LANG wrote:
> On Tuesday, 4 August 2020 at 03:54:53 UTC, RazvanN wrote:
>> Hello everyone!
>>
>> I just tried compiling phobos on my machine to catch up with the latest changes and I noticed an explosion in compile time. On my machine it takes roughly 5 minutes (!!!) to compile, while last year it took somewhere around 15-30 seconds. Does anyone know what has caused this serious performance regression?
>>
>> Thanks for answers,
>> RazvanN
> 
> Welcome to the wonderful world of DMD inliner, we hope you enjoy your stay.
> 
> ```
> $ make -f posix.mak -j8  88.90s user 0.89s system 99% cpu 1:30.25 total
> $ git show HEAD | head -n 5
> commit 2f0ea3fdedc2889b63f266de908cb8658ce98ec9
> Author: Walter Bright <walter@walterbright.com>
> Date:   Tue Jul 21 01:12:35 2020 -0700
> 
>      sha: inline critical functions
> $ git checkout HEAD^
> Previous HEAD position was 2f0ea3fde sha: inline critical functions
> HEAD is now at e364edfc8 Merge pull request #7561 from WalterBright/fabs-float
> $ make -f posix.mak -j8  9.42s user 0.55s system 98% cpu 10.128 total
> ```

That's a large penalty. I hope at least the debug build hasn't been affected.

I recall the change was made to get performance parity with gdc and ldc for the sha code. So I wonder (a) how the resulting performance of the sha functions compares with those, and (b) how long it takes to build phobos with gdc and ldc.
August 04, 2020
cc Walter

The functions Ch and Maj are the culprits:

https://github.com/dlang/phobos/blob/master/std/digest/sha.d#L318

Each is responsible for about half of the slowdown. If those are not inlined, the build speed returns to its previous level.

The templates are only instantiated with uint and ulong, but replacing them with explicit overloads didn't help any:

uint Maj(uint x, uint y, uint z) { return (x & y) | (z & (x ^ y)); }
uint Ch(uint x, uint y, uint z) { return z ^ (x & (y ^ z)); }
ulong Maj(ulong x, ulong y, ulong z) { return (x & y) | (z & (x ^ y)); }
ulong Ch(ulong x, ulong y, ulong z) { return z ^ (x & (y ^ z)); }

In turn, these functions are called from the inline functions T_SHA2_0_15 and T_SHA2_16_79. Alternatively, turning inlining off for T_SHA2_16_79 also brings the build speed back.

Fix: https://github.com/dlang/phobos/pull/7577
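
For readers unfamiliar with Ch and Maj, the sketch below checks the expressions quoted above against the textbook SHA-2 definitions from FIPS 180-4. The template signatures are an assumption (the post only says the templates are instantiated with uint and ulong), and `chRef` and `majRef` are names made up for this example.

```d
import std.stdio : writefln;

// Expressions as quoted in this post, written as templates (assumed signature).
T Ch(T)(T x, T y, T z) { return z ^ (x & (y ^ z)); }
T Maj(T)(T x, T y, T z) { return (x & y) | (z & (x ^ y)); }

// Textbook definitions (FIPS 180-4); hypothetical helper names.
T chRef(T)(T x, T y, T z) { return (x & y) ^ (~x & z); }
T majRef(T)(T x, T y, T z) { return (x & y) ^ (x & z) ^ (y & z); }

void main()
{
    // 0xF0, 0xCC, 0xAA put every combination of three input bits into some
    // bit position, so one comparison covers the whole truth table of a
    // purely bitwise function.
    static assert(Ch(0xF0u, 0xCCu, 0xAAu) == chRef(0xF0u, 0xCCu, 0xAAu));
    static assert(Maj(0xF0u, 0xCCu, 0xAAu) == majRef(0xF0u, 0xCCu, 0xAAu));
    static assert(Ch(0xF0UL, 0xCCUL, 0xAAUL) == chRef(0xF0UL, 0xCCUL, 0xAAUL));
    static assert(Maj(0xF0UL, 0xCCUL, 0xAAUL) == majRef(0xF0UL, 0xCCUL, 0xAAUL));

    writefln("Ch = 0x%02X, Maj = 0x%02X",
             Ch(0xF0u, 0xCCu, 0xAAu), Maj(0xF0u, 0xCCu, 0xAAu));
    // prints "Ch = 0xCA, Maj = 0xE8"
}
```

Both functions are a handful of bitwise operations, which is why their cost when force-inlined is surprising.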
August 04, 2020
On 8/4/20 9:51 AM, Stefan Koch wrote:
> On Tuesday, 4 August 2020 at 12:42:04 UTC, Steven Schveighoffer wrote:
>> Looking at uses of those functions I get a total of 80 uses. Considering the compile time goes from 12 seconds on my system to 92 seconds, that's a full second to inline each call. Something doesn't add up, it can't be that bad.
> 
> Hmm, if those are inlined in a few places then that will bloat the code they were inlined into.
> 
> Most optimization and code-gen algorithms work on the function as a unit and have a superlinear relationship to the number of statements and expressions in that function body.

I guess my question is: is it reasonable for the compiler to take an additional second per call to inline a function? Maybe it is, but I don't know that my experience with inlining matches that.

The nice thing about this change is that it's easy to test what the differences are. If you remove the pragma(inline) it's fast. So it should be possible to tell where all the extra time is going.

> I.e. fewer functions with larger bodies can take significantly more time than more functions with smaller bodies.
> 
> At least if optimizations are enabled.

I don't know if I've ever seen an optimization cause a 1 second increase to compile a function. But maybe I'm wrong.

-Steve