Profile-Guided Optimization (PGO) support in D ecosystem
November 10

Hi!

I am investigating the Profile-Guided Optimization (PGO) state across the ecosystem - all my current results (with a lot of benchmarks, PGO-related information, and much more) are available at https://github.com/zamazan4ik/awesome-pgo . I am interested in PGO state in the D ecosystem too - that's why I am here.

I have done a little research on PGO in D, but compared to C++ almost no information is available in the official documentation (or I just don't know where to look for it, hah). I have the following questions (for each D compiler: DMD, GDC, LDC; I am interested in all of them):

  1. What is the most up-to-date place for PGO documentation? Right now I found only this for LDC. What about DMD and GDC?
  2. Does any D compiler support Sampling PGO (also known as AutoFDO)? If Sampling PGO is not supported - do you plan to support it in the future? For us sampling PGO can be important since it's much easier to use for gathering the PGO profiles directly from the production environment without hurting the production performance a lot.
  3. Do you support other PGO modes like CSIR PGO in D compilers? If not, do you plan to support them in the future?
  4. What performance improvements did you get from enabling LTO + PGO on D compilers? Could you please share the numbers for each compiler? With this information it's much easier to consider rebuilding a D compiler locally with PGO (due to strict security requirements), since we can estimate the benefits of PGO for the D compiler based on actual benchmarks from the compiler developers.
  5. Is there any documentation on how to build DMD and GDC with LTO+PGO? I am looking for something like what is done in the ClickHouse documentation (or for Clang or Rustc).
  6. Am I right that the officially released D compiler binaries are already LTO + PGO optimized? According to the script it's true at least for LDC. What about other compilers?

Similar questions about LDC in the upstream: https://github.com/ldc-developers/ldc/discussions/4524

Thanks a lot for the help!

November 10

On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:

>

Hi!

Similar questions about LDC in the upstream: https://github.com/ldc-developers/ldc/discussions/4524

Thanks a lot for the help!

Hi,

Some internal details about PGO are in this article: https://archive.fosdem.org/2017/schedule/event/ldc_d_optimization/attachments/slides/1819/export/events/attachments/ldc_d_optimization/slides/1819/FOSDEM_2017.pdf

and blog posts from LDC dev:
http://johanengelen.github.io/ldc/2016/11/10/Link-Time-Optimization-LDC.html
http://johanengelen.github.io/ldc/2016/04/13/PGO-in-LDC-virtual-calls.html

For an example of PGO + LTO usage and the resulting performance improvements, I think one of the most helpful sources is this repo:

https://github.com/eBay/tsv-utils/blob/master/docs/BuildingWithLTO.md
https://github.com/eBay/tsv-utils/blob/master/docs/lto-pgo-study.md
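
For orientation, the instrumented PGO + LTO workflow with LDC can be sketched roughly as below. This is only a minimal sketch, assuming LDC's `-fprofile-instr-generate` / `-fprofile-instr-use` flags and the `ldc-profdata` tool shipped with LDC. The file names (`app.d`, `app.profraw`, `app.profdata`) and the `run` helper are hypothetical. By default the script only prints the commands; set `RUN=1` to actually execute them.

```shell
#!/bin/sh
# Sketch of an LDC PGO + LTO build. All file names are hypothetical placeholders.
SRC=app.d
RAW=app.profraw
PROF=app.profdata

# Print each command; execute it as well only when RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

ldc_pgo_build() {
    # 1. Build an instrumented binary that writes raw profile data on exit.
    run ldc2 -O3 -fprofile-instr-generate="$RAW" -of=app_instrumented "$SRC" &&
    # 2. Exercise it with a representative workload (assumed: no arguments).
    run ./app_instrumented &&
    # 3. Convert the raw profile into the indexed format the compiler reads.
    run ldc-profdata merge -output="$PROF" "$RAW" &&
    # 4. Rebuild using the profile, with full LTO on top.
    run ldc2 -O3 -flto=full -fprofile-instr-use="$PROF" -of=app_pgo "$SRC"
}

ldc_pgo_build
```

How much this gains depends heavily on how representative the training run in step 2 is; the tsv-utils study linked above has actual measurements.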

November 11

On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:

>

Hi!

I am investigating the Profile-Guided Optimization (PGO) state across the ecosystem - all my current results (with a lot of benchmarks, PGO-related information, and much more) are available at https://github.com/zamazan4ik/awesome-pgo . I am interested in PGO state in the D ecosystem too - that's why I am here.

I have done a little research on PGO in D, but compared to C++ almost no information is available in the official documentation (or I just don't know where to look for it, hah). I have the following questions (for each D compiler: DMD, GDC, LDC; I am interested in all of them):

Thanks a lot for the help!

I have only used PGO with LDC, if I remember correctly I posted something about it in the forums. Let me see if I can find it.

I think it was this:
https://forum.dlang.org/post/ajorqeooyccwuwpvteue@forum.dlang.org

November 11

On Friday, 10 November 2023 at 13:09:01 UTC, Sergey wrote:

>

On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:

>

Hi!

Similar questions about LDC in the upstream: https://github.com/ldc-developers/ldc/discussions/4524

Thanks a lot for the help!

Hi,

Some internal details about PGO are in this article: https://archive.fosdem.org/2017/schedule/event/ldc_d_optimization/attachments/slides/1819/export/events/attachments/ldc_d_optimization/slides/1819/FOSDEM_2017.pdf

and blog posts from LDC dev:
http://johanengelen.github.io/ldc/2016/11/10/Link-Time-Optimization-LDC.html
http://johanengelen.github.io/ldc/2016/04/13/PGO-in-LDC-virtual-calls.html

For an example of PGO + LTO usage and the resulting performance improvements, I think one of the most helpful sources is this repo:

https://github.com/eBay/tsv-utils/blob/master/docs/BuildingWithLTO.md
https://github.com/eBay/tsv-utils/blob/master/docs/lto-pgo-study.md

This list is a fantastic "overview" Sergey, thanks!
Please add this to the discussion https://github.com/ldc-developers/ldc/discussions/4524, so it does not get lost as easily.

Cheers,
Johan

November 12

On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:

>

Hi!

IIRC, Jon wrote a bit about LTO and PGO (with benchmarks somewhere) for tsv-utils.

https://github.com/eBay/tsv-utils/

>
  1. What is the most up-to-date place for PGO documentation? Right now I found only this for LDC. What about DMD and GDC?

For GDC, any GCC documentation or how-to on using -fprofile-generate= and -fprofile-use= applies.
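
As a rough illustration of those two flags with gdc, here is a minimal sketch. The source file `app.d`, the profile directory `pgo-data` and the `run` helper are hypothetical. The script only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch of GCC-style instrumented PGO with gdc. Names are hypothetical.
SRC=app.d
PROFILE_DIR=pgo-data

# Print each command; execute it as well only when RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

gdc_pgo_build() {
    # 1. Instrumented build: counters are dumped into $PROFILE_DIR at exit.
    run gdc -O2 -fprofile-generate="$PROFILE_DIR" -o app "$SRC" &&
    # 2. Training run with a representative workload (assumed: no arguments).
    run ./app &&
    # 3. Optimized rebuild that reads the collected .gcda files.
    #    The output name is kept identical on purpose: as far as I know,
    #    GCC derives the profile file name from the output/object name.
    run gdc -O2 -fprofile-use="$PROFILE_DIR" -o app "$SRC"
}

gdc_pgo_build
```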

>
  5. Is there any documentation on how to build DMD and GDC with LTO+PGO? I am looking for something like what is done in the ClickHouse documentation (or for Clang or Rustc).

LTO and PGO aren't features of the language, but rather of the compiler infrastructure.

November 12

On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:

>

Hi!

I am investigating the Profile-Guided Optimization (PGO) state across the ecosystem - all my current results (with a lot of benchmarks, PGO-related information, and much more) are available at https://github.com/zamazan4ik/awesome-pgo . I am interested in PGO state in the D ecosystem too - that's why I am here.

[...]

If you search for PGO in the dmd repo you will find that I implemented a pgo build for the compiler a while ago. I'm not sure if it's enabled for releases but we use it internally at Symmetry IIRC.

One thing of note is that GCC has a feature called AutoFDO, which is quite interesting. I think LLVM might have a similar concept, but I'm not sure. LLVM also has a tool called BOLT, which does the same thing, only after compilation.
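
To give an idea of what an AutoFDO-style (sampling) workflow looks like on the GCC side, here is a rough sketch. It rests on several assumptions: that gdc accepts GCC's `-fauto-profile=` option like the other GCC front ends do, that `perf` can record branch stacks on the machine (`-b` needs hardware support), and that the `create_gcov` tool from the AutoFDO project is installed. The file names and the `run` helper are hypothetical, and the commands are only printed unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch of a sampling-PGO (AutoFDO) flow with GCC tooling.
# Untested with gdc; treat every step as an assumption to verify.
SRC=app.d

# Print each command; execute it as well only when RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

afdo_build() {
    # 1. Normal optimized build with debug info, no instrumentation.
    run gdc -O2 -g -o app "$SRC" &&
    # 2. Sample the running program, recording branch stacks.
    run perf record -b -o perf.data -- ./app &&
    # 3. Convert the perf samples into a profile GCC can read.
    run create_gcov --binary=./app --profile=perf.data --gcov=app.afdo &&
    # 4. Rebuild, letting the compiler use the sampled profile.
    run gdc -O2 -fauto-profile=app.afdo -o app_afdo "$SRC"
}

afdo_build
```

The appeal compared to instrumented PGO is that step 2 can run against an ordinary production binary, since no instrumented build is needed.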

December 18

On Sunday, 12 November 2023 at 18:03:19 UTC, max haughton wrote:

>

If you search for PGO in the dmd repo you will find that I implemented a pgo build for the compiler a while ago.

The PGO code is there, but it seems to fail at compiling the "dshell_prebuilt.d" file and, because of this, aborts collecting the profiling data prematurely:

[...]
Built dmd with PGO instrumentation
Compiling dmd testsuite to generate PGO data
Executing: ldmd2 -m64 -of/tmp/dmd/compiler/test/test_results/d_do_test /tmp/dmd/compiler/test/tools/d_do_test.d -fPIC -I/tmp/dmd/compiler/test/tools -i -version=NoMain
Executing: ldmd2 -m64 -of/tmp/dmd/compiler/test/test_results/unit_test_runner /tmp/dmd/compiler/test/tools/unit_test_runner.d -fPIC /tmp/dmd/compiler/test/tools/paths
Executing: /tmp/dmd/generated/linux/release/64/dmd -conf= -m64 -of/tmp/dmd/compiler/test/test_results/dshell_prebuilt.o -c /tmp/dmd/compiler/test/tools/dshell_prebuilt/dshell_prebuilt.d -fPIC
Executing: ldmd2 -m64 -of/tmp/dmd/compiler/test/test_results/sanitize_json /tmp/dmd/compiler/test/tools/sanitize_json.d -fPIC
/tmp/dmd/compiler/test/tools/dshell_prebuilt/dshell_prebuilt.d(6): Error: unable to read module `stdlib`
/tmp/dmd/compiler/test/tools/dshell_prebuilt/dshell_prebuilt.d(6):        Expected 'core/stdc/stdlib.d' or 'core/stdc/stdlib/package.d' in one of the following import paths:
import path[0] = /tmp/dmd/compiler/test/../../druntime/import
import path[1] = /tmp/dmd/compiler/test/../../../phobos
failed to build '/tmp/dmd/compiler/test/test_results/dshell_prebuilt.o'
dmd tests failed! This will not end the PGO build because some data may have been gathered
Merging PGO data
[...]

The compiler is still built successfully, but its performance is not optimal this way. After I patched "dshell_prebuilt.d" to strip everything out of it, the error disappeared, the whole DMD test suite appears to compile, and the compiler gets a noticeable performance boost.

>

I'm not sure if it's enabled for releases

This doesn't seem to be the case at least for the https://downloads.dlang.org/releases/2.x/2.106.0/dmd.2.106.0.linux.tar.xz tarball. LTO is enabled, but apparently without PGO.

>

but we use it internally at Symmetry IIRC.

It's good to know that DMD with LTO+PGO is already successfully used in production at Symmetry. Would it make sense to also enable this optimization for everyone else?

December 18

On Monday, 18 December 2023 at 01:57:17 UTC, Siarhei Siamashka wrote:

>

On Sunday, 12 November 2023 at 18:03:19 UTC, max haughton wrote:

>

[...]

The PGO code is there, but it seems to fail at compiling the "dshell_prebuilt.d" file and, because of this, aborts collecting the profiling data prematurely:

[...]

Does druntime need building?

December 18

On Monday, 18 December 2023 at 02:30:21 UTC, max haughton wrote:

>

On Monday, 18 December 2023 at 01:57:17 UTC, Siarhei Siamashka wrote:

>

On Sunday, 12 November 2023 at 18:03:19 UTC, max haughton wrote:

>

[...]

The PGO code is there, but it seems to fail at compiling the "dshell_prebuilt.d" file and, because of this, aborts collecting the profiling data prematurely:

[...]

Does druntime need building?

That's a good question. As the author of this code, you probably have a much better idea about how it's supposed to work. I tried to come up with some scriptable step by step build instructions:

DMD_TAG=v2.106.0
LDMD=ldmd2-1.32.0

git clone --depth 1 --branch "${DMD_TAG}" https://github.com/dlang/dmd.git || exit 1
git clone --depth 1 --branch "${DMD_TAG}" https://github.com/dlang/phobos.git || exit 1

cd dmd

make -j4 -f posix.mak HOST_DMD=$LDMD ENABLE_RELEASE=1 ENABLE_LTO=1 || exit 1

cd ../phobos

make -j4 -f posix.mak || exit 1

cd ../dmd

cp generated/linux/release/64/dmd "../dmd_${DMD_TAG}_lto"

rm -rf generated

rdmd compiler/src/build.d OS="linux" BUILD="release" MODEL="64" HOST_DMD="$LDMD" CXX="c++" AUTO_BOOTSTRAP="" DOCDIR="" STDDOC="" DOC_OUTPUT_DIR="" MAKE="make" VERBOSE="" ENABLE_RELEASE="1" ENABLE_DEBUG="" ENABLE_ASSERTS="" ENABLE_LTO="1" ENABLE_UNITTEST="" ENABLE_PROFILE="" ENABLE_COVERAGE="" DFLAGS="" dmd-pgo || exit 1

cp generated/linux/release/64/dmd "../dmd_${DMD_TAG}_lto+pgo"

cd ..

ls -l dmd_*

Running it results in the following:

-rwxr-xr-x 1 ssvb ssvb 7626560 Dec 18 13:30 dmd_v2.106.0_lto
-rwxr-xr-x 1 ssvb ssvb 7994232 Dec 18 13:38 dmd_v2.106.0_lto+pgo

The dmd_v2.106.0_lto file roughly matches the size and performance characteristics of the dmd executable from the https://downloads.dlang.org/releases/2.x/2.106.0/dmd.2.106.0.linux.tar.xz release tarball and dmd_v2.106.0_lto+pgo is its faster PGO-enabled upgrade. I can observe at least 10% compilation time reduction when using the PGO-enabled dmd.
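
A comparison like this can be made with a couple of small helpers. The following is only a sketch of one way to measure it, not the exact method used for the number above. It assumes GNU `date` with nanosecond support (`%N`), as found on a typical Linux system. The function names and `hello.d` are hypothetical.

```shell
#!/bin/sh
# Sketch: compare how long two dmd binaries take on the same input.
# Assumes GNU date (nanosecond %N), as on a typical Linux system.

# Print the wall-clock time of a command in milliseconds.
elapsed_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Print the time reduction of NEW ($2) relative to BASE ($1) in whole percent.
percent_reduction() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Hypothetical usage with the two binaries produced by the script above
# and some D source file hello.d:
#   base=$(elapsed_ms ./dmd_v2.106.0_lto -c hello.d)
#   new=$(elapsed_ms ./dmd_v2.106.0_lto+pgo -c hello.d)
#   echo "PGO build is $(percent_reduction "$base" "$new")% faster"
```

A single run is noisy, so repeating the measurement several times and compiling something larger than a hello-world program gives a more trustworthy figure.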

It's rather messy, but it works. There are many open questions, though. For example, should the "dmd-pgo" target be accessible from the makefile without invoking "rdmd compiler/src/build.d" directly? Is it actually okay for the non-PGO and PGO binaries to share the same "generated/linux/release" directory? Is the dmd testsuite a good training set, or would collecting profiling data during Phobos compilation be better?

The "dshell_prebuilt.d" glitch happens if the PGO-enabled DMD is built before Phobos and druntime, which makes everything fragile and non-intuitive. So the LTO-enabled DMD needs to be built first, then it has to be used to compile druntime, and finally the "generated/linux/release" directory has to be erased before the PGO build is started so that the two builds don't clash.

Either way, providing faster PGO-enabled binary releases of DMD would make it more competitive in the compilation speed race against LDC: https://forum.dlang.org/post/pugqkvthbicqaigemijj@forum.dlang.org :-)

December 18

On Monday, 18 December 2023 at 01:57:17 UTC, Siarhei Siamashka wrote:

>

This doesn't seem to be the case at least for the https://downloads.dlang.org/releases/2.x/2.106.0/dmd.2.106.0.linux.tar.xz tarball. LTO is enabled, but apparently without PGO.

I formally submitted an issue about this at https://issues.dlang.org/show_bug.cgi?id=24287 so that the https://github.com/dlang/installer maintainers can hopefully take some action to improve the current situation.
