December 08, 2015 Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Hi all,
I have been working on getting rudimentary PGO going in LDC. It's pretty much ready! [1] (It does not work on Windows yet... I have to fix LLVM's compiler-rt code.)
I've implemented something very similar to Clang: LDC uses profile information (generated by an instrumented executable built by LDC) to tag each branch in the code with branch weights. The actual optimizations are done by LLVM; at the moment LDC only adds metadata to the IR.
At this point, I want your input: command-line option naming, ease of use (llvm-profdata is needed...), whether you get substantial performance boosts, whether profile data file writing should be included in the runtime library or live in a separate lib, bugs, uninstrumented branches/switches, etc. All comments are welcome (please be kind ;-).
Before I announce it in the "Announce" forum, I want to hear your thoughts first.
Thanks!
Johan
[1] http://wiki.dlang.org/LDC_LLVM_profiling_instrumentation#Profile-Guided_Optimization_.28PGO.29_status_in_LDC
|
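To make the mechanism in the post above concrete, here is a minimal hand-written sketch in Python. It is not LDC's implementation: the function `classify`, the counter names, and the training inputs are all made up for illustration. It only shows the idea that an instrumented run counts how often each side of a branch executes, and that those counts become the pair of weights attached to the branch (in LLVM IR this takes the form of `branch_weights` metadata).

```python
from collections import Counter


def classify(x, counters):
    """Toy function with a single branch (hypothetical example code).

    The counter increments stand in for what an instrumented build
    would insert automatically.
    """
    if x < 0:
        counters["neg"] += 1
        return "negative"
    counters["nonneg"] += 1
    return "non-negative"


def collect_profile(training_inputs):
    """Profiling run: execute the instrumented code on representative inputs."""
    counters = Counter()
    for x in training_inputs:
        classify(x, counters)
    return counters


def branch_weights(counters):
    """Turn raw counts into the (taken, not taken) pair for the branch."""
    return counters["neg"], counters["nonneg"]


profile = collect_profile([5, 3, -1, 7, 9, 2, 4, 8])
weights = branch_weights(profile)
print("branch weights (taken, not taken):", weights)
```

With weights like these, an optimizer can treat the `x < 0` path as cold, for example when laying out code. Which optimizations actually benefit is up to LLVM, as the post says.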
December 08, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | Hi Johan,
On 8 Dec 2015, at 20:13, Johan Engelen via digitalmars-d-ldc wrote:
> I've implemented something very similar to Clang: LDC uses profile information (generated by an instrumented executable built by LDC)
Did you also try using it with sample profiles acquired by an external profiler yet, as described in the Clang page on PGO?
— David
|
December 08, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On Tuesday, 8 December 2015 at 19:13:41 UTC, Johan Engelen wrote:
> (does not work on Windows yet... I have to fix LLVM's compiler-rt code)
I fixed a nasty [*] bug in compiler-rt's profile writing code, and now it also works on Windows. (The IR tests fail on Windows because running a compiled executable from LIT fails for some reason on Windows.)
[*] https://stackoverflow.com/questions/5537066/strange-0x0d-being-added-to-my-binary-file
Now I know what to look for first if I see 0x0D's in my files...
|
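The linked Stack Overflow question is about binary data being written through a file opened in text mode on Windows, where every 0x0A byte is expanded to 0x0D 0x0A. The sketch below reproduces that effect in Python on any platform by forcing the Windows-style newline translation. It is an illustration of this class of bug only; compiler-rt is written in C, and the helper names and payload here are made up.

```python
import os
import tempfile

# Binary payload that happens to contain the byte 0x0A (line feed).
payload = bytes([0x01, 0x0A, 0x02, 0x0A, 0x03])


def write_binary(path, data):
    """Correct: binary mode writes the bytes unchanged."""
    with open(path, "wb") as f:
        f.write(data)


def write_text_windows_style(path, data):
    """Buggy: text mode with "\\n" -> "\\r\\n" translation, as on Windows.

    newline="\\r\\n" forces the translation on every platform, and
    latin-1 maps each byte to exactly one character and back.
    """
    with open(path, "w", encoding="latin-1", newline="\r\n") as f:
        f.write(data.decode("latin-1"))


def read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.bin")
    bad_path = os.path.join(tmp, "bad.bin")
    write_binary(good_path, payload)
    write_text_windows_style(bad_path, payload)
    good_bytes = read_bytes(good_path)
    bad_bytes = read_bytes(bad_path)

print("binary mode:", good_bytes.hex(" "))
print("text mode:  ", bad_bytes.hex(" "))
```

The text-mode file ends up two bytes longer than the payload, which is enough to corrupt any format with fixed offsets or length fields.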
December 08, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Tuesday, 8 December 2015 at 20:08:15 UTC, David Nadlinger wrote:
>
> Did you also try using it with sample profiles acquired by an external profiler yet, as described in the Clang page on PGO?
Hi David,
No, I have not looked at that yet.
Thanks a lot for the testcase you posted on GitHub. Will sink my teeth into fixing that first.
|
December 08, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On 8 Dec 2015, at 23:35, Johan Engelen via digitalmars-d-ldc wrote:
> Thanks a lot for the testcase you posted on GitHub. Will sink my teeth into fixing that first.
You're welcome – I hope it's enough information to reproduce it, but I don't have a debug build of LLVM on this machine right now.
— David
|
December 10, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On Tuesday, 8 December 2015 at 22:35:11 UTC, Johan Engelen wrote:
> Thanks a lot for the testcase you posted on GitHub. Will sink my teeth into fixing that first.
Speaking of test cases: This might be an obvious and/or stupid suggestion, but did you try building the Phobos unit tests (and maybe also dmd-testsuite/runnable) with PGO? I'd suspect it would give you quite a broad coverage of basic language constructs.
- David
|
December 10, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Tuesday, 8 December 2015 at 20:08:15 UTC, David Nadlinger wrote:
> Did you also try using it with sample profiles acquired by an external profiler yet, as described in the Clang page on PGO?
Let me add that this would probably be something nice to have for the initial release, as users could fall back to using perf, etc. if the instrumentation part is still buggy or incomplete for their code.
- David
|
December 10, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | Also, for use cases like ours, where the system runs for extended periods of time and optimizing the init time (which may be minutes) is not interesting at all, just being able to run perf while the system is doing the work we actually want to improve is a big plus.
Liran
> On Dec 10, 2015, at 16:30, David Nadlinger via digitalmars-d-ldc <digitalmars-d-ldc@puremagic.com> wrote:
>
> On Tuesday, 8 December 2015 at 20:08:15 UTC, David Nadlinger wrote:
>> Did you also try using it with sample profiles acquired by an external profiler yet, as described in the Clang page on PGO?
>
> Let me add that this would probably be something nice to have for the initial release, as users could fall back to using perf, etc. if the instrumentation part is still buggy or incomplete for their code.
>
> - David
|
December 10, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Nadlinger | On Thursday, 10 December 2015 at 14:27:59 UTC, David Nadlinger wrote:
>
> Speaking of test cases: This might be an obvious and/or stupid suggestion, but did you try building the Phobos unit tests (and maybe also dmd-testsuite/runnable) with PGO? I'd suspect it would give you quite a broad coverage of basic language constructs.
Nope didn't do that yet :S :S Looks like it is needed to iron out some remaining bugs.
I underestimated the complexity of D's AST (some objects are placed in multiple locations in the AST?), which gave rise to an assertion failure in your testcase; plus I forgot to add throw statements to the AST walker, leading to another assertion failure. Those issues have been fixed now, and now it breaks with the same error you found. It is confusing because I did not (mean to) change any of the codegen, other than adding counter increment instructions and branch instruction metadata (both trivial additions). But I did have to add extra basic blocks for switch statements... perhaps I can search there first.
Hope to have a resolution for your test case quickly.
I also have not tested at all how this works with multiple object files linked together, or other possibly more complicated things. I thought a fun testcase would be to compile DDMD with PGO enabled, compile itself as a profiling run, rebuild with PGO and test if compiling, say, Phobos is quicker/slower.
I am very curious to see what constructs will see a significant performance boost, if any at all.
|
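The two AST pitfalls described in the post above (a node reachable from more than one place, and a statement kind the walker does not know about) can be illustrated with a small sketch. Everything here is hypothetical: the `Node` class, the node kinds, and `CounterAssigner` are invented for illustration and do not reflect LDC's or DMD's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based hashing, so nodes can go in sets/dicts
class Node:
    kind: str
    children: list = field(default_factory=list)


class CounterAssigner:
    """Walks a toy AST and gives each counted construct one counter index."""

    COUNTED = {"function", "if", "loop", "switch_case", "throw"}
    PLAIN = {"block", "expr", "return"}  # traversed, but no counter of their own

    def __init__(self):
        self.visited = set()
        self.counter_of = {}  # node -> counter index

    def visit(self, node):
        # Pitfall 1: the same node may be reachable from several parents,
        # so remember what was already visited instead of assuming a tree.
        if node in self.visited:
            return
        self.visited.add(node)

        if node.kind in self.COUNTED:
            self.counter_of[node] = len(self.counter_of)
        elif node.kind not in self.PLAIN:
            # Pitfall 2: fail loudly on a construct the walker forgot about.
            raise NotImplementedError(f"no handler for node kind {node.kind!r}")

        for child in node.children:
            self.visit(child)


shared = Node("expr")  # deliberately referenced from two places
tree = Node("function", [
    Node("if", [shared, Node("block", [Node("throw")])]),
    Node("return", [shared]),
])

assigner = CounterAssigner()
assigner.visit(tree)
print("counters assigned:", len(assigner.counter_of))
```

Raising on unknown node kinds mirrors the role of the assertions mentioned in the post: a missed construct shows up immediately during instrumentation rather than as a silently wrong profile.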
December 10, 2015 Re: Profile-guided optimization (PGO) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Johan Engelen | On Tuesday, 8 December 2015 at 19:13:41 UTC, Johan Engelen wrote:
>
> Before I announce it in the "Announce" forum, I want to hear your thoughts first.
Clearly I was too optimistic about the quality of my work so far, hehe.
|