Thread overview
LDC with Profile-Guided Optimization (PGO)
Dec 15, 2015
Johan Engelen
Dec 22, 2015
Johan Engelen
Dec 22, 2015
CraigDillabaugh
Dec 22, 2015
Mattcoder
Dec 22, 2015
Mattcoder
Jan 05, 2016
Johan Engelen
Jan 01, 2016
Etienne Cimon
Jan 06, 2016
welkam
December 15, 2015
Hi all,
  I have been working on adding profile-guided optimization (PGO) to LDC [1][2][3].
At this point, I'd like to hear your input and hope you can help with testing!

Unfortunately, to try it out, you will need to build LDC with LLVM3.7 yourself. PGO should work on OS X, Linux, and Windows.

A first implementation is mostly complete now: it can generate an executable that will output profile data, and it can use that profile data during a second compilation pass to tell LLVM about branch frequencies. LDC itself does not perform any profile-guided optimizations (yet); that is left to LLVM.

It works like PGO with Clang, using the -fprofile-instr-generate and -fprofile-instr-use command-line options [4]:
> ldc2 -fprofile-instr-generate=test.profraw -run test.d
> llvm-profdata merge test.profraw -output test.profdata
> ldc2 -fprofile-instr-use=test.profdata test.d -of=test
You should now have the executable "test" with an amazing performance boost ;-)

You can inspect the generated code using LDC's -output-ll switch. Functions should be annotated with call frequencies, and most branches should be annotated with branch_weights metadata. For example:
> define void @for_loop() #0 !prof !12
> ...
> !12 = !{!"function_entry_count", i64 234}
for "void for_loop()" that is called 234 times, and
> br i1 %3, label %if, label %else, !prof !17
> ...
> !17 = !{!"branch_weights", i32 5, i32 3}
for "if (condition) {...} else {...}"
The branch_weights have an offset of 1, so the above means that the condition was true 4 times and false 2 times. If a piece of code is never executed, no metadata is added (i.e. you won't see {!"branch_weights", i32 1, i32 1}). Some branches are intentionally not instrumented or annotated if they lead to terminating code (e.g. array bounds checks and the auto-generated null-pointer checks on "this" at class method entry).
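
To make the offset-by-one rule concrete, here is a minimal shell sketch that prints the execution counts encoded in branch_weights lines of a file produced with -output-ll. The function name decode_branch_weights is hypothetical (it is not part of LDC or LLVM), and the script assumes the metadata has exactly the textual form shown above.

```shell
#!/bin/sh
# Hypothetical helper, not part of LDC or LLVM.
# Reads LLVM IR text from the given files (or stdin) and, for every
# branch_weights metadata line, prints the metadata id followed by the
# execution counts, assuming each weight is "count + 1" as described above.
decode_branch_weights() {
    awk '/branch_weights/ {
        out = $1 ":"                      # metadata id, e.g. !17
        for (i = 1; i < NF; i++) {
            if ($i == "i32") {            # the field after "i32" is a weight
                w = $(i + 1)
                gsub(/[^0-9]/, "", w)     # strip trailing "," or "}"
                out = out " " (w - 1)     # undo the offset of 1
            }
        }
        print out
    }' "$@"
}

# Usage with the metadata line from the example above:
printf '%s\n' '!17 = !{!"branch_weights", i32 5, i32 3}' | decode_branch_weights
# prints: !17: 4 2
```

For a real build you would pass the generated file instead, e.g. decode_branch_weights test.ll. Lines without branch_weights (such as function_entry_count) are ignored.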

I hope you will be able to test and comment on the work. I am very interested in hearing about performance gains (or losses, or no change) for your programs, and I am curious to learn for what kinds of code it makes a difference in practice.

Thanks!
  Johan

(future work will probably include coverage analysis (llvm-cov) and support for sampling-based profiles, which should fit naturally with the current implementation)

[1] http://wiki.dlang.org/LDC_LLVM_profiling_instrumentation
[2] https://github.com/JohanEngelen/ldc/tree/pgo  (warning: I will rebase soon)
[3] https://github.com/ldc-developers/ldc/pull/1219
[4] http://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation

December 22, 2015
On Tuesday, 15 December 2015 at 23:05:38 UTC, Johan Engelen wrote:
> Hi all,
>   I have been working on adding profile-guided optimization (PGO) to LDC [1][2][3].
> At this point, I'd like to hear your input and hope you can help with testing!
>
> Unfortunately, to try it out, you will need to build LDC with LLVM3.7 yourself. PGO should work on OS X, Linux, and Windows.

Would it help if binaries are available?
Or is general interest low?

-Johan
December 22, 2015
On Tuesday, 22 December 2015 at 14:49:51 UTC, Johan Engelen wrote:
> On Tuesday, 15 December 2015 at 23:05:38 UTC, Johan Engelen wrote:
>> Hi all,
>>   I have been working on adding profile-guided optimization (PGO) to LDC [1][2][3].
>> At this point, I'd like to hear your input and hope you can help with testing!
>>
>> Unfortunately, to try it out, you will need to build LDC with LLVM3.7 yourself. PGO should work on OS X, Linux, and Windows.
>
> Would it help if binaries are available?
> Or is general interest low?
>
> -Johan

Maybe many folks reading your post don't know what PGO is? Perhaps you need to do a bit of a sales job to convince folks that it is pretty cool, and worth trying out.

Craig
December 22, 2015
On Tuesday, 22 December 2015 at 17:09:21 UTC, CraigDillabaugh wrote:
> Maybe many folks reading your post don't know what PGO is?

https://en.m.wikipedia.org/wiki/Profile-guided_optimization

Matt.
December 22, 2015
On Tuesday, 22 December 2015 at 14:49:51 UTC, Johan Engelen wrote:
> Would it help if binaries are available?

Definitely!

Matt.
January 01, 2016
On Tuesday, 22 December 2015 at 14:49:51 UTC, Johan Engelen wrote:
> On Tuesday, 15 December 2015 at 23:05:38 UTC, Johan Engelen wrote:
>> Hi all,
>>   I have been working on adding profile-guided optimization (PGO) to LDC [1][2][3].
>> At this point, I'd like to hear your input and hope you can help with testing!
>>
>> Unfortunately, to try it out, you will need to build LDC with LLVM3.7 yourself. PGO should work on OS X, Linux, and Windows.
>
> Would it help if binaries are available?
> Or is general interest low?
>
> -Johan

Sorry, I don't read the forums often. This is definitely going to be a game changer for me. I need PGO to help with Botan performance issues, and I'm going to be developing an embedded server on Intel Edison with vibe.d/botan/http2 soon, so this compiler could make quite the difference. I'll be testing it once I get my prototype!
January 05, 2016
On Tuesday, 22 December 2015 at 17:36:30 UTC, Mattcoder wrote:
> On Tuesday, 22 December 2015 at 14:49:51 UTC, Johan Engelen wrote:
>> Would it help if binaries are available?
>
> Definitely!

Sorry for the false hope :(. I wouldn't even know where to upload them...

I hope to have PGO in LDC master soon. It's working and just needs some code cleanup. Once it is merged into master, you can play with it in nightly builds. For Windows, successful builds of master with LLVM3.7 (needed for PGO) are uploaded to GitHub: http://wiki.dlang.org/Latest_pre-built_LDC_for_Win64


January 06, 2016
On Tuesday, 22 December 2015 at 14:49:51 UTC, Johan Engelen wrote:
> Would it help if binaries are available?
> Or is general interest low?
>
> -Johan

Reducing steps will always help. I am interested, but not to the point of figuring out how to compile the newest LDC myself.