September 09, 2013 On inlining in D libraries
While investigating std.regex performance in Phobos I've found that a lot of stuff never gets inlined (contrary to my expectations). Namely, the 3 critical ones were declared like this:
    struct Bytecode{
        uint raw;
        //bit twiddling helpers
        @property uint data() const { return raw & 0x003f_ffff; }
        //ditto
        @property uint sequence() const { return 2 + (raw >> 22 & 0x3); }
        //ditto
        @property IR code() const { return cast(IR)(raw>>24); }
        ...
    }
And my quick hack to get them inlined - 0-arg templates:
https://github.com/D-Programming-Language/phobos/pull/1553
The "stuff" in question turns out to be anything that is not a template and (consequently) is compiled into the library.
At first I thought it was a horrible bug somewhere in DMD's inliner, but this behavior is the same regardless of compiler. (It could be a bug of the front-end in general.)
A few days after filing the bug report with a minimal test case:
http://d.puremagic.com/issues/show_bug.cgi?id=10985
I'm no longer so sure that this isn't an issue of separate compilation to begin with.
I *thought* that the intended behavior is:
a) Have source - compile from source
b) Don't have source (*.di files) - link in objects
But I don't have much to go on here. Somebody from the compiler team could probably shed some light on this.
If I'm wrong then 0-arg templates are actually the only way to get the equivalent of C++'s 'explicitly inline'. In C++ that would look like this:
    //header
    struct A{ int foo(); };
    //source
    int A::foo(){ ... }
C++ explicitly inlined:
    //header
    struct A{ int foo(){ ... } };
In D we don't have this distinction. It has to be decided then whether we adopt 0-arg templates as the intended solution, or tweak the front-end to always peruse accessible source when inlining.
Anyhow, as it stands you have one of the following:
a) Do nothing. Then using e.g. isAlpha from std.ascii (or pick your favorite one-liner) is pointless performance-wise, as it can never match a hand-rolled version (that could be 1:1 the same) because the latter will be inlined.
b) Pass all of the interesting files from Phobos on the command line to get them fully scanned for inlining (and get compiled anew each time I guess).
c) For code under your control - add an empty pair of brackets to anything that has to be inlined.
None of the above options is nice.
--
Dmitry Olshansky
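For readers who want to see the 0-arg template workaround from the post above in isolation, here is a minimal sketch. It assumes a stand-in `IR` enum with a single made-up member (the real enum lives in std.regex and is not shown in the thread) and an arbitrary value for `raw`; only the extra `()` on each accessor comes from the post.

```d
// Hypothetical stand-in for std.regex's IR enum; the member is made up.
enum IR : uint { Char = 0x12 }

struct Bytecode
{
    uint raw;

    // The empty first parameter list turns each accessor into a template,
    // so its body is compiled in every module that uses it rather than
    // being available only as object code in the library.
    @property uint data()() const { return raw & 0x003f_ffff; }
    @property uint sequence()() const { return 2 + ((raw >> 22) & 0x3); }
    @property IR code()() const { return cast(IR)(raw >> 24); }
}

void main()
{
    // Arbitrary test value: opcode 0x12, sequence bits 2, payload 0x1234.
    immutable bc = Bytecode((0x12u << 24) | (2u << 22) | 0x1234u);

    // Call sites look exactly as they did before the change.
    assert(bc.data == 0x1234);
    assert(bc.sequence == 4);
    assert(bc.code == IR.Char);
}
```

Whether the calls actually get inlined still depends on the compiler and its flags (for DMD, `-inline`); the sketch only shows that the extra parentheses leave the call sites untouched.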
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Dmitry Olshansky | On Monday, 9 September 2013 at 13:01:51 UTC, Dmitry Olshansky wrote:
> b) Pass all of the interesting files from Phobos on the command line to get them fully scanned for inlining (and get compiled anew each time I guess).
They more or less get compiled anew anyway since there's so many templates it has to run through, as well as the web of dependencies meaning it reads those files thanks to imports too.
Listing the files could be made easy with the dmd -r people have talked about (taking what rdmd does and putting it in the compiler). Then it does it automatically.
I doubt you'll see much impact on compile speed. Importing a phobos module is dog slow already, so it can't get much worse in any case.
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Adam D. Ruppe | On 9/9/13, Adam D. Ruppe <destructionator@gmail.com> wrote:
> Listing the files could be made easy with the dmd -r people have talked about (taking what rdmd does and putting it in the compiler). Then it does it automatically.
>
> I doubt you'll see much impact on compile speed. Importing a phobos module is dog slow already, so it can't get much worse in any case.
W.r.t. -r (recursive build), it gives you a performance boost since
the compiler doesn't have to be invoked multiple times and do the same
work over and over again (compared to using it from RDMD).
But I've run into a bug with that pull request, and I haven't reduced the test case for the failure yet.
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Dmitry Olshansky | On 09/09/13 15:01, Dmitry Olshansky wrote:
> While investigating std.regex performance in Phobos I've found that a lot of
> stuff never gets inlined (contrary to my expectations).
Is that just with dmd, or with gdc and ldc as well?
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Adam D. Ruppe | 09-Sep-2013 17:05, Adam D. Ruppe wrote:
> On Monday, 9 September 2013 at 13:01:51 UTC, Dmitry Olshansky wrote:
>> b) Pass all of the interesting files from Phobos on the command line
>> to get them fully scanned for inlining (and get compiled anew each
>> time I guess).
>
> They more or less get compiled anew anyway since there's so many
> templates it has to run through, as well as the web of dependencies
> meaning it reads those files thanks to imports too.
This was my intuition, but code-gen-wise it currently won't go beyond templates. It does, however, seem to analyze the whole code.
> Listing the files could be made easy with the dmd -r people have talked
> about (taking what rdmd does and putting it in the compiler). Then it
> does it automatically.
It would still be a hack.. while I'm looking for a fix (or a clarification that we need a hack).
If it was my personal problem I'd "solve" it with:
    dmd ~/dmd2/phobos/std/*.d <blah>
maybe even alias it like this. Hm, this way I could even inline some of druntime...
> I doubt you'll see much impact on compile speed.
Agreed.
> Importing a phobos
> module is dog slow already, so it can't get much worse in any case.
And that could be improved.. once it starts going into finer-grained imports/packages. The general feeling is that it'd be *soon*.
--
Dmitry Olshansky
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Joseph Rushton Wakeling | 09-Sep-2013 18:26, Joseph Rushton Wakeling wrote:
> On 09/09/13 15:01, Dmitry Olshansky wrote:
>> While investigating std.regex performance in Phobos I've found that a lot of
>> stuff never gets inlined (contrary to my expectations).
>
> Is that just with dmd, or with gdc and ldc as well?
Confirmed for DMD and LDC. It would be interesting to test GDC, but I bet it's the same (does LTO work here, btw?).
On the bright side of things, std.regex is real fast on LDC *when hacked* to inline the critical bits :)
--
Dmitry Olshansky
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Dmitry Olshansky | On 09/09/13 16:34, Dmitry Olshansky wrote:
> On the bright side of things std.regex is real fast on LDC *when hacked* to
> inline the critical bits :)
Do you mean when manually inlined, or when the design is tweaked to facilitate inlining?
My experience is that LDC is starting to pull ahead in the speed stakes these days [*], although it does seem to depend a bit on exactly what kind of code you're writing.
[* Caveat: that might be due to me switching to an LLVM 3.3 backend, although I was starting to observe this even when I was still working with 3.2.]
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Joseph Rushton Wakeling | 09-Sep-2013 18:39, Joseph Rushton Wakeling wrote:
> On 09/09/13 16:34, Dmitry Olshansky wrote:
>> On the bright side of things std.regex is real fast on LDC *when hacked* to
>> inline the critical bits :)
>
> Do you mean when manually inlined, or when the design is tweaked to
> facilitate inlining?
When I put extra () to indicate that said functions are templates. Then the compiler gets its grip on them and finally inlines. Otherwise it generates calls and links in object code from libphobos.
Which is the whole reason for the topic - is THAT the way to go? Shouldn't the compiler look into the source for inlinable stuff (when the source is available)?
> My experience is that LDC is starting to pull ahead in the speed stakes
> these days [*], although it does seem to depend a bit on exactly what
> kind of code you're writing.
>
> [* Caveat: that might be due to me switching to an LLVM 3.3 backend,
> although I was starting to observe this even when I was still working
> with 3.2.]
I'm using LLVM 3.3 and a fresh git clone of LDC.
--
Dmitry Olshansky
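The difference the extra () makes can be checked at compile time. The following is a minimal sketch, assuming two hypothetical helpers, `isAlphaPlain` and `isAlphaTmpl`, modelled loosely on the std.ascii.isAlpha one-liner mentioned at the start of the thread; neither name exists in Phobos.

```d
// Hypothetical helpers for illustration; neither is part of Phobos.

// Ordinary function: in a library it is compiled once into the object
// code, and callers link against that.
bool isAlphaPlain(dchar c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Same body with an empty template parameter list: it is instantiated,
// and therefore compiled, in whichever module calls it.
bool isAlphaTmpl()(dchar c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// The compiler itself distinguishes the two declarations.
static assert(!__traits(isTemplate, isAlphaPlain));
static assert( __traits(isTemplate, isAlphaTmpl));

void main()
{
    // Call sites are identical for both versions.
    assert(isAlphaPlain('q') && isAlphaTmpl('q'));
    assert(isAlphaPlain('Q') && isAlphaTmpl('Q'));
    assert(!isAlphaPlain('7') && !isAlphaTmpl('7'));
}
```

This only demonstrates that the two declarations are different kinds of symbol; whether either call is inlined in a given build is up to the compiler, as discussed in the thread.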
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Dmitry Olshansky | On Monday, 9 September 2013 at 14:58:56 UTC, Dmitry Olshansky wrote:
> 09-Sep-2013 18:39, Joseph Rushton Wakeling wrote:
>> On 09/09/13 16:34, Dmitry Olshansky wrote:
>>> On the bright side of things std.regex is real fast on LDC *when hacked* to
>>> inline the critical bits :)
>>
>> Do you mean when manually inlined, or when the design is tweaked to
>> facilitate inlining?
>
> When I put extra () to indicate that said functions are templates.
> Then compiler gets its grip on them and finally inlines.
> Otherwise it generates calls and links in object code from libphobos.
>
> Which is the whole reason for the topic - is THAT the way to go?
> Shouldn't compiler look into source for inlinable stuff (when source is available)?
I only know about GDC, and GDC doesn't implement cross-module inlining right now. If the modules are compiled in a single run it might work, but if the modules are compiled separately then only LTO (not tested with GDC though!) can help.
AFAIK the problem is this: there's no high-level way to tell the backend "hey, I have the source code for this function. If you consider inlining, call me back and I'll compile it for you". The only hack which could work is _always_ compiling _all_ functions from all modules. But compile times will explode.
Another issue is that whether a function will be inlined depends on details like the number of compiled instructions. Those details are only available once the function is compiled; the source code is not enough.
Maybe a reasonable compromise could be made with some help from the frontend. The frontend could give us some hints ("Likely inlineable"). Then we could compile all "likely inlineable" functions and let the backend decide if it really wants to inline those.
(Another option is inlining in the frontend. DMD does that right now, but IIRC it causes problems with the GCC backend and is disabled in GDC.) Iain can probably give a better answer here.
(Note: there's a low-level way to do this: LTO actually adds intermediate code to the object files. If the linker wants to inline a function, it calls the compiler to compile that intermediate code: http://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html . In the end, working LTO is probably the best solution.)
September 09, 2013 Re: On inlining in D libraries
Posted in reply to Johannes Pfau | 09-Sep-2013 21:42, Johannes Pfau wrote:
> On Monday, 9 September 2013 at 14:58:56 UTC, Dmitry Olshansky wrote:
>> 09-Sep-2013 18:39, Joseph Rushton Wakeling wrote:
>>> On 09/09/13 16:34, Dmitry Olshansky wrote:
>>>> On the bright side of things std.regex is real fast on LDC *when hacked* to
>>>> inline the critical bits :)
>>>
>>> Do you mean when manually inlined, or when the design is tweaked to
>>> facilitate inlining?
>>
>> When I put extra () to indicate that said functions are templates.
>> Then compiler gets its grip on them and finally inlines.
>> Otherwise it generates calls and links in object code from libphobos.
>>
>> Which is the whole reason for the topic - is THAT the way to go?
>> Shouldn't compiler look into source for inlinable stuff (when source
>> is available)?
>
> I only know about GDC and GDC doesn't implement cross-module inlining
> right now. If the modules are compiled in a single run it might work but
> if the modules are compiled separately then only LTO (not tested with
> GDC though!) can help.
>
> AFAIK the problem is this: There's no high-level way to tell the backend
> "hey, I have the source code for this function. if you consider inlining
> call me back and I'll compile it for you". The only hack which could
> work is _always_ compiling _all_ functions from all modules. But compile
> times will explode.
Precisely the problem we have and the current state of things. Compiling everything would be option b). What I'm after is not a way to hack around this but a way to make everything work nicely out of the box (for everybody).
> Another issue is that whether a function will be inlined depends on
> details like the number of compiled instructions. Those details are only
> available once the function is compiled, the source code is not enough.
>
> Maybe a reasonable compromise could be made with some help from the
> frontend. The frontend could give us some hints ("Likely inlineable").
> Then we could compile all "likely inlineable" functions and let the
> backend decide if it really wants to inline those.
>
> (Another option is inlining in the frontend. DMD does that right now
> but IIRC it causes problems with the GCC backend and is disabled in
> GDC). Iain can probably give a better answer here.
DMD's AST-rewriting inliner is rather lame currently, hence just not worth the trouble, I suspect.
> (Note: there's a low-level way to do this: LTO actually adds
> intermediate code to the object files. If the linker wants to inline a
> function, it calls the compiler to compile that intermediate code:
> http://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html . In the end
> working LTO is probably the best solution.)
LTO would be the best solution, but at the moment it's a rather rarely used optimization with obscure issues of its own.
It makes me think that generating generic (& sensible) IR instead of object code and doing the inlining on that is a cute idea.. but wait, that's what the LLVM analog of LTO should do.
--
Dmitry Olshansky