August 26, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jacob Carlborg | Hello Jacob, > On 2010-08-26 08:13, Walter Bright wrote: > >> Steven Schveighoffer wrote: >> >>> On Tue, 24 Aug 2010 03:58:57 -0400, Walter Bright >>> <newshound2@digitalmars.com> wrote: >>>> Steven Schveighoffer wrote: >>>> >>>>> With profiling enabled, gprof outputs this as the top hitters: >>>>> Flat profile: >>>>> Each sample counts as 0.01 seconds. >>>>> % cumulative self self total >>>>> time seconds seconds calls ms/call ms/call name >>>>> 77.76 6.68 6.68 2952 2.26 2.26 elf_findstr(Outbuffer*, char >>>>> const*, >>>>> char const*) >>>>> 2.10 6.86 0.18 4342 0.04 0.04 searchfixlist >>>> elf_findstr definitely looks like a problem area. I can't look at >>>> it right now, so can you post this to bugzilla please? >>>> >>> http://d.puremagic.com/issues/show_bug.cgi?id=4721 >>> >> Let me know how this works: >> >> http://www.dsource.org/projects/dmd/changeset/628 >> > Shouldn't machobj.c get the same optimization? > Shouldn't something like a table lookup be shared rather than duplicated? -- ... <IXOYE>< |
August 26, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote: > Better, now takes 20 seconds vs over 60. The new culprit: > > Flat profile: > > Each sample counts as 0.01 seconds. > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 75.79 6.51 6.51 8103 0.80 0.80 TemplateDeclaration::toJsonBuffer(OutBuffer*) This is most peculiar, as that should have shown up on the previous profile. > I haven't looked at toJsonBuffer at all (btw, why are we calling this function if I'm not outputting json?) That only happens if -X is passed on the command line, or one of the files on the command line has a .json index. |
August 26, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Thu, 26 Aug 2010 12:53:59 -0400, Walter Bright <newshound2@digitalmars.com> wrote:
> Steven Schveighoffer wrote:
>> Better, now takes 20 seconds vs over 60. The new culprit:
>> Flat profile:
>> Each sample counts as 0.01 seconds.
>> % cumulative self self total
>> time seconds seconds calls ms/call ms/call name
>> 75.79 6.51 6.51 8103 0.80 0.80 TemplateDeclaration::toJsonBuffer(OutBuffer*)
>
> This is most peculiar, as that should have shown up on the previous profile.
I did some more testing. I think I compiled the profiled version of the svn trunk dmd wrong. This is what happens when you let idiots debug your code for you ;)
I recompiled it, and here is the new list:
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
80.31 11.99 11.99 19000 0.63 0.63 searchfixlist
0.67 12.09 0.10 203173 0.00 0.00 StringTable::search(char const*, unsigned int)
0.60 12.18 0.09 369389 0.00 0.00 Lexer::scan(Token*)
0.54 12.26 0.08 953613 0.00 0.00 ScopeDsymbol::search(Loc, Identifier*, int)
0.47 12.33 0.07 1449798 0.00 0.00 calccodsize
0.40 12.39 0.06 587814 0.00 0.00 code_calloc
0.40 12.45 0.06 41406 0.00 0.00 pinholeopt
0.33 12.50 0.05 901563 0.00 0.00 _aaGetRvalue(AA*, void*)
0.33 12.55 0.05 138329 0.00 0.00 reftoident(int, unsigned long long, Symbol*, unsigned long long, int)
0.33 12.60 0.05 26849 0.00 0.00 ecom(elem**)
0.27 12.64 0.04 230869 0.00 0.00 Type::totym()
0.27 12.68 0.04 62784 0.00 0.00 touchfunc(int)
0.27 12.72 0.04 37623 0.00 0.00 optelem(elem*, int)
0.27 12.76 0.04 28348 0.00 0.00 assignaddrc
It looks like searchfixlist is another linear search, looking back at the other profile, it was the second highest consumer of runtime at 2% of the runtime before your fix, so it catapulted up to 80% of the runtime. It looks like a linked-list search, so it might benefit from a hash table as well? I'm not really sure.
Also, the 2% was on only one file being compiled, with the shortened run time, I was able to build all of dcollections, that's why the # of calls for searchfixlist is much higher.
I'll update the bug.
|
August 26, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote:
> I'll update the bug.
Thanks!
|
August 26, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote:
> Each sample counts as 0.01 seconds.
> % cumulative self self total
> time seconds seconds calls ms/call ms/call name
> 80.31 11.99 11.99 19000 0.63 0.63 searchfixlist
Just for fun, searchfixlist goes back at least to 1983 or so.
|
August 26, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright: > Just for fun, searchfixlist goes back at least to 1983 or so. It contains this if (I am not able to indent it well): if (s->Sseg == p->Lseg && (s->Sclass == SCstatic || #if TARGET_LINUX || TARGET_OSX || TARGET_FREEBSD || TARGET_SOLARIS (!(config.flags3 & CFG3pic) && s->Sclass == SCglobal)) && #else s->Sclass == SCglobal) && #endif s->Sxtrnnum == 0 && p->Lflags & CFselfrel) { How do you rewrite that in good D code? A possible way is to split that messy if into two nested ifs. Between first and second if you define a boolean variable in two different ways using a static if. And in the second if you use the boolean variable and the second part of the runtime test. Something like this: if (part1) { static if (versions_test) { bool aux = ... } else { bool aux = ... } if (aux && part2) { // ... } else { // ... } } aux is defined in the middle and not before the first if because performing this runtime test is not necessary when part1 fails. Bye, bearophile |
August 27, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Hello Walter, > Steven Schveighoffer wrote: > >> Each sample counts as 0.01 seconds. >> % cumulative self self total >> time seconds seconds calls ms/call ms/call name >> 80.31 11.99 11.99 19000 0.63 0.63 searchfixlist > Just for fun, searchfixlist goes back at least to 1983 or so. > Early or late '83? I ask because *I* go back to '83 or so. :) -- ... <IXOYE>< |
August 27, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to BCS | BCS wrote:
> Hello Walter,
>
>> Steven Schveighoffer wrote:
>>
>>> Each sample counts as 0.01 seconds.
>>> % cumulative self self total
>>> time seconds seconds calls ms/call ms/call name
>>> 80.31 11.99 11.99 19000 0.63 0.63 searchfixlist
>> Just for fun, searchfixlist goes back at least to 1983 or so.
>>
>
> Early or late '83? I ask because *I* go back to '83 or so. :)
June 7th, 3:26 PM. Give or take 6 months.
|
August 27, 2010 Re: Why C++ compiles slowly | ||||
---|---|---|---|---|
| ||||
Posted in reply to BCS | On 2010-08-26 16:14, BCS wrote: > Hello Jacob, > >> On 2010-08-26 08:13, Walter Bright wrote: >> >>> Steven Schveighoffer wrote: >>> >>>> On Tue, 24 Aug 2010 03:58:57 -0400, Walter Bright >>>> <newshound2@digitalmars.com> wrote: >>>>> Steven Schveighoffer wrote: >>>>> >>>>>> With profiling enabled, gprof outputs this as the top hitters: >>>>>> Flat profile: >>>>>> Each sample counts as 0.01 seconds. >>>>>> % cumulative self self total >>>>>> time seconds seconds calls ms/call ms/call name >>>>>> 77.76 6.68 6.68 2952 2.26 2.26 elf_findstr(Outbuffer*, char >>>>>> const*, >>>>>> char const*) >>>>>> 2.10 6.86 0.18 4342 0.04 0.04 searchfixlist >>>>> elf_findstr definitely looks like a problem area. I can't look at >>>>> it right now, so can you post this to bugzilla please? >>>>> >>>> http://d.puremagic.com/issues/show_bug.cgi?id=4721 >>>> >>> Let me know how this works: >>> >>> http://www.dsource.org/projects/dmd/changeset/628 >>> >> Shouldn't machobj.c get the same optimization? >> > > Shouldn't something like a table lookup be shared rather than duplicated? Yes, that would be better. -- /Jacob Carlborg |
Copyright © 1999-2021 by the D Language Foundation