Jump to page: 1 24  
Page
Thread overview
top time wasters in DMD, as reported by gprof
Jun 23, 2013
Walter Bright
Jun 24, 2013
Timothee Cour
Jun 24, 2013
Walter Bright
Jun 24, 2013
Timothee Cour
Jun 24, 2013
Iain Buclaw
Jun 24, 2013
dennis luehring
Jun 24, 2013
Dicebot
Jun 24, 2013
dennis luehring
Jun 24, 2013
Richard Webb
Jun 24, 2013
dennis luehring
Jun 26, 2013
SomeDude
Jun 26, 2013
SomeDude
Jun 26, 2013
Jonathan M Davis
Jun 26, 2013
dennis luehring
Jun 24, 2013
dennis luehring
Re: top time wasters in DMD, as reported by gprof - VS2010/VTune results
Jun 25, 2013
dennis luehring
Jun 25, 2013
dennis luehring
Jun 25, 2013
dennis luehring
Jun 25, 2013
Kiith-Sa
Jun 25, 2013
Walter Bright
Jun 24, 2013
Andrej Mitrovic
Jun 24, 2013
Adam Wilson
Jun 24, 2013
Walter Bright
Jun 24, 2013
Walter Bright
Jun 24, 2013
dennis luehring
Jun 26, 2013
John Colvin
Jun 26, 2013
dennis luehring
Jun 24, 2013
Martin Nowak
Jun 24, 2013
dennis luehring
Jun 24, 2013
Walter Bright
Jun 24, 2013
Martin Nowak
Jun 24, 2013
Martin Nowak
Jun 25, 2013
Marco Leise
Jun 25, 2013
Iain Buclaw
Jun 25, 2013
Iain Buclaw
Jun 27, 2013
Peter Alexander
Jun 25, 2013
Wyatt
June 23, 2013
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  5.19      0.08     0.08   945494     0.00     0.00  _aaGetRvalue(AA*, void*)
  3.90      0.14     0.06   926415     0.00     0.00  Lexer::scan(Token*)
  3.25      0.19     0.05   397097     0.00     0.00  calcHash(char const*, unsigned long)
  2.60      0.23     0.04   373646     0.00     0.00  StringTable::search(char const*, unsigned lon
  1.95      0.26     0.03    33255     0.00     0.00 TemplateInstance::semantic(Scope*, Array<Expr
  1.30      0.28     0.02   968985     0.00     0.00  ScopeDsymbol::search(Loc, Identifier*, int)
  1.30      0.30     0.02   424079     0.00     0.00  calccodsize
  1.30      0.32     0.02   234547     0.00     0.00  Parser::parseShiftExp()
  1.30      0.34     0.02   168933     0.00     0.00  AliasDeclaration::toAlias()
  1.30      0.36     0.02   136371     0.00     0.00  _aaGet(AA**, void*)
  1.30      0.38     0.02   132102     0.00     0.00  TypeStruct::hasPointers()
  1.30      0.40     0.02    32490     0.00     0.00 TemplateDeclaration::matchWithInstance(Templa
  1.30      0.42     0.02    21666     0.00     0.00 FuncDeclaration::functionSemantic()
  1.30      0.44     0.02    17983     0.00     0.00 VarDeclaration::syntaxCopy(Dsymbol*)
  1.30      0.46     0.02    12908     0.00     0.00 TemplateDeclaration::deduceFunctionTemplateMa
  1.30      0.48     0.02     6890     0.00     0.00  assignaddrc
  1.30      0.50     0.02     6563     0.00     0.00  ecom(elem**)
  1.30      0.52     0.02     5675     0.00     0.00 TypeInfo_Idxstr::getHash(void*)
  1.30      0.54     0.02     3780     0.01     0.01 TemplateDeclaration::semantic(Scope*)
  1.30      0.56     0.02     2648     0.01     0.01  operator delete(void*)
  0.97      0.58     0.02    42592     0.00     0.00 IdentifierExp::semantic(Scope*)
  0.97      0.59     0.02    36403     0.00     0.00  BinExp::syntaxCopy()
  0.65      0.60     0.01  2668529     0.00     0.00  operator new(unsigned long)
  0.65      0.61     0.01  2575503     0.00     0.00  Dsymbol::isTemplateMixin()
  0.65      0.62     0.01  2310757     0.00     0.00  Type::toBasetype()
  0.65      0.63     0.01  2096689     0.00     0.00 OutBuffer::reserve(unsigned long)
  0.65      0.64     0.01  1030635     0.00     0.00  Dsymbol::isTemplateInstance()
  0.65      0.65     0.01   708615     0.00     0.00  Module::search(Loc, Identifier*, int)
  0.65      0.66     0.01   683509     0.00     0.00  VarDeclaration::isDataseg()
  0.65      0.67     0.01   662180     0.00     0.00  Module::isModule()
  0.65      0.68     0.01   567921     0.00     0.00  isType(RootObject*)
  0.65      0.69     0.01   536331     0.00     0.00  Mem::realloc(void*, unsigned long)
  0.65      0.70     0.01   454836     0.00     0.00 Parameter::foreach(Array<Parameter>*, int (*)
  0.65      0.71     0.01   375732     0.00     0.00  Type::merge()
  0.65      0.72     0.01   353615     0.00     0.00  Mem::free(void*)
  0.65      0.73     0.01   348546     0.00     0.00  isDsymbol(RootObject*)
  0.65      0.74     0.01   339336     0.00     0.00 exceptionOrCantInterpret(Expression*)
  0.65      0.75     0.01   311493     0.00     0.00  isTuple(RootObject*)
  0.65      0.76     0.01   300069     0.00     0.00  TypeBasic::isscalar()
  0.65      0.77     0.01   270438     0.00     0.00  VarDeclaration::hasPointers()
  0.65      0.78     0.01   248163     0.00     0.00 Identifier::equals(RootObject*)
  0.65      0.79     0.01   234806     0.00     0.00  Parser::parseAddExp()
June 24, 2013
can you give more context ? what was the command line to get this?
I thought lexing was not a bottleneck, yet it seems to be in second place?

On Sun, Jun 23, 2013 at 2:28 PM, Walter Bright <newshound2@digitalmars.com>wrote:

> Flat profile:
>
> Each sample counts as 0.01 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>   5.19      0.08     0.08   945494     0.00     0.00  _aaGetRvalue(AA*,
> void*)
>   3.90      0.14     0.06   926415     0.00     0.00  Lexer::scan(Token*)
>   3.25      0.19     0.05   397097     0.00     0.00  calcHash(char
> const*, unsigned long)
>   2.60      0.23     0.04   373646     0.00     0.00
>  StringTable::search(char const*, unsigned lon
>   1.95      0.26     0.03    33255     0.00     0.00
> TemplateInstance::semantic(**Scope*, Array<Expr
>   1.30      0.28     0.02   968985     0.00     0.00
>  ScopeDsymbol::search(Loc, Identifier*, int)
>   1.30      0.30     0.02   424079     0.00     0.00  calccodsize
>   1.30      0.32     0.02   234547     0.00     0.00
>  Parser::parseShiftExp()
>   1.30      0.34     0.02   168933     0.00     0.00
>  AliasDeclaration::toAlias()
>   1.30      0.36     0.02   136371     0.00     0.00  _aaGet(AA**, void*)
>   1.30      0.38     0.02   132102     0.00     0.00
>  TypeStruct::hasPointers()
>   1.30      0.40     0.02    32490     0.00     0.00 TemplateDeclaration::
> **matchWithInstance(Templa
>   1.30      0.42     0.02    21666     0.00     0.00 FuncDeclaration::**
> functionSemantic()
>   1.30      0.44     0.02    17983     0.00     0.00
> VarDeclaration::syntaxCopy(**Dsymbol*)
>   1.30      0.46     0.02    12908     0.00     0.00 TemplateDeclaration::
> **deduceFunctionTemplateMa
>   1.30      0.48     0.02     6890     0.00     0.00  assignaddrc
>   1.30      0.50     0.02     6563     0.00     0.00  ecom(elem**)
>   1.30      0.52     0.02     5675     0.00     0.00
> TypeInfo_Idxstr::getHash(void***)
>   1.30      0.54     0.02     3780     0.01     0.01
> TemplateDeclaration::semantic(**Scope*)
>   1.30      0.56     0.02     2648     0.01     0.01  operator
> delete(void*)
>   0.97      0.58     0.02    42592     0.00     0.00
> IdentifierExp::semantic(Scope***)
>   0.97      0.59     0.02    36403     0.00     0.00  BinExp::syntaxCopy()
>   0.65      0.60     0.01  2668529     0.00     0.00  operator
> new(unsigned long)
>   0.65      0.61     0.01  2575503     0.00     0.00
>  Dsymbol::isTemplateMixin()
>   0.65      0.62     0.01  2310757     0.00     0.00  Type::toBasetype()
>   0.65      0.63     0.01  2096689     0.00     0.00
> OutBuffer::reserve(unsigned long)
>   0.65      0.64     0.01  1030635     0.00     0.00
>  Dsymbol::isTemplateInstance()
>   0.65      0.65     0.01   708615     0.00     0.00  Module::search(Loc,
> Identifier*, int)
>   0.65      0.66     0.01   683509     0.00     0.00
>  VarDeclaration::isDataseg()
>   0.65      0.67     0.01   662180     0.00     0.00  Module::isModule()
>   0.65      0.68     0.01   567921     0.00     0.00  isType(RootObject*)
>   0.65      0.69     0.01   536331     0.00     0.00  Mem::realloc(void*,
> unsigned long)
>   0.65      0.70     0.01   454836     0.00     0.00
> Parameter::foreach(Array<**Parameter>*, int (*)
>   0.65      0.71     0.01   375732     0.00     0.00  Type::merge()
>   0.65      0.72     0.01   353615     0.00     0.00  Mem::free(void*)
>   0.65      0.73     0.01   348546     0.00     0.00
>  isDsymbol(RootObject*)
>   0.65      0.74     0.01   339336     0.00     0.00
> exceptionOrCantInterpret(**Expression*)
>   0.65      0.75     0.01   311493     0.00     0.00  isTuple(RootObject*)
>   0.65      0.76     0.01   300069     0.00     0.00  TypeBasic::isscalar()
>   0.65      0.77     0.01   270438     0.00     0.00
>  VarDeclaration::hasPointers()
>   0.65      0.78     0.01   248163     0.00     0.00
> Identifier::equals(RootObject***)
>   0.65      0.79     0.01   234806     0.00     0.00  Parser::parseAddExp()
>


June 24, 2013
On 6/23/2013 8:39 PM, Timothee Cour wrote:
> can you give more context ? what was the command line to get this?

Compile on Linux with -pg. Run with:

   dmd -main -unittest std/algorithm

> I thought lexing was not a bottleneck, yet it seems to be in second place?

The fun with profilers is they tell you things you didn't think were true.
June 24, 2013
On Sun, Jun 23, 2013 at 8:55 PM, Walter Bright <newshound2@digitalmars.com>wrote:

> On 6/23/2013 8:39 PM, Timothee Cour wrote:
>
>> can you give more context ? what was the command line to get this?
>>
>
> Compile on Linux with -pg. Run with:
>
>    dmd -main -unittest std/algorithm
>
>
> > I thought lexing was not a bottleneck, yet it seems to be in second
> place?
>
> The fun with profilers is they tell you things you didn't think were true.
>

more profiliing numbers from Ian Buclaw, for gdc:
http://forum.dlang.org/post/mailman.1469.1339580395.24740.digitalmars-d@puremagic.com
http://iainbuclaw.wordpress.com/2010/09/18/implementing-speed-counters-in-gdc/
http://iainbuclaw.files.wordpress.com/2010/09/d2-time-report2.pdf
That cam up in an old proposal i made a year ago: "AST files instead of DI
interface files for faster compilation and easier distribution".


June 24, 2013
On 24 June 2013 05:53, Timothee Cour <thelastmammoth@gmail.com> wrote:
> On Sun, Jun 23, 2013 at 8:55 PM, Walter Bright <newshound2@digitalmars.com> wrote:
>>
>> On 6/23/2013 8:39 PM, Timothee Cour wrote:
>>>
>>> can you give more context ? what was the command line to get this?
>>
>>
>> Compile on Linux with -pg. Run with:
>>
>>    dmd -main -unittest std/algorithm
>>
>>
>> > I thought lexing was not a bottleneck, yet it seems to be in second place?
>>
>> The fun with profilers is they tell you things you didn't think were true.
>
>
> more profiliing numbers from Ian Buclaw, for gdc:
> http://forum.dlang.org/post/mailman.1469.1339580395.24740.digitalmars-d@puremagic.com
> http://iainbuclaw.wordpress.com/2010/09/18/implementing-speed-counters-in-gdc/
> http://iainbuclaw.files.wordpress.com/2010/09/d2-time-report2.pdf
> That cam up in an old proposal i made a year ago: "AST files instead of DI
> interface files for faster compilation and easier distribution".
>
>


With that report, all modules are compiled with roughly the equivalent of dmd's -O -g -release.  So for std.algorithm, it is show to take 0.52 seconds to compile.  However with -unittest turned on, I know it takes at least a minute for gdc to compile that module (up there with std.datetime) - and consumes about 2GBs of memory. ;-)


--
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
June 24, 2013
how does that look using msvc compiling the dmd compiler
as it turns out that msvc make dmd much faster

btw: the memory profile tool i mentioned is also available as performance profiler: http://www.softwareverify.com/cpp-profiler.php

Am 23.06.2013 23:28, schrieb Walter Bright:
> Flat profile:
>
> Each sample counts as 0.01 seconds.
>     %   cumulative   self              self     total
>    time   seconds   seconds    calls  ms/call  ms/call  name
>     5.19      0.08     0.08   945494     0.00     0.00  _aaGetRvalue(AA*, void*)
>     3.90      0.14     0.06   926415     0.00     0.00  Lexer::scan(Token*)
>     3.25      0.19     0.05   397097     0.00     0.00  calcHash(char const*,
> unsigned long)
>     2.60      0.23     0.04   373646     0.00     0.00  StringTable::search(char
> const*, unsigned lon
>     1.95      0.26     0.03    33255     0.00     0.00
> TemplateInstance::semantic(Scope*, Array<Expr
>     1.30      0.28     0.02   968985     0.00     0.00  ScopeDsymbol::search(Loc,
> Identifier*, int)
>     1.30      0.30     0.02   424079     0.00     0.00  calccodsize
>     1.30      0.32     0.02   234547     0.00     0.00  Parser::parseShiftExp()
>     1.30      0.34     0.02   168933     0.00     0.00  AliasDeclaration::toAlias()
>     1.30      0.36     0.02   136371     0.00     0.00  _aaGet(AA**, void*)
>     1.30      0.38     0.02   132102     0.00     0.00  TypeStruct::hasPointers()
>     1.30      0.40     0.02    32490     0.00     0.00
> TemplateDeclaration::matchWithInstance(Templa
>     1.30      0.42     0.02    21666     0.00     0.00
> FuncDeclaration::functionSemantic()
>     1.30      0.44     0.02    17983     0.00     0.00
> VarDeclaration::syntaxCopy(Dsymbol*)
>     1.30      0.46     0.02    12908     0.00     0.00
> TemplateDeclaration::deduceFunctionTemplateMa
>     1.30      0.48     0.02     6890     0.00     0.00  assignaddrc
>     1.30      0.50     0.02     6563     0.00     0.00  ecom(elem**)
>     1.30      0.52     0.02     5675     0.00     0.00
> TypeInfo_Idxstr::getHash(void*)
>     1.30      0.54     0.02     3780     0.01     0.01
> TemplateDeclaration::semantic(Scope*)
>     1.30      0.56     0.02     2648     0.01     0.01  operator delete(void*)
>     0.97      0.58     0.02    42592     0.00     0.00
> IdentifierExp::semantic(Scope*)
>     0.97      0.59     0.02    36403     0.00     0.00  BinExp::syntaxCopy()
>     0.65      0.60     0.01  2668529     0.00     0.00  operator new(unsigned long)
>     0.65      0.61     0.01  2575503     0.00     0.00  Dsymbol::isTemplateMixin()
>     0.65      0.62     0.01  2310757     0.00     0.00  Type::toBasetype()
>     0.65      0.63     0.01  2096689     0.00     0.00
> OutBuffer::reserve(unsigned long)
>     0.65      0.64     0.01  1030635     0.00     0.00  Dsymbol::isTemplateInstance()
>     0.65      0.65     0.01   708615     0.00     0.00  Module::search(Loc,
> Identifier*, int)
>     0.65      0.66     0.01   683509     0.00     0.00  VarDeclaration::isDataseg()
>     0.65      0.67     0.01   662180     0.00     0.00  Module::isModule()
>     0.65      0.68     0.01   567921     0.00     0.00  isType(RootObject*)
>     0.65      0.69     0.01   536331     0.00     0.00  Mem::realloc(void*,
> unsigned long)
>     0.65      0.70     0.01   454836     0.00     0.00
> Parameter::foreach(Array<Parameter>*, int (*)
>     0.65      0.71     0.01   375732     0.00     0.00  Type::merge()
>     0.65      0.72     0.01   353615     0.00     0.00  Mem::free(void*)
>     0.65      0.73     0.01   348546     0.00     0.00  isDsymbol(RootObject*)
>     0.65      0.74     0.01   339336     0.00     0.00
> exceptionOrCantInterpret(Expression*)
>     0.65      0.75     0.01   311493     0.00     0.00  isTuple(RootObject*)
>     0.65      0.76     0.01   300069     0.00     0.00  TypeBasic::isscalar()
>     0.65      0.77     0.01   270438     0.00     0.00  VarDeclaration::hasPointers()
>     0.65      0.78     0.01   248163     0.00     0.00
> Identifier::equals(RootObject*)
>     0.65      0.79     0.01   234806     0.00     0.00  Parser::parseAddExp()
>

June 24, 2013
On Monday, 24 June 2013 at 13:21:23 UTC, dennis luehring wrote:
> how does that look using msvc compiling the dmd compiler
> as it turns out that msvc make dmd much faster

Cosnidering keyword "gprof", chances are extremely high it was run with gcc on some not-so-MS platform.
June 24, 2013
For some reason gprof nerver terminates on my system. So here is the same data recorded with oprofile's operf. It also includes kernel, libc and ld figures.

http://codepad.org/zL3C6eQd

whole phobos as shared library with unittests and BUILD=debug
http://codepad.org/lITSMbhD

BUILD=release didn't compile with some error about .offsetof
June 24, 2013
Am 24.06.2013 16:09, schrieb Dicebot:
> On Monday, 24 June 2013 at 13:21:23 UTC, dennis luehring wrote:
>> how does that look using msvc compiling the dmd compiler
>> as it turns out that msvc make dmd much faster
>
> Cosnidering keyword "gprof", chances are extremely high it was
> run with gcc on some not-so-MS platform.
>

i know - my question was - how does that look using msvc...

because the msvc build tends to be faster - would be a nice compare
if an windows (dmc and msvc) build will show similar hotspots
under (because of msvc) an windows based profiler
June 24, 2013
Am 24.06.2013 16:52, schrieb Martin Nowak:
> For some reason gprof nerver terminates on my system. So here is the
> same data recorded with oprofile's operf. It also includes kernel, libc
> and ld figures.
>
> http://codepad.org/zL3C6eQd
>
> whole phobos as shared library with unittests and BUILD=debug
> http://codepad.org/lITSMbhD
>
> BUILD=release didn't compile with some error about .offsetof

what is the sense of doing profiling in BUILD=debug? or is the optimizer still active?

« First   ‹ Prev
1 2 3 4