Thread overview
How accurate is dmd profile? (and do I need GDC/LDC to use gprof?)
Oct 03, 2021
Chris Katko
Oct 03, 2021
max haughton
Oct 04, 2021
drug
Oct 03, 2021
H. S. Teoh
Oct 04, 2021
Chris Katko
Oct 04, 2021
bauss
Oct 06, 2021
Imperatorn
Oct 06, 2021
H. S. Teoh
October 03, 2021

Does it break down on multi-threaded scenarios?

I'm running dmd (newest) + Allegro (a C game programming library) with DAllegro (a nice templated binder). My executable is multi-threaded (mostly just helper functions / glue logic from libraries/D/etc), and using OpenGL on 64-bit Linux with a very recent DMD release.

The function that dmd's -profile reported as the biggest user of CPU time was a tiny little function that draws a couple of background tiles (<30 in the worst case), as opposed to everything else being drawn: tons of OpenGL primitives, graphical text, text being converted with tons of writelns to the console, etc.

It didn't make any sense, so I loaded it up with valgrind and kcachegrind and it said, "no, this function takes 0.00 of total time."

Should I be using DMD's -profile? Does it have known failure modes? Is this failure mode new to people? Is there any way to get normal profiling with gprof or whatever with DMD, or do I need to compile with LDC or GDC?

I'm getting back into D and I recall having both toolchains (LDC and DMD) running. This might have been the reason I kept LDC around and maintained two sets of libraries compiled for both LDC and DMD.

Also "-profile" functions used over 7% of all CPU time. Is that the nature of the profiling, or is D using way more than comparable languages/compilers?

Lastly, is there any way to demangle D functions in Valgrind/kcachegrind?
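
For reference on the demangling question, here is a minimal sketch assuming druntime's `core.demangle` module is available (it ships with the compilers discussed here). `drawBackgroundTiles` and `readable` are hypothetical names made up for illustration. Whether kcachegrind itself can be made to show such names is not something this sketch answers; it only shows turning a mangled symbol into a readable one.

```d
import core.demangle : demangle;
import std.algorithm.searching : canFind;
import std.stdio : writeln;

// Hypothetical stand-in for the small tile-drawing function from the post.
void drawBackgroundTiles() {}

// Returns the human-readable form of a D mangled symbol name.
// If the input is not a valid D mangled name, demangle returns it unchanged.
string readable(string mangled)
{
    return demangle(mangled).idup;
}

void main()
{
    // .mangleof yields the linker-level name that tools like Valgrind see.
    enum mangled = drawBackgroundTiles.mangleof;
    writeln(mangled);
    writeln(readable(mangled));
}
```

The `ddemangle` program from the dlang/tools repository does the same job as a filter over text read from standard input, which may be more convenient for profiler output saved to a file.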

Thanks! Have a great weekend!

October 03, 2021

On Sunday, 3 October 2021 at 08:31:14 UTC, Chris Katko wrote:

> Does it break down on multi-threaded scenarios?
>
> I'm running dmd (newest) + Allegro (a C game programming library) with DAllegro (a nice templated binder). My executable is multi-threaded (mostly just helper functions / glue logic from libraries/D/etc), and using OpenGL on 64-bit Linux with a very recent DMD release.
>
> [...]

Unless your profiler does call stack sampling (which I don't think gprof or dmd does), don't use it. They're not reliable unless you are doing very targeted profiling.

For profiling code, if you're on an Intel CPU, VTune is top dog. Nothing else is as good.

October 03, 2021
On Sun, Oct 03, 2021 at 08:31:14AM +0000, Chris Katko via Digitalmars-d wrote: [...]
> The function that dmd's -profile reported as the biggest user of CPU time was a tiny little function that draws a couple of background tiles (<30 in the worst case), as opposed to everything else being drawn: tons of OpenGL primitives, graphical text, text being converted with tons of writelns to the console, etc.
[...]

Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.

I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.
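
To make the wraparound arithmetic concrete, here is a minimal sketch. It assumes the counter really is 16 bits wide, as described above, and it is only a model of the arithmetic, not druntime's actual profiling code. `countCalls16` is a hypothetical name.

```d
import std.stdio : writeln;

// Hypothetical model of a per-function call counter that is only 16 bits wide.
// This is NOT druntime's profiling code, just the wraparound arithmetic.
ushort countCalls16(ulong actualCalls)
{
    ushort counter = 0;
    foreach (_; 0 .. actualCalls)
        ++counter; // wraps from 65535 back to 0
    return counter;
}

void main()
{
    writeln(countCalls16(65_535)); // 65535: the largest value that still fits
    writeln(countCalls16(65_536)); // 0: one more call and the count resets
    writeln(countCalls16(70_000)); // 4464, i.e. 70000 - 65536
}
```

Under that assumption, a function called 70,000 times in an inner loop would be reported as if it had been called 4,464 times, so anything derived from the count would be skewed.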


T

-- 
Recently, our IT department hired a bug-fix engineer. He used to work for Volkswagen.
October 04, 2021
On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
> Be aware that dmd -profile uses *16-bit counters* for tracking function call counts;

WHAT?! Wouldn't 32/64 bit (architecture native) values be faster for memory accesses anyway??
October 04, 2021
On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
> On Sun, Oct 03, 2021 at 08:31:14AM +0000, Chris Katko via Digitalmars-d wrote: [...]
>> The function that dmd's -profile reported as the biggest user of CPU time was a tiny little function that draws a couple of background tiles (<30 in the worst case), as opposed to everything else being drawn: tons of OpenGL primitives, graphical text, text being converted with tons of writelns to the console, etc.
> [...]
>
> Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.
>
> I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.
>
>
> T

What the... I had no idea, but that alone makes it useless to me.
October 04, 2021
03.10.2021 19:20, max haughton wrote:
> On Sunday, 3 October 2021 at 08:31:14 UTC, Chris Katko wrote:
>> Does it break down on multi-threaded scenarios?
>>
>> I'm running dmd (newest) + Allegro (a C game programming library) with DAllegro (a nice templated binder). My executable is multi-threaded (mostly just helper functions / glue logic from libraries/D/etc), and using OpenGL on 64-bit Linux with a very recent DMD release.
>>
>> [...]
> 
> Unless your profiler does call stack sampling (which I don't think gprof or dmd does), don't use it. They're not reliable unless you are doing very targeted profiling.
> 
> For profiling code, if you're on an Intel CPU, VTune is top dog. Nothing else is as good.

Both sampling and instrumenting profilers can be unreliable. Sampling profiler results depend on the sampling rate, for example, while an instrumenting profiler can distort timings too much. In fact, sampling and instrumentation complement each other.
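
A toy illustration of the sampling-rate point, as a sketch only: it simulates a made-up timeline rather than measuring a real program, and `tinyFunctionActive` and `sampledShare` are hypothetical names. A function that is active on every 10th tick (10% of the real time) looks very different depending on how the sampling period lines up with it.

```d
import std.stdio : writefln;

// Toy timeline: a hypothetical tiny function is active only on every
// 10th tick, so it accounts for 10% of the real time.
bool tinyFunctionActive(int tick)
{
    return tick % 10 == 0;
}

// Share of samples that land in the tiny function when one sample is
// taken every `period` ticks over a run of `totalTicks` ticks.
double sampledShare(int period, int totalTicks = 1000)
{
    int samples = 0, hits = 0;
    for (int t = 0; t < totalTicks; t += period)
    {
        ++samples;
        if (tinyFunctionActive(t))
            ++hits;
    }
    return cast(double) hits / samples;
}

void main()
{
    // Period 10 lines up exactly with the function, so every sample hits it.
    writefln("period 10: %.1f%%", 100 * sampledShare(10));
    // Period 7 does not line up, so the estimate lands near the true 10%.
    writefln("period  7: %.1f%%", 100 * sampledShare(7));
}
```

Real samplers work on wall-clock or CPU time rather than neat ticks, so the effect is rarely this extreme, but the same kind of aliasing is one reason to cross-check a sampling profile against an instrumented one.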
October 06, 2021
On Monday, 4 October 2021 at 06:10:00 UTC, bauss wrote:
> On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
>> On Sun, Oct 03, 2021 at 08:31:14AM +0000, Chris Katko via Digitalmars-d wrote: [...]
>>> [...]
>> [...]
>>
>> Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.
>>
>> I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.
>>
>>
>> T
>
> What the... I had no idea, but that alone makes it useless to me.

Maybe we can change it
October 06, 2021
On Wed, Oct 06, 2021 at 05:34:54AM +0000, Imperatorn via Digitalmars-d wrote:
> On Monday, 4 October 2021 at 06:10:00 UTC, bauss wrote:
> > On Sunday, 3 October 2021 at 21:55:29 UTC, H. S. Teoh wrote:
[...]
> > > Be aware that dmd -profile uses *16-bit counters* for tracking function call counts; if your program is CPU-intensive and calls the same function(s) in inner loops more than 65535 times, the counters will wrap around and cause the profile output to be garbled.
> > > 
> > > I ran into this a couple of years ago when trying to profile some CPU-intensive code with some non-trivial testcases, and found dmd -profile output completely unusable because of this limitation.
[...]
> > What the... I had no idea, but that alone makes it useless to me.
> 
> Maybe we can change it

That would be very nice.  I believe the code is somewhere in druntime. It would save a lot of grief in the future. :-)


T

-- 
An elephant: A mouse built to government specifications. -- Robert Heinlein