Thread overview
Would you trade 0.1% in performance for a better debugging experience?
Nov 17, 2014
Vladimir Panteleev
Nov 18, 2014
Daniel Kozak
Nov 18, 2014
deadalnix
Nov 18, 2014
ponce
Nov 18, 2014
Kagamin
Nov 18, 2014
Vladimir Panteleev
Dec 04, 2014
Martin Nowak
Nov 18, 2014
Logan Capaldo
Nov 18, 2014
Marco Leise
Nov 18, 2014
Vladimir Panteleev
Nov 18, 2014
Marco Leise
Nov 18, 2014
Vladimir Panteleev
Nov 18, 2014
Marco Leise
Nov 19, 2014
Kagamin
Nov 21, 2014
Marco Leise
Dec 04, 2014
Martin Nowak
Dec 05, 2014
Kagamin
Nov 19, 2014
Kagamin
Dec 04, 2014
Martin Nowak
Dec 04, 2014
Martin Nowak
Nov 18, 2014
Joakim
Nov 19, 2014
w0rp
Nov 19, 2014
Rainer Schuetze
Dec 04, 2014
Dicebot
Dec 04, 2014
Vladimir Panteleev
Dec 04, 2014
Dicebot
Dec 04, 2014
Temtaime
Dec 05, 2014
Sean Kelly
Dec 04, 2014
Martin Nowak
Dec 05, 2014
Kagamin
November 17, 2014
I proposed to build Phobos and Druntime with stack frames enabled:

https://issues.dlang.org/show_bug.cgi?id=13726

Stack frames add three CPU instructions to each function (two in the prolog, and one in the epilog). This creates a linked list which debuggers, profilers, etc. can easily walk to find which function called which. They would allow debugging certain classes of bugs much more easily (e.g. the recurring InvalidMemoryOperationError - there's a thread about it in D.learn just today), and profiling your code with polling (non-instrumenting) profilers.
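
To illustrate the linked list described above, here is a toy model in Python. It is only a sketch, not DMD or Druntime code: the addresses, the `memory` dictionary and the `walk_frames` helper are made up for the example. The one real-world assumption is the usual x86-64 layout produced by a `push rbp; mov rbp, rsp` prolog, where the caller's frame pointer is stored at `[rbp]` and the return address at `[rbp + 8]`.

```python
# Toy model of a frame-pointer chain. Addresses, layout and names are
# hypothetical and chosen for illustration only.

def walk_frames(memory, frame_pointer, max_depth=64):
    """Return the return addresses found by following saved frame pointers.

    `memory` maps an address to the word stored there. Each frame is assumed
    to hold the caller's frame pointer at [fp] and the return address at
    [fp + 8].
    """
    return_addresses = []
    while frame_pointer != 0 and len(return_addresses) < max_depth:
        return_addresses.append(memory[frame_pointer + 8])
        frame_pointer = memory[frame_pointer]  # follow the link to the caller
    return return_addresses


# Three nested calls: main -> parse -> allocate (the stack grows downwards).
memory = {
    0x7FF0: 0x0000, 0x7FF8: 0x400100,  # main's frame: end of the chain
    0x7FC0: 0x7FF0, 0x7FC8: 0x400210,  # parse's frame: links to main
    0x7F90: 0x7FC0, 0x7F98: 0x400345,  # allocate's frame: links to parse
}

trace = walk_frames(memory, 0x7F90)
print([hex(address) for address in trace])
```

The walker needs no debug information at all, only the current frame pointer. If any function in the chain skips the prolog, the link at `[fp]` is no longer a frame pointer and the walk breaks off or goes astray at that point.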

As I understand it, debug information (DWARF and PDB, probably not CV) should in theory also contain information allowing an accurate stack walk, but it doesn't look like we're currently emitting debug information to that level of accuracy. Not all debuggers/profilers understand this debug information, either (whereas walking the linked list emitted by the stack frames is trivial).

We could also start bundling debug builds of Phobos. However, these will not help in cases where the performance impact of using a full-blown debug build is not acceptable (e.g. if it'll skew the profiling results too much, or if you just want readable stack frames for your D web service in production).

How much will this cost in performance?

I've run two benchmarks; both show a slowdown of around 0.1%. Many performance-sensitive parts of Phobos (std.algorithm) are templated and thus are not affected by the Makefile switches. The GC is built with -inline, which means inlined functions do not appear in the call stack, but the call stack does not abruptly break off as it does without stack frames.

Is D the first to build its release stdlib with stack frames?

Nope. Microsoft's C release runtime is built with stack frames enabled.

Personally, I think the 0.1% is practically negligible considering the advantages. My proposal was rejected, so I'd like to hear more opinions about this. What do you think?

If you want to run some benchmarks yourself, here are the patches:

https://github.com/CyberShadow/phobos/compare/enable-stack-frames?expand=1
https://github.com/CyberShadow/druntime/compare/enable-stack-frames?expand=1

Or, using Digger:

digger build v2.065.0+CyberShadow/phobos/enable-stack-frames+CyberShadow/druntime/enable-stack-frames

Previous discussion from 2012:
http://forum.dlang.org/post/zebqmrhcigfuockcpsfa@forum.dlang.org
November 18, 2014
Vladimir Panteleev via Digitalmars-d wrote on Mon, 17 Nov 2014 at 23:14 +0000:
> […]

Short answer: NO.

Long answer:

If I need this, I will build a version of phobos and druntime myself with this enabled. Or we could distribute both versions and select one by some switch. But I am against having stack frames enabled by default.


November 18, 2014
On Tuesday, 18 November 2014 at 07:33:15 UTC, Daniel Kozak via Digitalmars-d wrote:
> Short answer: NO.
>
> Long answer:
>
> If I need this, I will build a version of phobos and druntime myself with
> this enabled. Or we could distribute both versions and select one by
> some switch. But I am against having stack frames enabled by default.

Preferences are not an argument.
November 18, 2014
On Monday, 17 November 2014 at 23:14:32 UTC, Vladimir Panteleev wrote:
> Would you trade 0.1% in performance for a better debugging experience?

Yes, of course. In most programs, a day of work can give a speed-up of 1% or more, and debugging can take many more days than that.


November 18, 2014
I'm thinking about 3 configurations:
1. debug
2. release/conservative: assert=off, bounds checks=on, stack frames
3. release/optimized: bounds checks=off, no stack frames; can also apply additional optimizations for more speed like turning assert into assume.
November 18, 2014
On 11/17/14 6:14 PM, Vladimir Panteleev wrote:
> I proposed to build Phobos and Druntime with stack frames enabled:

Sure, why not 3 versions of phobos/runtime in installation? Space is cheap.

-Steve
November 18, 2014
On Monday, 17 November 2014 at 23:14:32 UTC, Vladimir Panteleev wrote:
> I proposed to build Phobos and Druntime with stack frames enabled:

This is very much worth it in my opinion. The benefit is not just debugging: being able to profile (sometimes in production, without needing to recompile with instrumentation or debug symbols) can definitely make up for the difference in the long run. Fast, easy stack trace capture has a lot of utility, and can be applied where loading symbols to do the same would be too slow to consider. Stack traces captured like this can be taken quickly and symbolized later, offline, at your leisure.
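
A rough sketch of the "capture now, symbolize later" idea, in Python. Everything here is hypothetical: the `SYMBOLS` table, the addresses and the `symbolize` helper are invented for illustration, and a real tool would read the symbol table from the binary or its debug information instead.

```python
import bisect

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x400100, "_start"),
    (0x400200, "main"),
    (0x400300, "parse"),
    (0x400400, "allocate"),
]
STARTS = [start for start, _ in SYMBOLS]


def symbolize(address):
    """Map a raw code address to the function whose start precedes it."""
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        return "??"  # address lies below every known symbol
    start, name = SYMBOLS[index]
    return f"{name}+0x{address - start:x}"


# At capture time only the raw addresses are stored, which is cheap.
recorded_trace = [0x400345, 0x400210, 0x400100]

# Later, offline, the addresses are turned into readable names.
readable_trace = [symbolize(address) for address in recorded_trace]
print(readable_trace)
```

The expensive part (the symbol lookup) happens away from the running program, so the capture step stays fast enough to use in production.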
November 18, 2014
On Mon, 17 Nov 2014 23:14:31 +0000,
"Vladimir Panteleev" <vladimir@thecybershadow.net> wrote:

[…]

From http://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/

-------------------------------------------------------------

Preserving the base pointer

The base pointer rbp (and its predecessor ebp on x86), being a stable "anchor" to the beginning of the stack frame throughout the execution of a function, is very convenient for manual assembly coding and for debugging [5]. However, some time ago it was noticed that compiler-generated code doesn't really need it (the compiler can easily keep track of offsets from rsp), and the DWARF debugging format provides means (CFI) to access stack frames without the base pointer.

This is why some compilers started omitting the base pointer for aggressive optimizations, thus shortening the function prologue and epilogue, and providing an additional register for general-purpose use (which, recall, is quite useful on x86 with its limited set of GPRs).

gcc keeps the base pointer by default on x86, but allows the optimization with the -fomit-frame-pointer compilation flag. How recommended it is to use this flag is a debated issue - you may do some googling if this interests you.

Anyhow, one other "novelty" the AMD64 ABI introduced is making the base pointer explicitly optional, stating:

»The conventional use of %rbp as a frame pointer for the stack
 frame may be avoided by using %rsp (the stack pointer) to
 index into the stack frame. This technique saves two
 instructions in the prologue and epilogue and makes one
 additional general-purpose register (%rbp) available.«

gcc adheres to this recommendation and by default omits the frame pointer on x64, when compiling with optimizations. It gives an option to preserve it by providing the -fno-omit-frame-pointer flag. For clarity's sake, the stack frames showed above were produced without omitting the frame pointer.

-------------------------------------------------------------

I don't fully understand the issue, but omitting the frame pointer is the default on GNU amd64 systems, and stack walking is supposed to work there using DWARF debug information. So there should be no need for a stack frame pointer, right?

Are you mostly concerned with Windows then?

-- 
Marco

November 18, 2014
On Tuesday, 18 November 2014 at 13:18:10 UTC, Steven Schveighoffer wrote:
> On 11/17/14 6:14 PM, Vladimir Panteleev wrote:
>> I proposed to build Phobos and Druntime with stack frames enabled:
>
> Sure, why not 3 versions of phobos/runtime in installation? Space is cheap.

You have to consider the combinatorial explosion of our target platforms, and also the impact on the size of the distribution.

- In 2.066.0, phobos.lib is around 9MB.
- *2 for x86/x64.
- *4 for the number of supported platforms.
- *3 for the proposed number of build configurations.

That comes out to over 200 MB. Might be bigger due to debug builds of Phobos taking more space.
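
Spelled out as a quick calculation (the variable names are just labels for the figures listed above):

```python
# Figures taken from the list above; sizes are in megabytes.
phobos_lib_mb = 9      # phobos.lib in 2.066.0
architectures = 2      # x86 and x64
platforms = 4          # supported platforms
configurations = 3     # proposed build configurations

total_mb = phobos_lib_mb * architectures * platforms * configurations
print(total_mb)  # 216
```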

Compression should reduce the size of the distribution packages, but we'd need to move to something other than ZIP files if we want to take advantage of similarities between distinct files. Perhaps offer .7z as an option.

I'm not against the idea, though enabling stack frames is a much simpler change.
November 18, 2014
On Tuesday, 18 November 2014 at 16:49:55 UTC, Marco Leise wrote:
> Without fully understanding the issue, omitting the frame
> pointer on GNU amd64 systems is the default and is supposed to
> work using DWARF debug information. So there should be no need
> for a stack frame pointer, right?

First, does DMD in practice generate DWARF debug information that's accurate enough to substitute for stack frames? As far as I can tell, it doesn't.

Second, there's still the argument that not every debugger and profiler can take advantage of the DWARF debug information. It's certainly nowhere near as easy, both from a technical point of view and from a legal one, considering that (IIRC) most libraries dealing with DWARF debug information are GPL or LGPL, meaning we can't use them in the D runtime. And indeed, for printing the stack trace for an unhandled exception, Druntime currently walks the stack frames:

https://github.com/D-Programming-Language/druntime/blob/master/src/core/runtime.d#L452-L478

There is also the libc backtrace function, which the D runtime apparently used at some point:

http://www.gnu.org/software/libc/manual/html_node/Backtraces.html

It was removed in this commit:

https://github.com/D-Programming-Language/druntime/commit/9dca50fe65402fb3cdfbb689f1aca58dc835dce4

The commit message doesn't seem to mention why the use of the backtrace function was removed. But from reading the function's documentation, it looks like it works in the same way (walking the stack frame pointers, i.e. also reliant on stack frames).