Jump to page: 1 2
Thread overview
[Issue 18388] std.experimental.logger slow performance
Feb 08, 2018
anonymous4
Sep 05, 2018
Robert Schadek
Sep 06, 2018
Robert Schadek
Sep 07, 2018
Robert Schadek
February 07, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #1 from Arun Chandrasekaran <aruncxy@gmail.com> ---
Platform: Ubuntu 17.10 - 64 bit.

Compiler:

LDC - the LLVM D compiler (1.7.0):
  based on DMD v2.077.1 and LLVM 5.0.1
  built with LDC - the LLVM D compiler (1.7.0)
  Default target: x86_64-unknown-linux-gnu
  Host CPU: skylake
  http://dlang.org - http://wiki.dlang.org/LDC

  Registered Targets:
    aarch64    - AArch64 (little endian)
    aarch64_be - AArch64 (big endian)
    arm        - ARM
    arm64      - ARM64 (little endian)
    armeb      - ARM (big endian)
    nvptx      - NVIDIA PTX 32-bit
    nvptx64    - NVIDIA PTX 64-bit
    ppc32      - PowerPC 32
    ppc64      - PowerPC 64
    ppc64le    - PowerPC 64 LE
    thumb      - Thumb
    thumbeb    - Thumb (big endian)
    x86        - 32-bit X86: Pentium-Pro and above
    x86-64     - 64-bit X86: EM64T and AMD64

--
February 08, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #2 from anonymous4 <dfj1esp02@sneakemail.com> ---
AFAIK, default logger implementations are simplistic and are not meant to be very fast. Fast logging requires a bit of design and is supposed to be 3rd party library, while std.experimental.logger provides a way to integrate it.

--
February 08, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #3 from Arun Chandrasekaran <aruncxy@gmail.com> ---
(In reply to anonymous4 from comment #2)
> AFAIK, default logger implementations are simplistic and are not meant to be very fast. Fast logging requires a bit of design and is supposed to be 3rd party library, while std.experimental.logger provides a way to integrate it.

Thanks, but that's weird. If that's the case, why would anyone want to use std.experimental.logger after all? Everyone will default to using their own logger implementations not derived out of std.experimental.logger, knowing that it is *slow*. I'm pretty sure that's not the goal of Phobos. May be something was overlooked in the design or implementation that causes this slowness?

--
September 05, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #4 from Arun Chandrasekaran <aruncxy@gmail.com> ---
On the same box with ldc 1.12.0-beta1, I see that the performance has improved by around 20%.

04-09-2018 21:06:15 vaalaham ~/code/d/std-log-benchmark
$ time ./std-log-benchmark 8 1000000

real    0m3.958s
user    0m3.457s
sys     0m2.840s
04-09-2018 21:08:47 vaalaham ~/code/d/std-log-benchmark
$

--
September 05, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

Robert Schadek <rburners@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rburners@gmail.com

--- Comment #5 from Robert Schadek <rburners@gmail.com> ---
The performance problem is not with std.experimental.logger, it is with the benchmark.

The amount thread contention is killing all performance.
The benchmark becomes single threaded around three times for each line printed.
2x atomic ops on msgCounter, 1x writing the line to file.

--
September 06, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #6 from Arun Chandrasekaran <aruncxy@gmail.com> ---
> The benchmark becomes single threaded around three times for each line printed.

How is it a benchmark issue? I agree that three API for a single log line is expensive. Isn't that a problem with std.experimental.logger?

I also see that a flush after every line is causing the slowdown as well. Probably do a flush only when the message level is fatal/error/warn and also during termination?

--
September 06, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #7 from Robert Schadek <rburners@gmail.com> ---
I didn't say anything about api calls. That is not the problem.
The problem with the benchmark is that the threads share memory. That means
each write will, given you tested on a multicore cpu, invalidates some of the
CPUs caches. That means the program has to go to RAM, and that is slow, really
slow.
I bet you a drink at next years DConf, that if you check perf you will find
that the CPU is waiting for data from RAM most of the execution time.

You need to flush after each log call. If there is a log buffer and the program crashes, you might not see the log line that indicates the problem.

--
September 06, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #8 from Arun Chandrasekaran <aruncxy@gmail.com> ---
spdlog (C++) does exactly the same thing and it is much faster as I shared the stats earlier. It is apples to apples, IMO. So the contention is same in both C++ and D versions.

In fact, D does much better with atomic operations.

Disabling the log statements with 8 threads and 100000000 iterations, C++ version takes

real    0m3.167s
user    0m25.215s
sys     0m0.004s


whereas D version takes

real    0m2.527s
user    0m20.124s
sys     0m0.000s

You can compare the asm at https://godbolt.org/z/JjfTyw

--
September 06, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #9 from Arun Chandrasekaran <aruncxy@gmail.com> ---
> You need to flush after each log call. If there is a log buffer and the program crashes, you might not see the log line that indicates the problem

That's right, but may be worth doing on a configuration basis (async mode)? Also I'm not sure how formattedWrite is implemented. As I see it writes piece by piece into the LockingWriter.

The log string could probably be constructed first, followed by a single write into the sink. This will avoid multiple chunked writes.

--
September 07, 2018
https://issues.dlang.org/show_bug.cgi?id=18388

--- Comment #10 from Arun Chandrasekaran <aruncxy@gmail.com> ---
s/_logger.tracef/printf/


06-09-2018 18:25:51 vaalaham ~/code/d/std-log-benchmark
$ time ./std-log-benchmark 8 1000000 > /dev/null

real    0m0.495s
user    0m1.254s
sys     0m2.100s
06-09-2018 19:02:22 vaalaham ~/code/d/std-log-benchmark
$

So the slowdown is definitely related to phobos.

--
« First   ‹ Prev
1 2