September 17, 2012
Hi,

It's my pleasure to announce the beginning of the formal review of Andrei's std.benchmark. The review starts today and ends in two weeks, on the 1st of October. The review is followed by a week of voting, which ends on the 8th of October.

Quoting Andrei from his request for formal review:
"I reworked the benchmarking framework for backward compatibility,
flexibility, and convenience.

There are a few enhancement possibilities (such as tracking system/user time separately etc), but there is value in keeping things simple and convenient. Right now it really takes only one line of code and observing a simple naming convention to hook a module into the benchmarking framework."

Code: https://github.com/D-Programming-Language/phobos/pull/794
Docs: http://dlang.org/phobos-prerelease/std_benchmark.html

If std.benchmark is accepted it will likely lead to a deprecation of std.datetime's benchmark facilities.

The code is provided as a pull request and is (as usual) being tested by the auto tester on Mac OS X, FreeBSD, Linux, and Windows (see http://d.puremagic.com/test-results/pull-history.ghtml?repoid=3&pullid=794).

In your comments you can/should address the library's
* design
* implementation
* documentation
* usefulness

Provide information regarding the depth (ranging from very brief to in-depth) of your review and conclude explicitly whether std.benchmark should or shouldn't be included in Phobos.

Post all feedback to this thread. Constructive feedback is very much appreciated.

To conclude in more Andrei-like words: Happy destruction!

Jens
September 17, 2012
On 9/17/12 5:13 PM, Jens Mueller wrote:
> If std.benchmark is accepted it will likely lead to a deprecation of
> std.datetime's benchmark facilities.

One note - I moved the benchmark-related stuff from std.datetime unmodified into std.benchmark and left public aliases in place, so no code breakage is imminent. We may deprecate the aliases themselves later.

> To conclude in more Andrei-like words: Happy destruction!

Sounds about right! :o)


Andrei
September 18, 2012
I think std.benchmark is definitely a useful library addition, but to my mind it is currently a bit too limited.

* All tests are run 1000 times. Depending on the length of the test to benchmark, this can be too much. In some cases it would be good to be able to trade the number of runs against accuracy.

* For all tests, the best run is selected, but would it not be reasonable in some cases to get the average value? Maybe excluding the runs that are more than a couple std. deviations away from the mean value..

* Is there a way of specifying a test name other than the function-name when using the 'mixin(scheduleForBenchmarking)' approach to register benchmarks?

* I would also like to be able (if possible) to register two mentioned things (number of runs and result strategy) with the mixin approach (or similar).

* It seems like the baseline for subtraction from subsequent test runs is taken from a call to the test function, passing 1 to it. Shouldn't 0 be passed for this value?

If these can be addressed, I would like it added to the library!
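To illustrate the suggested strategy, here is a sketch of such an outlier-excluding average (illustrative Python, not part of std.benchmark; the function name and the cutoff `k` are hypothetical):

```python
import statistics

def filtered_mean(timings, k=2.0):
    """Mean of the timings, excluding runs more than k standard
    deviations away from the raw mean (simple outlier rejection)."""
    mean = statistics.mean(timings)
    sd = statistics.pstdev(timings)
    kept = [t for t in timings if abs(t - mean) <= k * sd]
    return statistics.mean(kept) if kept else mean

# One slow outlier (e.g. an OS hiccup) barely moves the filtered mean.
runs = [10.1, 10.3, 9.9, 10.0, 10.2, 55.0]
print(round(filtered_mean(runs), 2))  # -> 10.1
```

The plain mean of these runs would be about 17.6; rejecting the one run far from the mean recovers the typical time.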

September 18, 2012
On 9/18/12 5:07 PM, "Øivind" wrote:
> I think std.benchmark is definitely a useful library addition, but
> to my mind it is currently a bit too limited.
>
> * All tests are run 1000 times. Depending on the length of the test to
> benchmark, this can be too much. In some cases it would be good to be
> able to trade the number of runs against accuracy.

It would be a good idea to make that a configurable parameter.

> * For all tests, the best run is selected, but would it not be
> reasonable in some cases to get the average value? Maybe excluding the
> runs that are more than a couple std. deviations away from the mean value..

After extensive tests with a variety of aggregate functions, I can say firmly that taking the minimum time is by far the best when it comes to assessing the speed of a function.
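The intuition can be sketched under the assumption that measurement noise is purely additive, i.e. a run can only be slower than the true cost, never faster (illustrative Python with made-up numbers, not the std.benchmark implementation):

```python
import random

random.seed(42)
TRUE_COST = 100.0  # hypothetical true running time, in microseconds

# Scheduling, cache misses, and interrupts only ever add time.
samples = [TRUE_COST + random.expovariate(1 / 20.0) for _ in range(1000)]

best = min(samples)
avg = sum(samples) / len(samples)
print(f"min: {best:.1f}, mean: {avg:.1f}")  # min ~ 100, mean ~ 120
```

With enough runs, the minimum converges to the true cost, while the mean stays biased upward by the average amount of noise.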

> * Is there a way of specifying a test name other than the function-name
> when using the 'mixin(scheduleForBenchmarking)' approach to register
> benchmarks?

Not currently. Probably a manual registration of an individual benchmark would make sense.

> * I would also like to be able (if possible) to register two mentioned
> things (number of runs and result strategy) with the mixin approach (or
> similar).

Makes sense.

> * It seems like the baseline for subtraction from subsequent test runs
> is taken from a call to the test function, passing 1 to it. Shouldn't 0
> be passed for this value?

I'll look into that.


Thanks,

Andrei
September 19, 2012
On 2012-09-17 23:13, Jens Mueller wrote:

> Post all feedback to this thread. Constructive feedback is very much
> appreciated.
>
> To conclude in more Andrei-like words: Happy destruction!

* Why is "scheduleForBenchmarking" a string? Can't it be a template mixin?

* What's the most appropriate way of just timing a block of code? Something like this:

auto time = benchmark!({ /* some code */ })(1);

If that's the case then I suggest setting a default value of "1" for the "n" parameter.

* If I want to format the printed result differently, say in HTML, how would I do that? Should I use the "benchmark" function and iterate the BenchmarkResult array?

* BTW why doesn't benchmark return the BenchmarkResult array?

* Is this module so important to keep it as a top level module? I'm thinking something like a utility package or a time/date package. How about std.util.benchmark?

-- 
/Jacob Carlborg
September 19, 2012
On Wednesday, September 19, 2012 09:13:40 Jacob Carlborg wrote:
> * Is this module so important to keep it as a top level module? I'm thinking something like a utility package or a time/date package. How about std.util.benchmark?

util is one of the worst package names ever, because it means basically nothing. Any function could go in there.

As for a time/date package, we already have std.datetime (which will hopefully be split into a std.datetime package at some point, but we need something like DIP 15 or 16 before we can do that), and we're moving the benchmarking _out_ of there. If std.datetime were already a package, then maybe putting benchmarking in there would make some sense, but benchmarking is arguably fundamentally different from what the rest of std.datetime does. I really see no problem with benchmarking being its own thing, and std.benchmark works just fine for that.

- Jonathan M Davis
September 19, 2012
I don't see why `benchmark` takes (almost) all of its parameters as template parameters. It looks quite odd, seems unnecessary, and (if I'm not mistaken) makes certain use cases quite difficult.

For example, suppose I want to benchmark a function several times with different parameters and names, how would I do that?

foreach (i; 0..10)
{
    printBenchmark!( format("Test %d", i), { someFunc(i); } )();
}

This won't work because i isn't known at compile time, and for some use cases it can't be known at compile time.

I wouldn't mind if there was some real benefit to taking these as template arguments, but there doesn't seem to be any value at all -- it just limits usage.
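For comparison, a benchmarking helper that takes its name and callable as ordinary runtime values handles this loop case naturally (illustrative Python sketch; `print_benchmark` and `some_func` are hypothetical names, not the proposed D API):

```python
import time

def print_benchmark(name, func, n=1000):
    """Run func() n times and report the best (minimum) time."""
    best = float("inf")
    for _ in range(n):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    print(f"{name}: {best * 1e6:.2f} us")
    return best

def some_func(i):
    return sum(range(i * 100))

# Because name and func are runtime values, this loop just works:
for i in range(10):
    print_benchmark(f"Test {i}", lambda i=i: some_func(i), n=100)
```

Nothing here needs to be known at compile time, which is exactly the flexibility the loop above requires.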
September 19, 2012
On Tuesday, 18 September 2012 at 22:01:30 UTC, Andrei Alexandrescu wrote:
> After extensive tests with a variety of aggregate functions, I can say firmly that taking the minimum time is by far the best when it comes to assessing the speed of a function.

What if one tries to benchmark a nondeterministic function? In such a case one might well be interested in the best run, worst run, and the average.
September 19, 2012
On 19 September 2012 01:02, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 9/18/12 5:07 PM, "Øivind" wrote:
>
>> * For all tests, the best run is selected, but would it not be
>> reasonable in some cases to get the average value? Maybe excluding the
>> runs that are more than a couple std. deviations away from the mean value..
>
> After extensive tests with a variety of aggregate functions, I can say firmly that taking the minimum time is by far the best when it comes to assessing the speed of a function.


The fastest execution time is rarely useful to me, I'm almost always much
more interested in the slowest execution time.
In realtime software, the slowest time is often the only important factor,
everything must be designed to tolerate this possibility.
I can also imagine other situations where multiple workloads are competing
for time, the average time may be more useful in that case.

Side question:
Running a test over and over pre-populates the cache with all associated
data after the first cycle... The cache needs to be randomised between each
cycle to get realistic results.
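A rough sketch of the idea (illustrative Python; real cache control needs lower-level tools, and the buffer size here is an assumption standing in for the last-level cache size):

```python
import time

# A buffer assumed to be larger than the last-level cache
# (8 MiB is an illustrative guess, not a measured value).
EVICT = bytearray(8 * 1024 * 1024)

def flush_cache():
    """Walk the whole buffer so previously benchmarked data is
    likely evicted from the CPU caches before the next run."""
    return sum(EVICT)  # touches every cache line of the buffer

def benchmark_cold(func, n=10):
    """Time each run starting from (approximately) cold caches."""
    best = float("inf")
    for _ in range(n):
        flush_cache()
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best
```

This only approximates cold-cache behaviour; hardware prefetching and the OS page cache are outside its control.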


September 19, 2012
> The fastest execution time is rarely useful to me, I'm almost always much
> more interested in the slowest execution time.
> In realtime software, the slowest time is often the only important factor,
> everything must be designed to tolerate this possibility.
> I can also imagine other situations where multiple workloads are competing
> for time, the average time may be more useful in that case.

The problem with the slowest time is that you end up with the occasional OS hiccup or GC collection which throws the entire benchmark off. I see your point, but unless you can prevent the OS from interrupting, the time would be meaningless.
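One common compromise is to report a high percentile rather than the absolute maximum: it captures tail latency while discarding one-off hiccups (hypothetical Python sketch, not part of the proposal):

```python
def percentile(timings, p):
    """p-th percentile (0-100) by nearest rank on sorted data."""
    data = sorted(timings)
    k = min(len(data) - 1, max(0, round(p / 100 * (len(data) - 1))))
    return data[k]

runs = [10, 11, 10, 12, 11, 10, 13, 11, 10, 500]  # one OS hiccup
print(percentile(runs, 90))  # tail latency without the outlier: 13
print(max(runs))             # dominated by the hiccup: 500
```

For realtime use the worst observed case still matters, but a percentile at least separates the code's own tail from one-off OS noise.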