September 17, 2012
Review of Andrei's std.benchmark
Hi,

it's my pleasure to announce the beginning of the formal review of Andrei's
std.benchmark. The review starts today and ends in two weeks, on the 1st of
October. It is followed by a week of voting, which ends on the 8th of
October.

Quoting Andrei from his request for formal review:
"I reworked the benchmarking framework for backward compatibility,
flexibility, and convenience.

There are a few enhancement possibilities (such as tracking
system/user time separately etc), but there is value in keeping things
simple and convenient. Right now it really takes only one line of code
and observing a simple naming convention to hook a module into the
benchmarking framework."

Code: https://github.com/D-Programming-Language/phobos/pull/794
Docs: http://dlang.org/phobos-prerelease/std_benchmark.html

If std.benchmark is accepted it will likely lead to a deprecation of
std.datetime's benchmark facilities.

The code is provided as a pull request and is (as usual) being tested by the
auto tester on Mac OS X, FreeBSD, Linux, and Windows (see
http://d.puremagic.com/test-results/pull-history.ghtml?repoid=3&pullid=794).

In your comments you can/should address the
* design
* implementation
* documentation
* usefulness
of the library.

Provide information regarding the depth (ranging from very brief to
in-depth) of your review and conclude explicitly whether std.benchmark
should or shouldn't be included in Phobos.

Post all feedback to this thread. Constructive feedback is very much
appreciated.

To conclude in more Andrei-like words: Happy destruction!

Jens
September 17, 2012
Re: Review of Andrei's std.benchmark
On 9/17/12 5:13 PM, Jens Mueller wrote:
> If std.benchmark is accepted it will likely lead to a deprecation of
> std.datetime's benchmark facilities.

One note - I moved the benchmark-related stuff from std.datetime 
unmodified into std.benchmark and left public aliases in place, so no 
code breakage is imminent. We may deprecate the aliases themselves later.

> To conclude in more Andrei like words: Happy destruction!

Sounds about right! :o)


Andrei
September 18, 2012
Re: Review of Andrei's std.benchmark
I think std.benchmark is definitely a useful library 
addition, but in my mind it is currently a bit too limited.

* All tests are run 1000 times. Depending on the length of the 
test to benchmark, this can be too much. In some cases it would 
be good to be able to trade the number of runs against accuracy.

* For all tests, the best run is selected, but wouldn't it be 
reasonable in some cases to report the average value instead? Perhaps 
excluding the runs that are more than a couple of standard deviations 
away from the mean (see the sketch at the end of this post).

* Is there a way of specifying a test name other than the 
function-name when using the 'mixin(scheduleForBenchmarking)' 
approach to register benchmarks?

* I would also like to be able (if possible) to register the two 
things mentioned above (number of runs and result strategy) with the 
mixin approach (or something similar).

* It seems like the baseline for subtraction from subsequent test 
runs is taken from a call to the test function, passing 1 to it. 
Shouldn't 0 be passed for this value?

If these can be addressed, I would like it added to the library!
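
To illustrate the averaging idea from the second point above, roughly
(plain D, not tied to std.benchmark's actual API):

// Average of the runs, excluding outliers more than `cutoff` standard
// deviations from the mean. Falls back to the plain mean if everything
// gets excluded.
import std.algorithm : filter, map, sum;
import std.array : array;
import std.math : abs, sqrt;

double robustMean(double[] timings, double cutoff = 2.0)
{
    assert(timings.length > 0);
    immutable mean = timings.sum / timings.length;
    immutable dev  = sqrt(timings.map!(t => (t - mean) * (t - mean)).sum
                          / timings.length);

    // Keep only the runs within `cutoff` standard deviations of the mean.
    auto kept = timings.filter!(t => abs(t - mean) <= cutoff * dev).array;
    return kept.length ? kept.sum / kept.length : mean;
}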
September 18, 2012
Re: Review of Andrei's std.benchmark
On 9/18/12 5:07 PM, "Øivind" wrote:
> I think std.benchmark is definitely a useful library addition, but
> in my mind it is currently a bit too limited.
>
> * All tests are run 1000 times. Depending on the length of the test to
> benchmark, this can be too much. In some cases it would be good to be
> able to trade the number of runs against accuracy.

It would be a good idea to make that a configurable parameter.

> * For all tests, the best run is selected, but would it not be
> reasonable in some cases to get the average value? Maybe excluding the
> runs that are more than a couple std. deviations away from the mean value..

After extensive tests with a variety of aggregate functions, I can say 
firmly that taking the minimum time is by far the best when it comes to 
assessing the speed of a function.

> * Is there a way of specifying a test name other than the function-name
> when using the 'mixin(scheduleForBenchmarking)' approach to register
> benchmarks?

Not currently. Probably a manual registration of an individual benchmark 
would make sense.

> * I would also like to be able (if possible) to register two mentioned
> things (number of runs and result strategy) with the mixin approach (or
> similar).

Makes sense.

> * It seems like the baseline for subtraction from subsequent test runs
> is taken from a call to the test function, passing 1 to it. Shouldn't 0
> be passed for this value?

I'll look into that.


Thanks,

Andrei
September 19, 2012
Re: Review of Andrei's std.benchmark
On 2012-09-17 23:13, Jens Mueller wrote:

> Post all feedback to this thread. Constructive feedback is very much
> appreciated.
>
> To conclude in more Andrei like words: Happy destruction!

* Why is "scheduleForBenchmarking" a string? Can't it be a template mixin?

* What's the most appropriate way of just timing a block of code? 
Something like this:

auto time = benchmark!({ /* some code */ })(1);

If that's the case then I suggest setting a default value of "1" for the 
"n" parameter.

* If I want to format the printed result differently, say as HTML, how 
would I do that? Should I use the "benchmark" function and iterate over 
the BenchmarkResult array? (Roughly what the sketch at the end of this 
post shows.)

* BTW why doesn't benchmark return the BenchmarkResult array?

* Is this module so important to keep it as a top level module? I'm 
thinking something like a utility package or a time/date package. How 
about std.util.benchmark?
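
Regarding the HTML question above, something along these lines is what I
mean; the BenchmarkResult field names (name, perIteration) are guesses on
my part, substitute whatever the struct actually exposes:

// Sketch: turn a BenchmarkResult array into an HTML table.
import std.benchmark : BenchmarkResult;   // the module under review
import std.stdio : writefln, writeln;

void printAsHtml(BenchmarkResult[] results)
{
    writeln("<table>");
    foreach (r; results)
        writefln("  <tr><td>%s</td><td>%s</td></tr>", r.name, r.perIteration);
    writeln("</table>");
}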

-- 
/Jacob Carlborg
September 19, 2012
Re: Review of Andrei's std.benchmark
On Wednesday, September 19, 2012 09:13:40 Jacob Carlborg wrote:
> * Is this module so important to keep it as a top level module? I'm
> thinking something like a utility package or a time/date package. How
> about std.util.benchmark?

util is one of the worst package names ever, because it means basically 
nothing. Any function could go in there.

As for a time/date package, we already have std.datetime (which will hopefully 
be split into the package std.datetime at some point, but we need something 
like DIP 15 or 16 before we can do that), and we're moving the benchmarking 
_out_ of there. If std.datetime were already a package, then maybe putting it 
in there would make some sense, but benchmarking is arguably fundamentally 
different from what the rest of std.datetime does. I really see no problem with 
benchmarking being its own thing, and std.benchmark works just fine for that.

- Jonathan M Davis
September 19, 2012
Re: Review of Andrei's std.benchmark
I don't see why `benchmark` takes (almost) all of its parameters 
as template parameters. It looks quite odd, seems unnecessary, 
and (if I'm not mistaken) makes certain use cases quite difficult.

For example, suppose I want to benchmark a function several times 
with different parameters and names. How would I do that?

foreach (i; 0..10)
{
    printBenchmark!( format("Test %d", i), { someFunc(i); } )();
}

This won't work because i isn't known at compile time, and for 
some use cases it can't be known at compile time.

I wouldn't mind if there was some real benefit to taking these as 
template arguments, but there doesn't seem to be any value at all 
-- it just limits usage.
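
For comparison, a hypothetical runtime-parameter variant (not part of the
current proposal, names invented here) would make the loop above just work:

// Hypothetical runtime-parameter variant, only to show what taking the
// name and the delegate at run time buys us.
import std.datetime : StopWatch;   // std.datetime.stopwatch in later releases
import std.stdio : writefln;
import std.string : format;

void printBenchmark(string name, void delegate() fun)
{
    StopWatch sw;
    sw.start();
    fun();
    sw.stop();
    writefln("%s: %s usecs", name, sw.peek().usecs);
}

void someFunc(int i) { /* the code under test */ }

void main()
{
    foreach (i; 0 .. 10)
        printBenchmark(format("Test %d", i), { someFunc(i); });
}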
September 19, 2012
Re: Review of Andrei's std.benchmark
On Tuesday, 18 September 2012 at 22:01:30 UTC, Andrei 
Alexandrescu wrote:
> After extensive tests with a variety of aggregate functions, I 
> can say firmly that taking the minimum time is by far the best 
> when it comes to assessing the speed of a function.

What if one tries to benchmark a nondeterministic function? In 
such a case one might well be interested in the best run, worst 
run, and the average.
September 19, 2012
Re: Review of Andrei's std.benchmark
On 19 September 2012 01:02, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 9/18/12 5:07 PM, "Øivind" wrote:
>
>> * For all tests, the best run is selected, but would it not be
>> reasonable in some cases to get the average value? Maybe excluding the
>> runs that are more than a couple std. deviations away from the mean
>> value..
>
> After extensive tests with a variety of aggregate functions, I can say
> firmly that taking the minimum time is by far the best when it comes to
> assessing the speed of a function.


The fastest execution time is rarely useful to me; I'm almost always much
more interested in the slowest execution time.
In realtime software, the slowest time is often the only important factor;
everything must be designed to tolerate this possibility.
I can also imagine other situations where multiple workloads are competing
for time; the average time may be more useful in that case.

Side question:
Running a test over and over pre-populates the cache with all associated
data after the first cycle... The cache needs to be randomised between each
cycle to get realistic results.
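
What I usually do between cycles is walk a buffer bigger than the
last-level cache so the next run starts more or less cold. A crude sketch
(the 32 MB figure is a guess, size it to the target hardware):

// Crude cache eviction between benchmark cycles.
enum flushBytes = 32 * 1024 * 1024;
__gshared ubyte[] flushBuffer;

void evictCaches()
{
    if (flushBuffer.length == 0)
        flushBuffer = new ubyte[flushBytes];
    for (size_t i = 0; i < flushBuffer.length; i += 64)  // one write per cache line
        flushBuffer[i] += 1;
}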
September 19, 2012
Re: Review of Andrei's std.benchmark
> The fastest execution time is rarely useful to me; I'm almost always much
> more interested in the slowest execution time.
> In realtime software, the slowest time is often the only important factor;
> everything must be designed to tolerate this possibility.
> I can also imagine other situations where multiple workloads are competing
> for time; the average time may be more useful in that case.

The problem with the slowest time is that you end up with the occasional 
OS hiccup or GC collection, which throws the entire benchmark off. 
I see your point, but unless you can prevent the OS from 
interrupting, the time would be meaningless.
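
At least the GC part can be taken out of the measured region. A rough
sketch using druntime's core.memory:

// Disable collections while measuring. Note the runtime may still
// collect if it actually runs out of memory in the meantime.
import core.memory : GC;

void measureWithoutGC(void delegate() run)
{
    GC.collect();              // start from a clean slate
    GC.disable();              // no collections while we measure
    scope (exit) GC.enable();
    run();
}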