March 23, 2012
On 3/23/12 5:51 AM, Don Clugston wrote:
>> No, it's easy. Student t is in std.mathspecial.
>
> Aargh, I didn't get around to copying it in. But this should do it.
[snip]

Shouldn't put this stuff in std.numeric, or create a std.stat module? I think also some functions for t-test would be useful.

Andrei
March 23, 2012
On Friday, 23 March 2012 at 15:33:18 UTC, Andrei Alexandrescu wrote:
> On 3/23/12 3:02 AM, Juan Manuel Cabo wrote:
>> On Friday, 23 March 2012 at 05:16:20 UTC, Andrei Alexandrescu wrote:
>> [.....]
>>>> (man, the gaussian curve is everywhere, it never ceases to
>>>> perplex me).
>>>
>>> I'm actually surprised. I'm working on benchmarking lately and the
>>> distributions I get are very concentrated around the minimum.
>>>
>>> Andrei
>>
>>
>> Well, the shape of the curve depends a lot on
>> how the random noise gets inside the measurement.
> [snip]
>
> Hmm, well the way I see it, the observed measurements have the following composition:
>
> X = T + Q + N
>
> where T > 0 (a constant) is the "real" time taken by the processing, Q > 0 is the quantization noise caused by the limited resolution of the clock (can be considered 0 if the resolution is much smaller than the actual time), and N is noise caused by a variety of factors (other processes, throttling, interrupts, networking, memory hierarchy effects, and many more). The challenge is estimating T given a bunch of X samples.
>
> N can probably be approximated by a Gaussian, although for short timings I noticed it's more like bursts that just cause outliers. But note that N is always positive (therefore not 100% Gaussian), i.e. there's no way to insert noise that makes the code seem artificially faster. It's all additive.
>
> Taking the mode of the distribution will estimate T + mode(N), which is informative because after all there's no way to eliminate noise. However, if the focus is improving T, we want an estimate as close to T as possible. In the limit, taking the minimum over infinitely many measurements of X would yield T.
>
>
> Andrei
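
Andrei's estimator can be sketched numerically. A minimal
illustration in Python (not D; the exponential noise model and all
parameters are made up for illustration, not taken from the post):

```python
import random

random.seed(42)

T = 10.0  # the "real", constant processing time we want to estimate

def measure():
    # N: additive noise, always positive (exponential, purely for
    # illustration); Q assumed 0 (clock resolution much finer than T)
    return T + random.expovariate(1.0)

few = [measure() for _ in range(10)]
many = [measure() for _ in range(100_000)]

# Every sample overestimates T, and the minimum over many samples
# approaches T from above:
print(min(few) - T)    # small positive offset
print(min(many) - T)   # much closer to 0
```

The point being that min(X) is a biased-high but consistent estimate
of T when the noise is purely additive.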

In general, I agree with your reasoning. And I appreciate you
taking the time to put it so eloquently!!

But I think that treating T as a constant, and preferring
the minimum, misses something. That might work
very well for benchmarking mostly CPU-bound processes,
but all those other things that you consider noise
(disk I/O, network, memory hierarchy, etc.) are part
of what makes one algorithm or program faster
than another, and for some applications I would
consider them part of T.

Consider the case depicted in this wonderful (ranty) article
that was posted elsewhere in this thread:
    http://zedshaw.com/essays/programmer_stats.html
In one part of the article, the author talks about a
system that worked fast most of the time, but would
sometimes halt for a good one or two minutes.

The minimum time for such a system might be a few ms, but
the standard deviation would be big. This properly shifts
the average time away from the minimum.

If programA does the same task as programB with less I/O,
or with a better memory layout, etc., its average will be
better, and maybe its timings won't be so spread out. But
the minimum will be the same.
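
To illustrate with made-up numbers (a hypothetical programA and
programB, both with the same best case, but B occasionally stalling):

```python
import statistics

# Hypothetical timings in ms; both share the same minimum,
# but programB sometimes stalls (I/O, paging, ...).
programA = [5.0, 5.1, 5.2, 5.1, 5.0, 5.3]
programB = [5.0, 5.1, 90.0, 5.2, 120.0, 5.1]

print(min(programA), min(programB))   # same minimum: 5.0
print(statistics.mean(programA))      # ~5.12 ms
print(statistics.mean(programB))      # ~38.4 ms -- the stalls show up
print(statistics.stdev(programB) > statistics.stdev(programA))
```

The minimum can't tell the two programs apart; the mean and the
standard deviation can.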

So, in the end, I'm just happy that I could share this
little avgtime with you all, and as usual there is
no one answer that fits all. For some applications, the
minimum will be enough. For others, it's essential to look
at how spread out the sample is.


On the symmetry/asymmetry of the distribution topic:
I realize, as you said, that T never gets faster than
a certain point.
But, depending on the nature of the program under test,
good utilization of disk I/O, network, memory,
motherboard buses, etc. is what you want inside the
test too, and those come with Gaussian-like noise
which might or might not dominate over T.

A program that avoids that other big source of noise is a
better program (all else being equal), so I would tend to
consider the whole.

Thanks for the eloquence/insightfulness of your post!
I'll consider adding chi-squared confidence intervals
in the future. (And I'm open to more info, or to
suggestions if another distribution might fit better.)
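
For what it's worth, a Student-t confidence interval for the mean
(distinct from the chi-squared interval for the variance) can be
sketched like this; the samples and the hard-coded critical value
t(0.975, df=9) = 2.262 are illustrative assumptions, not output of
avgtime:

```python
import math
import statistics

# Hypothetical timing samples in ms
samples = [5.2, 5.0, 5.3, 5.1, 5.4, 5.0, 5.2, 5.6, 5.1, 5.3]

n = len(samples)
mean = statistics.mean(samples)
s = statistics.stdev(samples)   # sample standard deviation (n-1 divisor)
t_crit = 2.262                  # two-sided 95% Student t, 9 degrees of freedom

half_width = t_crit * s / math.sqrt(n)
print(f"95% CI for the mean: {mean - half_width:.3f} .. {mean + half_width:.3f}")
```

In a real tool the critical value would come from the inverse Student-t
function (e.g. the one Don posted from std.mathspecial) instead of a
hard-coded table entry.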

--jm



March 23, 2012
On Friday, 23 March 2012 at 10:51:37 UTC, Don Clugston wrote:
>>
>> No, it's easy. Student t is in std.mathspecial.
>
> Aargh, I didn't get around to copying it in. But this should do it.
>
> /** Inverse of Student's t distribution
>  *
>  [.....]

Great!!! Thank you soo much Don!!!
--jm


March 23, 2012
On Friday, 23 March 2012 at 05:26:54 UTC, Nick Sabalausky wrote:
>
> Wow, that's just fantastic! Really, this should be a standard system tool.
>
> I think this guy would be proud:
> http://zedshaw.com/essays/programmer_stats.html

Thanks for the good vibes!!!!!

Hahahhah, that article is so ffffing hilarious!
I love the Maddox tone.

--jm


March 23, 2012
Andrei Alexandrescu wrote:

> In the limit, taking the minimum over infinitely many measurements of X would yield T.

True, if the theoretical variance of the distribution of T is close to zero. But horribly wrong if T depends on an algorithm that is fast only under amortized analysis, because the worst-case scenario will be hidden.
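
Manfred's point can be illustrated with a toy cost model (Python,
made-up unit costs) of an amortized-O(1) dynamic-array append: the
minimum per-operation cost never sees the occasional expensive
reallocation.

```python
# Toy cost model of appending 1000 elements to a doubling dynamic array:
# a normal append costs 1 unit; when the array is full, the append also
# copies all existing elements (cost = current size).
costs = []
capacity, size = 1, 0
for _ in range(1000):
    if size == capacity:
        costs.append(size)   # reallocation + copy of all elements
        capacity *= 2
    else:
        costs.append(1)      # cheap append
    size += 1

print(min(costs))                # 1   -- the worst case is invisible
print(max(costs))                # 512 -- the occasional expensive copy
print(sum(costs) / len(costs))   # ~2  -- amortized cost per operation
```

Taking the minimum reports 1 and hides the 512-unit spikes entirely,
while the mean reflects the amortized cost.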

-manfred

March 24, 2012
"Juan Manuel Cabo" <juanmanuel.cabo@gmail.com> wrote in message news:bqrlhcggehbrzyuhzjuy@forum.dlang.org...
> On Friday, 23 March 2012 at 06:51:48 UTC, James Miller wrote:
>
>> Dude, this is awesome. I tend to just use time, but if I was doing anything more complicated, I'd use this. I would suggest changing the name while you still can. avgtime is not that informative a name given that it now does more than just "Average" times.
>>
>> --
>> James Miller
>
>
>> Dude, this is awesome.
>
> Thanks!! I appreciate your feedback!
>
>> I would suggest changing the name while you still can.
>
> Suggestions welcome!!
>

"timestats"?


March 24, 2012
Am Fri, 23 Mar 2012 09:02:01 +0100
schrieb "Juan Manuel Cabo" <juanmanuel.cabo@gmail.com>:

> On Friday, 23 March 2012 at 05:16:20 UTC, Andrei Alexandrescu
> wrote:
> [.....]
> >> (man, the gaussian curve is everywhere, it never ceases to
> >> perplex me).
> >
> > I'm actually surprised. I'm working on benchmarking lately and the distributions I get are very concentrated around the minimum.
> >
> > Andrei
> 
> 
> Well, the shape of the curve depends a lot on
> how the random noise gets inside the measurement.
> 
> I like  'ls -lR'  because the randomness comes
> from everywhere, and it's quite bell-shaped.
> There is a lot of I/O mess (even if
> I/O is all cached, there are lots of opportunities
> for kernel mutexes to mess things up, I guess).
> 
> When testing "/bin/sleep 0.5", the histogram will be
> pretty boring.
> 
> And I guess that when testing something that's only
> CPU bound and doesn't make too many syscalls,
> the shape is more concentrated around a few values.
> 
> 
> On the other hand, I'm getting some weird bimodal
> (two peaks) curves sometimes, like the one I put on
> the README.md.
> It's definitely because of my laptop's CPU throttling,
> because it went away when I disabled it (for the curious,
> on Ubuntu 64-bit, here is a way to disable
> throttling (WARNING: the CPU might get hot until you undo this or reboot):
> 
> echo 1600000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
> 
> echo 1600000 > /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq
> 
> (yes my cpu is 1.6GHz, but it rocks).
> 
> 
> --jm

On Gnome I use the cpufreq-applet to change the frequency governor from 'ondemand' to 'performance'. That's better than manually setting a minimum frequency. (Alternatively you can set it through the /sys interface.) - Unless that governor is not compiled into the kernel.
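
A sketch of the /sys route (the paths assume the Linux cpufreq sysfs
interface; the available governor names depend on the kernel build,
and writing them requires root):

```shell
# Set every core's governor to "performance" (run as root):
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > "$g"
done

# Verify what the kernel actually accepted:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
```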

-- 
Marco

March 24, 2012
On 3/23/12 5:42 PM, Manfred Nowak wrote:
> Andrei Alexandrescu wrote:
>
>> In the limit, taking the minimum over infinitely many
>> measurements of X would yield T.
>
> True, if the theoretical variance of the distribution of T is close to
> zero. But horribly wrong if T depends on an algorithm that is fast
> only under amortized analysis, because the worst-case scenario will be
> hidden.

Wait, doesn't a benchmark always measure an algorithm with the same input? For collecting a chart of various inputs, there should be various benchmarks.

Andrei

March 25, 2012
Andrei Alexandrescu wrote:

> Wait, doesn't a benchmark always measure an algorithm with the same input?

The fact that you formulate it as a question indicates that you are unsure about the right answer---me too, but

1)
surely one can define a benchmark to have this property. But if one
uses this definition, the input used would belong to the benchmark's
description. I have never seen a description of a benchmark that
includes the input, but because I am more interested in theory I may
simply have missed such descriptions.

2)
if a probabilistic algorithm is used, the meaning of input becomes
unclear, because the state of the machine influences T.

3)
if a heuristic is used by the benchmarked algorithm, then a made-up
family of benchmarks can "prove" T = O(n*n) for quicksort.
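
Point 3) can be made concrete. A sketch in Python of a naive
first-element-pivot quicksort (my own illustration, not a specific
implementation from the thread), where a made-up family of
already-sorted inputs drives it into its quadratic worst case:

```python
def partition_comparisons(a):
    """Naive quicksort (first element as pivot); counts pivot comparisons."""
    if len(a) <= 1:
        return 0
    pivot = a[0]
    less = [x for x in a[1:] if x < pivot]
    more = [x for x in a[1:] if x >= pivot]
    return (len(a) - 1) + partition_comparisons(less) + partition_comparisons(more)

# Already-sorted inputs: the pivot is always the minimum, so every
# partition is maximally unbalanced and the count grows as n*(n-1)/2.
for n in (100, 200, 400):
    print(n, partition_comparisons(list(range(n))))
```

Benchmarking only on such a family would "show" quadratic behavior for
an algorithm whose average case is O(n log n).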

-manfred
March 27, 2012
On 3/23/12 4:11 PM, Juan Manuel Cabo wrote:
> On Friday, 23 March 2012 at 06:51:48 UTC, James Miller wrote:
>
>> Dude, this is awesome. I tend to just use time, but if I was doing
>> anything more complicated, I'd use this. I would suggest changing the
>> name while you still can. avgtime is not that informative a name given
>> that it now does more than just "Average" times.
>>
>> --
>> James Miller
>
>
>> Dude, this is awesome.
>
> Thanks!! I appreciate your feedback!
>
>> I would suggest changing the name while you still can.
>
> Suggestions welcome!!
>
> --jm
>

give_me_d_average