March 22, 2012 avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
This is a small util I wrote in D which is like the unix
'time' command but can repeat the command N times and show
median, average, standard deviation, minimum and maximum.

As you all know, it is not proper to conclude that
a program is faster than another program by running
them just once.

It's BOOST and is in github:

https://github.com/jmcabo/avgtime

Example:

avgtime -r 10 -q ls -lR /etc

------------------------
Total time (ms): 933.742
Repetitions : 10
Median time : 90.505
Avg time : 93.3742
Std dev. : 4.66808
Minimum : 88.732
Maximum : 101.225

The -q argument pipes stderr and stdout of the program
under test to /dev/null

I put more info in the github page.

HAVE FUN!!

--jm
|
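For readers who want to see the idea in miniature, here is a rough sketch of the repeat-and-summarize approach described in the post. It is not avgtime's code: the function names `time_command` and `summarize` are made up for illustration, and the use of the population standard deviation (dividing by N) is an assumption.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repetitions):
    """Run cmd `repetitions` times and return wall-clock times in milliseconds.

    Output of the command is discarded, similar in spirit to the -q switch.
    """
    times = []
    for _ in range(repetitions):
        start = time.perf_counter()
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        times.append((time.perf_counter() - start) * 1000.0)
    return times


def summarize(times):
    """Return the summary figures listed in the post for a list of timings."""
    return {
        "total": sum(times),
        "repetitions": len(times),
        "median": statistics.median(times),
        "avg": statistics.fmean(times),
        # Assumption: population form (divide by N), not the sample form.
        "stddev": statistics.pstdev(times),
        "min": min(times),
        "max": max(times),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for the program under test: a no-op Python process.
    stats = summarize(time_command([sys.executable, "-c", "pass"], 5))
    for name, value in stats.items():
        print(f"{name:12}: {value}")
```

Replacing the stand-in command with something like `["ls", "-lR", "/etc"]` would mimic the example invocation on a Unix system.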
March 22, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Juan Manuel Cabo | On Thursday, 22 March 2012 at 00:32:31 UTC, Juan Manuel Cabo wrote:
> This is a small util I wrote in D which is like the unix
> 'time' command but can repeat the command N times and show
> median, average, standard deviation, minimum and maximum.
>
> As you all know, it is not proper to conclude that
> a program is faster than another program by running
> them just once.
>
> It's BOOST and is in github:
>
> https://github.com/jmcabo/avgtime
>
> Example:
>
>
> avgtime -r 10 -q ls -lR /etc
>
> ------------------------
> Total time (ms): 933.742
> Repetitions : 10
> Median time : 90.505
> Avg time : 93.3742
> Std dev. : 4.66808
> Minimum : 88.732
> Maximum : 101.225
>
> The -q argument pipes stderr and stdout of the program
> under test to /dev/null
>
> I put more info in the github page.
>
>
> HAVE FUN!!
>
> --jm
Awesome, I do have a tiny feature request for the next version... a commandline switch to enable automatically discarding the first run as an outlier.
/Tove
|
March 22, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Tove | On Thursday, 22 March 2012 at 01:37:19 UTC, Tove wrote:
>
> Awesome, I do have a tiny feature request for the next version... a commandline switch to enable automatically discarding the first run as an outlier.
>
> /Tove
Done, I just put it in github. (-d switch).
But maybe you should be looking at the median
to ignore outliers.
I also added a -p switch to print all the times:
./avgtime -d -q -p -r10 ls -lR /usr/share/doc
------------------------
Total time (ms): 3986.69
Repetitions : 10
Median time : 397.62
Avg time : 398.669
Std dev. : 2.95832
Minimum : 395.633
Maximum : 406.274
Sorted times :
[395.633, 396.261, 396.273, 397.413, 397.425, 397.815,
399.321, 399.719, 400.551, 406.274]
--jm
|
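The remark about the median can be checked against the sorted times printed above. The sketch below recomputes the summary from those ten values and then drops the slowest sample to see how much each statistic moves. Treating the slowest sample as the outlier is an assumption made for illustration (the post does not say which run it was), and the population form of the standard deviation is used because it appears to reproduce the printed figure.

```python
import statistics

# Sorted times (ms) copied from the post above.
times = [395.633, 396.261, 396.273, 397.413, 397.425, 397.815,
         399.321, 399.719, 400.551, 406.274]

median = statistics.median(times)
mean = statistics.fmean(times)
stddev = statistics.pstdev(times)  # population form (divide by N)

# Assumption for illustration: the slowest sample is the outlier.
trimmed = times[:-1]
median_shift = abs(median - statistics.median(trimmed))
mean_shift = abs(mean - statistics.fmean(trimmed))

print(f"median {median:.3f}  mean {mean:.3f}  stddev {stddev:.5f}")
print(f"dropping the slowest run moves the median by {median_shift:.3f} ms")
print(f"dropping the slowest run moves the mean by   {mean_shift:.3f} ms")
```

The median moves less than the mean when the slow sample is removed, which is the sense in which it already "ignores" an outlier.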
March 22, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Juan Manuel Cabo | "Juan Manuel Cabo" <juanmanuel.cabo@gmail.com> wrote in message news:zgjczrnyknqsiylhntui@forum.dlang.org...
> This is a small util I wrote in D which is like the unix 'time' command but can repeat the command N times and show median, average, standard deviation, minimum and maximum.
>
> As you all know, it is not proper to conclude that
> a program is faster than another program by running
> them just once.
>
> It's BOOST and is in github:
>
> https://github.com/jmcabo/avgtime
>
> Example:
>
>
> avgtime -r 10 -q ls -lR /etc
>
> ------------------------
> Total time (ms): 933.742
> Repetitions : 10
> Median time : 90.505
> Avg time : 93.3742
> Std dev. : 4.66808
> Minimum : 88.732
> Maximum : 101.225
>
> The -q argument pipes stderr and stdout of the program under test to /dev/null
>
> I put more info in the github page.
>
>
> HAVE FUN!!
>
Oooh, that sounds fantastic!
|
March 22, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Juan Manuel Cabo | Juan Manuel Cabo wrote:
> like the unix 'time' command
`version linux' is missing.
-manfred
|
March 22, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Juan Manuel Cabo | On 3/21/12 7:32 PM, Juan Manuel Cabo wrote:
> avgtime -r 10 -q ls -lR /etc
>
> ------------------------
> Total time (ms): 933.742
> Repetitions : 10
> Median time : 90.505
> Avg time : 93.3742
> Std dev. : 4.66808
> Minimum : 88.732
> Maximum : 101.225
Sweet! You may want to also print the mode of the distribution, which is the time of the maximum sample density. http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial but informative.)
Andrei
|
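One simple way to approximate "the time of the maximum sample density" for continuous timings is to round each sample into a bin and report the fullest bin. The sketch below does that with one decimal place; the function name `sample_mode`, the bin width, and the tie-breaking rule are all assumptions for illustration, not a description of how avgtime computes it.

```python
from collections import Counter


def sample_mode(times, decimals=1):
    """Return (bin_value, count) for the most populated rounded bin.

    Ties are broken in favour of the smaller bin value (an arbitrary choice).
    """
    bins = Counter(round(t, decimals) for t in times)
    value, count = max(bins.items(), key=lambda kv: (kv[1], -kv[0]))
    return value, count


# Hypothetical timings in milliseconds.
print(sample_mode([6.91, 6.88, 6.93, 7.02, 6.52]))
```

The result depends on the bin width: bins that are too fine leave every sample alone in its own bin, while bins that are too coarse hide the shape of the distribution.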
March 23, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Thursday, 22 March 2012 at 22:22:31 UTC, Andrei Alexandrescu wrote:
>
> Sweet! You may want to also print the mode of the distribution, which is the time of the maximum sample density. http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial but informative.)
>
>
> Andrei
Thanks for your feedback!
> Sweet! You may want to also print the mode of the distribution, [....]
Done!. Just pushed it to github. I made a histogram too!!
(man, the gaussian curve is everywhere, it never ceases to
perplex me). The histogram bins are the most significant digits (three "automatic" levels of precision, with rounding and casting tricks).
But I think the most important change is that I'm now showing
the 95% and 99% confidence intervals. (For the confidence intervals
to mean anything, please everyone, remember to control
your variables (don't defrag and benchmark :-) !!) so that apples
are still apples and don't become oranges, and make sure N>30).
More info on histogram and confidence intervals in the usage help.
avgtime -q -h -r400 ls /etc
------------------------
Total time (ms): 2751.96
Repetitions : 400
Sample mode : 6.9 (79 ocurrences)
Median time : 6.945
Avg time : 6.8799
Std dev. : 0.93927
Minimum : 3.7
Maximum : 16.36
95% conf.int. : [6.78786, 6.97195] e = 0.0920468
99% conf.int. : [6.75893, 7.00087] e = 0.12097
Histogram :
msecs: count normalized bar
  3.7:   2  #
  3.8:   4  ##
  3.9:   1
  4.0:   1
  4.2:   4  ##
  4.3:   1
  4.4:   1
  4.5:   2  #
  4.6:   3  #
  4.7:   2  #
  4.8:   3  #
  4.9:   3  #
  5.2:   1
  5.3:   2  #
  6.1:   1
  6.2:   1
  6.3:   4  ##
  6.4:   6  ###
  6.5:  14  #######
  6.6:  21  ##########
  6.7:  31  ###############
  6.8:  50  #########################
  6.9:  79  ########################################
  7.0:  48  ########################
  7.1:  29  ##############
  7.2:  22  ###########
  7.3:  13  ######
  7.4:   8  ####
  7.5:   7  ###
  7.6:  12  ######
  7.7:   6  ###
  7.8:   6  ###
  7.9:   2  #
  8.0:   3  #
  8.1:   1
  8.2:   1
  8.7:   1
  8.8:   1
  9.1:   1
 11.5:   1
 16.3:   1
--jm
|
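The printed intervals look consistent with a plain normal-approximation interval around the mean, with half-width e = z * stddev / sqrt(N). The sketch below recomputes them from the posted mean, standard deviation and N under that assumption; it is a reconstruction from the output, not a reading of avgtime's source, and `confidence_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, stddev, n, level):
    """Normal-approximation confidence interval for the mean.

    Returns (low, high, e) where e is the half-width of the interval.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    e = z * stddev / math.sqrt(n)
    return mean - e, mean + e, e


# Figures copied from the output above: Avg time, Std dev., Repetitions.
for level in (0.95, 0.99):
    low, high, e = confidence_interval(6.8799, 0.93927, 400, level)
    print(f"{level:.0%} conf.int.: [{low:.5f}, {high:.5f}] e = {e:.6f}")
```

Because the approximation relies on the central limit theorem, it only becomes reasonable for larger samples, which is presumably why the post asks for N>30.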
March 23, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Juan Manuel Cabo | On 3/22/12 11:53 PM, Juan Manuel Cabo wrote:
> On Thursday, 22 March 2012 at 22:22:31 UTC, Andrei Alexandrescu wrote:
>>
>> Sweet! You may want to also print the mode of the distribution, which
>> is the time of the maximum sample density.
>> http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial
>> but informative.)
>>
>>
>> Andrei
>
> Thanks for your feedback!
>
>> Sweet! You may want to also print the mode of the distribution, [....]
>
> Done!. Just pushed it to github. I made a histogram too!!
> (man, the gaussian curve is everywhere, it never ceases to
> perplex me).
I'm actually surprised. I'm working on benchmarking lately and the distributions I get are very concentrated around the minimum.
Andrei
|
March 23, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Juan Manuel Cabo | "Juan Manuel Cabo" <juanmanuel.cabo@gmail.com> wrote in message news:mytcmgglyntqsoybjcfz@forum.dlang.org...
> On Thursday, 22 March 2012 at 22:22:31 UTC, Andrei Alexandrescu wrote:
>>
>> Sweet! You may want to also print the mode of the distribution, which is the time of the maximum sample density. http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial but informative.)
>>
>>
>> Andrei
>
> Thanks for your feedback!
>
>> Sweet! You may want to also print the mode of the distribution, [....]
>
> Done!. Just pushed it to github. I made a histogram too!! (man, the gaussian curve is everywhere, it never ceases to perplex me). The histogram bins are the most significant digits (three "automatic" levels of precision, with rounding and casting tricks).
>
> But I think the most important change is that I'm now showing
> the 95% and 99% confidence intervals. (For the confidence intervals
> to mean anything, please everyone, remember to control
> your variables (don't defrag and benchmark :-) !!) so that apples
> are still apples and don't become oranges, and make sure N>30).
>
> More info on histogram and confidence intervals in the usage help.
>
>
> avgtime -q -h -r400 ls /etc
>
> ------------------------
> Total time (ms): 2751.96
> Repetitions : 400
> Sample mode : 6.9 (79 ocurrences)
> Median time : 6.945
> Avg time : 6.8799
> Std dev. : 0.93927
> Minimum : 3.7
> Maximum : 16.36
> 95% conf.int. : [6.78786, 6.97195] e = 0.0920468
> 99% conf.int. : [6.75893, 7.00087] e = 0.12097
> Histogram :
> msecs: count normalized bar
>   3.7:   2  #
>   3.8:   4  ##
>   3.9:   1
>   4.0:   1
>   4.2:   4  ##
>   4.3:   1
>   4.4:   1
>   4.5:   2  #
>   4.6:   3  #
>   4.7:   2  #
>   4.8:   3  #
>   4.9:   3  #
>   5.2:   1
>   5.3:   2  #
>   6.1:   1
>   6.2:   1
>   6.3:   4  ##
>   6.4:   6  ###
>   6.5:  14  #######
>   6.6:  21  ##########
>   6.7:  31  ###############
>   6.8:  50  #########################
>   6.9:  79  ########################################
>   7.0:  48  ########################
>   7.1:  29  ##############
>   7.2:  22  ###########
>   7.3:  13  ######
>   7.4:   8  ####
>   7.5:   7  ###
>   7.6:  12  ######
>   7.7:   6  ###
>   7.8:   6  ###
>   7.9:   2  #
>   8.0:   3  #
>   8.1:   1
>   8.2:   1
>   8.7:   1
>   8.8:   1
>   9.1:   1
>  11.5:   1
>  16.3:   1
>
>
Wow, that's just fantastic! Really, this should be a standard system tool. I think this guy would be proud: http://zedshaw.com/essays/programmer_stats.html
|
March 23, 2012 Re: avgtime - Small D util for your everyday benchmarking needs | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Andrei Alexandrescu wrote:
> You may want to also print the mode of the distribution, nontrivial but informative
In case of this implementation and according to the given link: trivial and noninformative, because
| For samples, if it is known that they are drawn from a symmetric
| distribution, the sample mean can be used as an estimate of the
| population mode.
and the program computes the variance as if the values of the sample follow a normal distribution, which is symmetric.
Therefore the mode of the sample is of interest only when the variance is calculated wrongly.
-manfred
|
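The argument turns on symmetry, so a toy comparison may help. The sketch below uses two made-up samples (hypothetical values, not measurements from avgtime): a symmetric one, where the mean and the mode coincide, and a right-skewed one, where a single slow run pulls the mean away from the mode.

```python
import statistics

# Hypothetical timings in milliseconds, chosen only to illustrate the point.
symmetric = [4.9, 5.0, 5.0, 5.0, 5.1]
skewed = [5.0] * 6 + [5.1] * 3 + [9.0]  # one slow run in the right tail

for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    mean = statistics.fmean(sample)
    mode = statistics.mode(sample)
    print(f"{name:9}: mean {mean:.2f}  mode {mode:.2f}  gap {abs(mean - mode):.2f}")
```

If timings really are symmetric, the mode adds little beyond the mean; if they cluster near the minimum with a tail of slow runs, the two can differ noticeably.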