avgtime - Small D util for your everyday benchmarking needs
March 22, 2012
This is a small util I wrote in D which is like the unix
'time' command but can repeat the command N times and show
median, average, standard deviation, minimum and maximum.

As you all know, it is not proper to conclude that
a program is faster than another program by running
them just once.

It's Boost-licensed and is on GitHub:

    https://github.com/jmcabo/avgtime

Example:


    avgtime -r 10 -q ls -lR /etc

    ------------------------
    Total time (ms): 933.742
    Repetitions    : 10
    Median time    : 90.505
    Avg time       : 93.3742
    Std dev.       : 4.66808
    Minimum        : 88.732
    Maximum        : 101.225

The -q argument redirects the stderr and stdout of the program
under test to /dev/null.
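
For readers curious how a tool like this works, here is a minimal Python sketch of the idea. It is not avgtime's D source: the names `time_command` and `summarize` are made up for illustration, and the choice of the population standard deviation is an assumption. `quiet=True` mimics -q by sending the child's stdout and stderr to the null device.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repetitions=10, quiet=True):
    """Run argv `repetitions` times; return wall-clock times in milliseconds."""
    # DEVNULL discards the child's output, like -q; None inherits our streams.
    sink = subprocess.DEVNULL if quiet else None
    times = []
    for _ in range(repetitions):
        start = time.perf_counter()
        subprocess.run(argv, stdout=sink, stderr=sink, check=False)
        times.append((time.perf_counter() - start) * 1000.0)
    return times


def summarize(times):
    """Summary statistics in the spirit of the output shown above."""
    return {
        "total": sum(times),
        "repetitions": len(times),
        "median": statistics.median(times),
        "avg": statistics.fmean(times),
        # Assumption: population form (divide by N), not the N-1 sample form.
        "stddev": statistics.pstdev(times),
        "min": min(times),
        "max": max(times),
    }


if __name__ == "__main__":
    # Times a trivial child process; substitute any command line you like.
    stats = summarize(time_command([sys.executable, "-c", "pass"], repetitions=5))
    for name, value in stats.items():
        print(f"{name:12}: {value}")
```

Wall-clock timing of a whole process includes process startup and whatever else the machine is doing, which is exactly why repeating the run and looking at the spread is useful.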

I put more info in the github page.


HAVE FUN!!

--jm


March 22, 2012
On Thursday, 22 March 2012 at 00:32:31 UTC, Juan Manuel Cabo wrote:
> This is a small util I wrote in D which is like the unix
> 'time' command but can repeat the command N times and show
> median, average, standard deviation, minimum and maximum.
>
> As you all know, it is not proper to conclude that
> a program is faster than another program by running
> them just once.
>
> It's BOOST and is in github:
>
>     https://github.com/jmcabo/avgtime
>
> Example:
>
>
>     avgtime -r 10 -q ls -lR /etc
>
>     ------------------------
>     Total time (ms): 933.742
>     Repetitions    : 10
>     Median time    : 90.505
>     Avg time       : 93.3742
>     Std dev.       : 4.66808
>     Minimum        : 88.732
>     Maximum        : 101.225
>
> The -q argument pipes stderr and stdout of the program
> under test to /dev/null
>
> I put more info in the github page.
>
>
> HAVE FUN!!
>
> --jm

Awesome, I do have a tiny feature request for the next version... a commandline switch to enable automatically discarding the first run as an outlier.

/Tove

March 22, 2012
On Thursday, 22 March 2012 at 01:37:19 UTC, Tove wrote:
>
> Awesome, I do have a tiny feature request for the next version... a commandline switch to enable automatically discarding the first run as an outlier.
>
> /Tove

Done, I just pushed it to GitHub (-d switch).
But to ignore outliers you may be better off
looking at the median.


I also added a -p switch to print all the times:

    ./avgtime -d -q -p -r10  ls -lR /usr/share/doc

    ------------------------
    Total time (ms): 3986.69
    Repetitions    : 10
    Median time    : 397.62
    Avg time       : 398.669
    Std dev.       : 2.95832
    Minimum        : 395.633
    Maximum        : 406.274
    Sorted times   :
        [395.633, 396.261, 396.273, 397.413, 397.425, 397.815,
        399.321, 399.719, 400.551, 406.274]
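
As a sanity check, the summary lines can be recomputed from the sorted times above. The Python sketch below is independent of avgtime's D source. One thing it shows: the posted Std dev. matches the population formula (divide by N), while the sample formula (divide by N-1) would give roughly 3.118 for the same data.

```python
import statistics

# Sorted times in milliseconds, copied from the output above.
times = [395.633, 396.261, 396.273, 397.413, 397.425, 397.815,
         399.321, 399.719, 400.551, 406.274]

total = sum(times)
median = statistics.median(times)       # mean of the two middle values for even N
avg = statistics.fmean(times)
pop_std = statistics.pstdev(times)      # divides by N
sample_std = statistics.stdev(times)    # divides by N - 1

print(f"Total time (ms): {total:.2f}")
print(f"Median time    : {median:.3f}")
print(f"Avg time       : {avg:.3f}")
print(f"Std dev. (N)   : {pop_std:.5f}")
print(f"Std dev. (N-1) : {sample_std:.5f}")
```

Note how the single 406.274 ms run pulls the average about 1 ms above the median, which is the point about the median being the more outlier-resistant figure.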


--jm


March 22, 2012
"Juan Manuel Cabo" <juanmanuel.cabo@gmail.com> wrote in message news:zgjczrnyknqsiylhntui@forum.dlang.org...
> This is a small util I wrote in D which is like the unix 'time' command but can repeat the command N times and show median, average, standard deviation, minimum and maximum.
>
> As you all know, it is not proper to conclude that
> a program is faster than another program by running
> them just once.
>
> It's BOOST and is in github:
>
>     https://github.com/jmcabo/avgtime
>
> Example:
>
>
>     avgtime -r 10 -q ls -lR /etc
>
>     ------------------------
>     Total time (ms): 933.742
>     Repetitions    : 10
>     Median time    : 90.505
>     Avg time       : 93.3742
>     Std dev.       : 4.66808
>     Minimum        : 88.732
>     Maximum        : 101.225
>
> The -q argument pipes stderr and stdout of the program under test to /dev/null
>
> I put more info in the github page.
>
>
> HAVE FUN!!
>

Oooh, that sounds fantastic!


March 22, 2012
Juan Manuel Cabo wrote:

> like the unix 'time' command

`version linux' is missing.

-manfred

March 22, 2012
On 3/21/12 7:32 PM, Juan Manuel Cabo wrote:
> avgtime -r 10 -q ls -lR /etc
>
> ------------------------
> Total time (ms): 933.742
> Repetitions : 10
> Median time : 90.505
> Avg time : 93.3742
> Std dev. : 4.66808
> Minimum : 88.732
> Maximum : 101.225

Sweet! You may want to also print the mode of the distribution, which is the time of the maximum sample density. http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial but informative.)


Andrei

March 23, 2012
On Thursday, 22 March 2012 at 22:22:31 UTC, Andrei Alexandrescu wrote:
>
> Sweet! You may want to also print the mode of the distribution, which is the time of the maximum sample density. http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial but informative.)
>
>
> Andrei

Thanks for your feedback!

> Sweet! You may want to also print the mode of the distribution, [....]

Done! Just pushed it to GitHub. I made a histogram too!!
(man, the Gaussian curve is everywhere, it never ceases to
perplex me). The histogram bins are the most significant digits
(three "automatic" levels of precision, with rounding and
casting tricks).

But I think the most important change is that I'm now showing
the 95% and 99% confidence intervals. For the confidence intervals
to mean anything, please everyone, remember to control
your variables (don't defrag and benchmark :-) !!) so that apples
are still apples and don't become oranges, and make sure N>30.

More info on histogram and confidence intervals in the
usage help.
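
The e figures in the 400-repetition run shown in this post are consistent with the textbook normal-approximation half-width e = z * s / sqrt(N), with z about 1.96 for 95% and about 2.576 for 99%. The Python sketch below plugs in the posted N = 400, Avg time = 6.8799 and Std dev. = 0.93927. It is not the tool's actual code, and `confidence_interval` is a name chosen for illustration.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, stddev, n, level):
    """Two-sided normal-approximation interval for the mean.

    Returns (low, high, e) where e = z * stddev / sqrt(n) is the half-width.
    """
    # z is the standard normal quantile that leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    e = z * stddev / math.sqrt(n)
    return mean - e, mean + e, e


for level in (0.95, 0.99):
    low, high, e = confidence_interval(6.8799, 0.93927, 400, level)
    print(f"{level:.0%} conf.int.  : [{low:.5f}, {high:.5f}]  e = {e:.6g}")
```

The N>30 advice fits this formula: the normal quantile is only a good stand-in once the sample is reasonably large. For small N, a Student t quantile would give a wider interval.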


    avgtime -q -h -r400 ls /etc

    ------------------------
    Total time (ms): 2751.96
    Repetitions    : 400
    Sample mode    : 6.9 (79 ocurrences)
    Median time    : 6.945
    Avg time       : 6.8799
    Std dev.       : 0.93927
    Minimum        : 3.7
    Maximum        : 16.36
    95% conf.int.  : [6.78786, 6.97195]  e = 0.0920468
    99% conf.int.  : [6.75893, 7.00087]  e = 0.12097
    Histogram      :
        msecs: count  normalized bar
          3.7:     2  #
          3.8:     4  ##
          3.9:     1
          4.0:     1
          4.2:     4  ##
          4.3:     1
          4.4:     1
          4.5:     2  #
          4.6:     3  #
          4.7:     2  #
          4.8:     3  #
          4.9:     3  #
          5.2:     1
          5.3:     2  #
          6.1:     1
          6.2:     1
          6.3:     4  ##
          6.4:     6  ###
          6.5:    14  #######
          6.6:    21  ##########
          6.7:    31  ###############
          6.8:    50  #########################
          6.9:    79  ########################################
          7.0:    48  ########################
          7.1:    29  ##############
          7.2:    22  ###########
          7.3:    13  ######
          7.4:     8  ####
          7.5:     7  ###
          7.6:    12  ######
          7.7:     6  ###
          7.8:     6  ###
          7.9:     2  #
          8.0:     3  #
          8.1:     1
          8.2:     1
          8.7:     1
          8.8:     1
          9.1:     1
         11.5:     1
         16.3:     1
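
Two things can be read off the histogram above. The bar lengths look like count * 40 / max, truncated to an integer (79 gives 40, 50 gives 25, 31 gives 15, 1 gives 0). And the 16.36 ms maximum lands in the 16.3 bin, which suggests truncation rather than rounding to one decimal. Here is a Python sketch under those two assumptions, which are inferred from the output and not from avgtime's source. The names `histogram`, `render` and `sample_mode` are made up, and the sample values are hypothetical.

```python
from collections import Counter


def histogram(times_ms, bins_per_ms=10):
    """Count samples per bin; each key is the bin's lower edge times bins_per_ms."""
    # int() truncates, so 16.36 ms falls into key 163, i.e. the 16.3 bin.
    return Counter(int(t * bins_per_ms) for t in times_ms)


def render(counts, bins_per_ms=10, width=40):
    """One text line per non-empty bin, bars scaled so the tallest is `width` wide."""
    peak = max(counts.values())
    lines = []
    for key in sorted(counts):
        n = counts[key]
        bar = "#" * (n * width // peak)  # integer division truncates the bar
        lines.append(f"{key / bins_per_ms:>6.1f}: {n:>5}  {bar}".rstrip())
    return lines


def sample_mode(counts, bins_per_ms=10):
    """The fullest bin and its count, like the 'Sample mode' line."""
    key, n = counts.most_common(1)[0]
    return key / bins_per_ms, n


if __name__ == "__main__":
    sample = [6.93, 6.97, 6.91, 7.02, 6.85, 16.36]  # hypothetical timings in ms
    counts = histogram(sample)
    print("Sample mode:", sample_mode(counts))
    print("\n".join(render(counts)))
```

Binning on integer keys rather than on rounded floats keeps neighbouring samples from landing in different bins because of floating-point noise.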


--jm



March 23, 2012
On 3/22/12 11:53 PM, Juan Manuel Cabo wrote:
> On Thursday, 22 March 2012 at 22:22:31 UTC, Andrei Alexandrescu wrote:
>>
>> Sweet! You may want to also print the mode of the distribution, which
>> is the time of the maximum sample density.
>> http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial
>> but informative.)
>>
>>
>> Andrei
>
> Thanks for your feedback!
>
>> Sweet! You may want to also print the mode of the distribution, [....]
>
> Done!. Just pushed it to github. I made a histogram too!!
> (man, the gaussian curve is everywhere, it never ceases to
> perplex me).

I'm actually surprised. I'm working on benchmarking lately and the distributions I get are very concentrated around the minimum.

Andrei


March 23, 2012
"Juan Manuel Cabo" <juanmanuel.cabo@gmail.com> wrote in message news:mytcmgglyntqsoybjcfz@forum.dlang.org...
> On Thursday, 22 March 2012 at 22:22:31 UTC, Andrei Alexandrescu wrote:
>>
>> Sweet! You may want to also print the mode of the distribution, which is the time of the maximum sample density. http://en.wikipedia.org/wiki/Mode_(statistics) (Warning: nontrivial but informative.)
>>
>>
>> Andrei
>
> Thanks for your feedback!
>
>> Sweet! You may want to also print the mode of the distribution, [....]
>
> Done!. Just pushed it to github. I made a histogram too!! (man, the gaussian curve is everywhere, it never ceases to perplex me). The histogram bins are the most significant digits (three "automatic" levels of precision, with rounding and casting tricks).
>
> But I think the most important change is that I'm now showing
> the 95% and 99% confidence intervals. (For the confidence intervals
> to mean anything, please everyone, remember to control
> your variables (don't defrag and benchmark :-) !!) so that apples
> are still apples and don't become oranges, and make sure N>30).
>
> More info on histogram and confidence intervals in the usage help.
>
>
>     avgtime -q -h -r400 ls /etc
>
>     ------------------------
>     Total time (ms): 2751.96
>     Repetitions    : 400
>     Sample mode    : 6.9 (79 ocurrences)
>     Median time    : 6.945
>     Avg time       : 6.8799
>     Std dev.       : 0.93927
>     Minimum        : 3.7
>     Maximum        : 16.36
>     95% conf.int.  : [6.78786, 6.97195]  e = 0.0920468
>     99% conf.int.  : [6.75893, 7.00087]  e = 0.12097
>     Histogram      :
>         msecs: count  normalized bar
>           3.7:     2  #
>           3.8:     4  ##
>           3.9:     1
>           4.0:     1
>           4.2:     4  ##
>           4.3:     1
>           4.4:     1
>           4.5:     2  #
>           4.6:     3  #
>           4.7:     2  #
>           4.8:     3  #
>           4.9:     3  #
>           5.2:     1
>           5.3:     2  #
>           6.1:     1
>           6.2:     1
>           6.3:     4  ##
>           6.4:     6  ###
>           6.5:    14  #######
>           6.6:    21  ##########
>           6.7:    31  ###############
>           6.8:    50  #########################
>           6.9:    79  ########################################
>           7.0:    48  ########################
>           7.1:    29  ##############
>           7.2:    22  ###########
>           7.3:    13  ######
>           7.4:     8  ####
>           7.5:     7  ###
>           7.6:    12  ######
>           7.7:     6  ###
>           7.8:     6  ###
>           7.9:     2  #
>           8.0:     3  #
>           8.1:     1
>           8.2:     1
>           8.7:     1
>           8.8:     1
>           9.1:     1
>          11.5:     1
>          16.3:     1
>
>

Wow, that's just fantastic! Really, this should be a standard system tool.

I think this guy would be proud: http://zedshaw.com/essays/programmer_stats.html


March 23, 2012
Andrei Alexandrescu wrote:

> You may want to also print the mode of the distribution, nontrivial but informative

In case of this implementation and according to the given link: trivial and noninformative, because

| For samples, if it is known that they are drawn from a symmetric
| distribution, the sample mean can be used as an estimate of the
| population mode.

and the program computes the variance as if the values of the sample follow a normal distribution, which is symmetric.

Therefore the mode of the sample is of interest only when the variance is calculated wrongly.

-manfred