April 15, 2012
On 2012-04-15 04:58, Andrei Alexandrescu wrote:
> There have been quite a few good comments, but no review manager offer.
> Could someone please take this role?
>
> Again, it would be great to get std.benchmark in sooner rather than
> later because it can be used by subsequent submissions (many of which
> allege efficiency as a major advantage) to show they improve over the
> state of the art.
>
> Thanks,
>
> Andrei
>

As far as I know, there are already several other projects/modules in the pipeline before std.benchmark, if we are to review them in the order they were submitted.

-- 
/Jacob Carlborg
April 15, 2012
Instead of the originally proposed layout

================================================================
Benchmark                               relative ns/iter  iter/s
================================================================
---[ module_one ]-----------------------------------------------
file write                                        140.2K    7.1K
file read                                 517.8%   27.1K   36.9K
array creation                             1.2Kx  116.0     8.6M
================================================================

I would like to see something more similar to the following:

================================================================
Benchmark                                Time    Performance
                                                ---------------
name                                     s/iter  iter/s  factor
-----------------------[ module_one ]--- ------- ------- -------
file write                               140.20µ   7.13k
file read                                 27.10µ  36.90k   5.17
array creation                           116.00n   8.62M   1.21k
================================================================

The transition to the new layout can be achieved as follows:

        * Consider layout guidelines given in

           http://mirrors.ctan.org/info/german/tabsatz/tabsatz.pdf

        * Reorder columns according to their semantics and introduce
          groups.

        * Rename 'relative' to 'factor' and avoid using 'x' for
          'times'.

        * Divide the 'ns/iter' column by a billion so that it becomes
          's/iter' (see Vladimir's post).

        * Right-align the module name (abbreviate it if it is too
          long).

        * Avoid '%' and rely instead on the prefixes specified by the
          SI:

http://en.wikipedia.org/w/index.php?title=International_System_of_Units&oldid=487057838#Units_and_prefixes

          Then, using a fixed width of 7 characters, we have at least
          3 significant digits:

           123450000         -> 123.45M
            12345000         ->  12.35M
             1234500         ->   1.23M
              123450         -> 123.45k
               12345         ->  12.35k
                1234.5       ->   1.23k
                 123.45      -> 123.45
                  12.345     ->  12.35
                   1.2345    ->   1.23
                   0.12345   -> 123.45m
                   0.012345  ->  12.35m
                   0.0012345 ->   1.23m
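
For illustration, here is a minimal sketch in D of a formatter implementing this scheme (siFormat is a hypothetical helper name, and the prefix table covers only the range exercised by the examples above):

    import std.format : format;
    import std.math : abs;

    // Pick an SI prefix so that the mantissa falls into [1, 1000) and
    // print it in 7 columns with two decimals, i.e. with at least 3
    // significant digits.
    string siFormat(double value)
    {
        static immutable prefixes = ["n", "µ", "m", "", "k", "M", "G"];
        int idx = 3; // start at the empty prefix (factor 10^0)
        double v = value;
        while (abs(v) >= 1000 && idx < 6) { v /= 1000; ++idx; }
        while (abs(v) < 1 && v != 0 && idx > 0) { v *= 1000; --idx; }
        return format("%6.2f%s", v, prefixes[idx]); // e.g. "123.45M"
    }

    void main()
    {
        import std.stdio : writeln;
        foreach (x; [123_450_000.0, 1.2345, 0.0012345])
            writeln(siFormat(x)); // 123.45M,   1.23,   1.23m
    }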

Cheers,
Famous

April 15, 2012
Maybe the layout is not destroyed this time:

> ================================================================
> Benchmark                                Time    Performance
>                                                  ---------------
> name                                     s/iter  iter/s  factor
> -----------------------[ module_one ]--- ------- ------- -------
> file write                               140.20µ   7.13k
> file read                                 27.10µ  36.90k   5.17
> array creation                           116.00n   8.62M   1.21k
> ================================================================

April 15, 2012
Andrei Alexandrescu wrote:
> On 4/10/12 5:40 AM, Jens Mueller wrote:
> >How come the times-based relative report and the percentage-based relative report are mixed in one result? And how do I choose which one I'd like to see in the output?
> 
> It's in the names. If the name of a benchmark starts with benchmark_relative_, then that benchmark is considered relative to the last non-relative benchmark. Using a naming convention allows complete automation in benchmarking a module.
> 
> I figure it's fine that all results appear together because the absence of data in the relative column clarifies which is which.

I know. That wasn't my question. How do I choose between percentages and factors for a relative benchmark, e.g. 200% vs. 2?
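
For illustration, the naming convention described above might look roughly like the following sketch (only the benchmark_relative_ prefix rule comes from the proposal; the function bodies, the uint iteration-count parameter, and the names are assumptions):

    import std.file : read, write;

    // Baseline: subsequent relative benchmarks are reported against
    // the last non-relative benchmark, i.e. this one.
    void benchmark_fileWrite(uint n)
    {
        foreach (i; 0 .. n)
            write("deleteme.tmp", "some data");
    }

    // The benchmark_relative_ prefix marks this benchmark as relative
    // to benchmark_fileWrite, so its report row gets an entry in the
    // relative column.
    void benchmark_relative_fileRead(uint n)
    {
        foreach (i; 0 .. n)
            read("deleteme.tmp");
    }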

> >When benchmarking you can measure different things at the same time. In this regard the current proposal is limited. It just measures wall clock time. I believe extending the StopWatch to measure e.g. user CPU time is a useful addition.
> 
> Generally I fear piling too much on StopWatch because every feature adds its own noise. But there's value in collecting the result of times(). What would be the Windows equivalent?

I'm not a Windows user, but GetProcessTimes seems to be the Windows equivalent:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms683223%28v=vs.85%29.aspx

> >In general, allowing user-defined measurements would be great,
> >e.g. to measure the time spent in user mode:
> >// needs: import core.sys.posix.sys.times;
> >() {
> >          tms t;
> >          times(&t);
> >          return t.tms_utime;
> >       }
> >
> >Note that this code does not need to be portable. You can also use
> >version() blocks with a static assert fallback.
> >
> >Things that come to mind that I'd like to measure.
> >Time measurements:
> >   * User CPU time
> >   * System CPU time
> >   * Time spent in memory allocations
> >Count measurements:
> >   * Memory usage
> >   * L1/L2/L3 cache misses
> >   * Number of executed instructions
> >   * Number of memory allocations
> >
> >Of course wall clock time is the ultimate measure when benchmarking. But often you need to investigate further (doing more measurements).
> >
> >Do you think adding this is worthwhile?
> 
> Absolutely. I just fear about expanding the charter of the framework too much. Let's see:
> 
> * Memory usage is, I think, difficult in Windows.
> * Estimating cache misses and executed instructions is significant research
> * Number of memory allocations requires instrumenting druntime

I think that std.benchmark shouldn't do all of this itself. But we should figure out whether adding a user-defined measurement function is possible, making sure that we have a design that is flexible enough to capture different measurements.
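
Along those lines, a user-defined measurement of user CPU time might look like the following sketch (userTime is a hypothetical name, the druntime binding modules are assumptions, and error handling is omitted; note that the two platforms report in different units):

    ulong userTime()
    {
        version (Posix)
        {
            import core.sys.posix.sys.times : times, tms;
            tms t;
            times(&t);
            return cast(ulong) t.tms_utime; // clock ticks in user mode
        }
        else version (Windows)
        {
            import core.sys.windows.windows : FILETIME,
                GetCurrentProcess, GetProcessTimes;
            FILETIME creation, exit_, kernel, user;
            GetProcessTimes(GetCurrentProcess(),
                            &creation, &exit_, &kernel, &user);
            // 100 ns intervals spent in user mode
            return (cast(ulong) user.dwHighDateTime << 32)
                   | user.dwLowDateTime;
        }
        else
            static assert(false, "unsupported platform");
    }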

Jens
April 15, 2012
Andrei Alexandrescu wrote:
> There have been quite a few good comments, but no review manager offer. Could someone please take this role?

I will do this.
But I will need to get more familiar with the process. And add it to
http://prowiki.org/wiki4d/wiki.cgi?ReviewQueue for future review
managers.

Jens
April 16, 2012
On Sunday, April 15, 2012 13:41:28 Jacob Carlborg wrote:
> On 2012-04-15 04:58, Andrei Alexandrescu wrote:
> > There have been quite a few good comments, but no review manager offer. Could someone please take this role?
> > 
> > Again, it would be great to get std.benchmark in sooner rather than later because it can be used by subsequent submissions (many of which allege efficiency as a major advantage) to show they improve over the state of the art.
> > 
> > Thanks,
> > 
> > Andrei
> 
> As far as I know, there are already several other projects/modules in the pipeline before std.benchmark, if we are to review them in the order they were submitted.

There are, but none of them are being reviewed at the moment or being pushed for a review. And there _is_ an argument for std.benchmark being high priority based on how it could affect future reviews. We should probably start reviewing _something_ soon though. We haven't had very good momentum on that of late.

- Jonathan M Davis
April 16, 2012
On Sunday, 15 April 2012 at 16:23:32 UTC, Jens Mueller wrote:
> Andrei Alexandrescu wrote:
>> There have been quite a few good comments, but no review manager
>> offer. Could someone please take this role?
>
> I will do this.
> But I will need to get more familiar with the process. And add it to
> http://prowiki.org/wiki4d/wiki.cgi?ReviewQueue for future review
> managers.


The review process is based on Boost's:

http://www.boost.org/community/reviews.html#Review_Manager

-Lars
April 16, 2012
Lars T. Kyllingstad wrote:
> On Sunday, 15 April 2012 at 16:23:32 UTC, Jens Mueller wrote:
> >Andrei Alexandrescu wrote:
> >>There have been quite a few good comments, but no review manager offer. Could someone please take this role?
> >
> >I will do this.
> >But I will need to get more familiar with the process. And add it
> >to
> >http://prowiki.org/wiki4d/wiki.cgi?ReviewQueue for future review
> >managers.
> 
> 
> The review process is based on Boost's:
> 
> http://www.boost.org/community/reviews.html#Review_Manager

The page is shorter than expected. Last time I checked I found something way longer. Since Phobos' review process is based on Boost's, where does it deviate?

Jens
April 17, 2012
On 17.04.2012 1:00, Jens Mueller wrote:
> Lars T. Kyllingstad wrote:
>> On Sunday, 15 April 2012 at 16:23:32 UTC, Jens Mueller wrote:
>>> Andrei Alexandrescu wrote:
>>>> There have been quite a few good comments, but no review manager
>>>> offer. Could someone please take this role?
>>>
>>> I will do this.
>>> But I will need to get more familiar with the process. And add it
>>> to
>>> http://prowiki.org/wiki4d/wiki.cgi?ReviewQueue for future review
>>> managers.
>>
>>
>> The review process is based on Boost's:
>>
>> http://www.boost.org/community/reviews.html#Review_Manager
>
> The page is shorter than expected. Last time I checked I found something
> way longer. Since Phobos' review process is based on Boost's, where does
> it deviate?
>

It's peer review followed by voting, and that's about it.
I don't think you should seek any formal standards, documents, guidelines, etc.

The manager role is rather simple:

1. The manager picks a module from the review queue and posts an announcement that its formal review starts today. The post should contain relevant information about the module and links to its source/documentation. Importantly, it also sets the exact time the review ends and the exact time voting ends (usually 2 weeks for review, 1 week for voting).

2. When the review ends, the manager either opens a vote thread or, if it's obvious that the module needs further work, prolongs or postpones the review (on explicit request from the author), putting it back into the queue.

3. When the voting period ends, the manager counts up the votes and declares the result.

And that's it.
There is no hard rule on which module has priority in the queue; it is loosely decided on the basis of how long the module has been ready for review and how important the functionality is.

P.S. If you are not sure you are up to it, just say the word and I'll volunteer instead. We need to push this forward, as the review queue is going to overflow real soon ;)

-- 
Dmitry Olshansky
April 17, 2012
Dmitry Olshansky wrote:
> On 17.04.2012 1:00, Jens Mueller wrote:
> >Lars T. Kyllingstad wrote:
> >>On Sunday, 15 April 2012 at 16:23:32 UTC, Jens Mueller wrote:
> >>>Andrei Alexandrescu wrote:
> >>>>There have been quite a few good comments, but no review manager offer. Could someone please take this role?
> >>>
> >>>I will do this.
> >>>But I will need to get more familiar with the process. And add it
> >>>to
> >>>http://prowiki.org/wiki4d/wiki.cgi?ReviewQueue for future review
> >>>managers.
> >>
> >>
> >>The review process is based on Boost's:
> >>
> >>http://www.boost.org/community/reviews.html#Review_Manager
> >
> >The page is shorter than expected. Last time I checked I found something way longer. Since Phobos' review process is based on Boost's, where does it deviate?
> >
> 
> It's peer review followed by voting, and that's about it. I don't think you should seek any formal standards, documents, guidelines, etc.
> 
> The manager role is rather simple:
> 
> 1. The manager picks a module from the review queue and posts an announcement that its formal review starts today. The post should contain relevant information about the module and links to its source/documentation. Importantly, it also sets the exact time the review ends and the exact time voting ends (usually 2 weeks for review, 1 week for voting).
> 
> 2. When the review ends, the manager either opens a vote thread or,
> if it's obvious that the module needs further work, prolongs or
> postpones the review (on explicit request from the author), putting
> it back into the queue.
> 
> 3. When the voting period ends, the manager counts up the votes and declares the result.
> 
> And that's it.
> There is no hard rule on which module has priority in the queue;
> it is loosely decided on the basis of how long the module has been
> ready for review and how important the functionality is.

Many thanks for your explanation.

> P.S. If you are not sure you are up to it, just say the word and I'll volunteer instead. We need to push this forward, as the review queue is going to overflow real soon ;)

Andrei is already preparing. Review will start soon.

Jens