May 31, 2014
On Saturday, 31 May 2014 at 14:01:52 UTC, Andrei Alexandrescu wrote:
> On 5/30/14, 11:36 PM, Russel Winder via Digitalmars-d wrote:
>> As well as the average (mean), you must provide standard deviation and
>> degrees of freedom so that a proper error analysis and t-tests are
>> feasible. Or put it another way: even if you quote a mean, without knowing
>> how many are in the sample and what the spread is you cannot judge the error
>> and so cannot make deductions or inferences.
>
> No. Elapsed time in a benchmark does not follow a Student or Gaussian distribution. Use the mode or (better) the minimum. -- Andrei

Well... it depends on what you're looking to do with the result. As you say, though, micro-benchmarks of code quality should always be judged on the minimum of a large sample.
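In code, "the minimum of a large sample" might look like the sketch below (Python, since the point is language-neutral; `payload` is a hypothetical stand-in workload, not something from this thread — the approach mirrors what Python's own `timeit.repeat` documentation recommends):

```python
import time

def payload():
    # Hypothetical stand-in workload; any deterministic function
    # under test goes here.
    return sum(i % 2 for i in range(10_000))

def best_of(func, runs=100):
    """Time `func` `runs` times and return the minimum elapsed time,
    i.e. the measurement least disturbed by scheduler/cache noise."""
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        func()
        t1 = time.perf_counter()
        times.append(t1 - t0)
    return min(times)

print(f"best of 100: {best_of(payload):.6f} s")
```

The mean of those samples would include every interruption that happened to land inside a run; the minimum is the run that noise touched least.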
May 31, 2014
On 5/31/14, 7:10 AM, Russel Winder via Digitalmars-d wrote:
> On Sat, 2014-05-31 at 07:02 -0700, Andrei Alexandrescu via Digitalmars-d
> wrote:
>> On 5/30/14, 11:36 PM, Russel Winder via Digitalmars-d wrote:
>>> As well as the average (mean), you must provide standard deviation and
>>> degrees of freedom so that a proper error analysis and t-tests are
>>> feasible. Or put it another way: even if you quote a mean, without knowing
>>> how many are in the sample and what the spread is you cannot judge the error
>>> and so cannot make deductions or inferences.
>>
>> No. Elapsed time in a benchmark does not follow a Student or Gaussian
>> distribution. Use the mode or (better) the minimum. -- Andrei
>
> We almost certainly need to unpack that more. I agree that behind my
> comment was an implicit assumption of a normal distribution of results.
> This is an easy assumption to make even if it is wrong. So is it
> provably wrong? What is the distribution? If we know that then there is
> knowledge of the parameters which then allow for statistical inference
> and deduction.

Well there's quantization noise which has uniform distribution. Then all other sources of noise are additive (no noise may make code run faster). So I speculate that the pdf is a half Gaussian mixed with a uniform distribution. Taking the mode (which is very close to the minimum in my measurements) would be the most accurate way to go. Taking the average would end up in some weird point on the half-Gaussian slope.

Andrei
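Andrei's model is easy to simulate: a fixed true cost, plus uniform quantization noise from timer granularity, plus strictly additive half-Gaussian interference. Under that model the mean sits well up the slope while the minimum hugs the true cost (a sketch; the constants are arbitrary, not measured values):

```python
import random

random.seed(42)

TRUE_COST = 100.0   # hypothetical true run time, arbitrary units
TICK = 1.0          # timer granularity -> uniform quantization noise

def one_measurement():
    quantization = random.uniform(0.0, TICK)     # uniform component
    interference = abs(random.gauss(0.0, 5.0))   # half-Gaussian, additive only
    return TRUE_COST + quantization + interference

samples = [one_measurement() for _ in range(100_000)]
mean = sum(samples) / len(samples)
minimum = min(samples)

print(f"min  = {minimum:.3f}")   # lands very close to TRUE_COST
print(f"mean = {mean:.3f}")      # sits up the half-Gaussian slope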

May 31, 2014
On Saturday, 31 May 2014 at 05:12:54 UTC, Marco Leise wrote:
> Run this with: -O3 -frelease -fno-assert -fno-bounds-check -march=native
> This way GCC and LLVM will recognize that you alternately add
> p0 and p1 to the sum and partially unroll the loop, thereby
> removing the condition. It takes 1.4xxxx nanoseconds per step
> on my not so new 2.0 GHz notebook, so I assume your PC will
> easily reach parity with your original C++ version.
>
>
>
> import std.stdio;
> import core.time;
>
> alias ℕ = size_t;
>
> void main()
> {
> 	run!plus(1_000_000_000);
> }
>
> double plus(ℕ steps)
> {
> 	enum p0 = 0.0045;
> 	enum p1 = 1.00045452 - p0;
>
> 	double sum = 1.346346;
> 	foreach (i; 0 .. steps)
> 		sum += i%2 ? p1 : p0;
> 	return sum;
> }
>
> void run(alias func)(ℕ steps)
> {
> 	auto t1 = TickDuration.currSystemTick;
> 	auto output = func(steps);
> 	auto t2 = TickDuration.currSystemTick;
> 	auto nanotime = 1_000_000_000.0 / steps * (t2 - t1).length / TickDuration.ticksPerSec;
> 	writefln("Last: %s", output);
> 	writefln("Time per op: %s", nanotime);
> 	writeln();
> }


Thank you for the help. Which OS is running on your notebook? I compiled your source code with your settings using the GCC compiler; the run took 3.1xxxx nanoseconds per step. With the DMD compiler the run took 5.xxxx nanoseconds. So I think the problem could be specific to the Linux versions of the GCC and DMD compilers.


Thomas
May 31, 2014
On Sat, 2014-05-31 at 10:29 -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
[…]
> 
> Well there's quantization noise which has uniform distribution. Then all other sources of noise are additive (no noise may make code run faster). So I speculate that the pdf is a half Gaussian mixed with a uniform distribution. Taking the mode (which is very close to the minimum in my measurements) would be the most accurate way to go. Taking the average would end up in some weird point on the half-Gaussian slope.

I sense you are taking the piss.

-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

May 31, 2014
On 5/31/14, 11:49 AM, Russel Winder via Digitalmars-d wrote:
> On Sat, 2014-05-31 at 10:29 -0700, Andrei Alexandrescu via Digitalmars-d
> wrote:
> […]
>>
>> Well there's quantization noise which has uniform distribution. Then all
>> other sources of noise are additive (no noise may make code run faster).
>> So I speculate that the pdf is a half Gaussian mixed with a uniform
>> distribution. Taking the mode (which is very close to the minimum in my
>> measurements) would be the most accurate way to go. Taking the average
>> would end up in some weird point on the half-Gaussian slope.
>
> I sense you are taking the piss.

I don't know the idiom - what does it mean? Something nice I hope :o). -- Andrei

May 31, 2014
On 5/31/14, 2:42 PM, Andrei Alexandrescu wrote:
> On 5/31/14, 11:49 AM, Russel Winder via Digitalmars-d wrote:
>> On Sat, 2014-05-31 at 10:29 -0700, Andrei Alexandrescu via Digitalmars-d
>> wrote:
>> […]
>>>
>>> Well there's quantization noise which has uniform distribution. Then all
>>> other sources of noise are additive (no noise may make code run faster).
>>> So I speculate that the pdf is a half Gaussian mixed with a uniform
>>> distribution. Taking the mode (which is very close to the minimum in my
>>> measurements) would be the most accurate way to go. Taking the average
>>> would end up in some weird point on the half-Gaussian slope.
>>
>> I sense you are taking the piss.
>
> I don't know the idiom - what does it mean? Something nice I hope :o).
> -- Andrei

Found it: http://en.wikipedia.org/wiki/Taking_the_piss. Not sure how to take it in context; I am being serious, and basing myself on measurements taken while designing and implementing https://github.com/facebook/folly/blob/master/folly/docs/Benchmark.md.

Andrei


June 01, 2014
On Saturday, 31 May 2014 at 13:59:40 UTC, Andrei Alexandrescu
wrote:
> On 5/30/14, 10:32 PM, dennis luehring wrote:
>> -do not benchmark anything without millions of loops - use the average
>> as the result
>
> Use the minimum unless networking is involved. -- Andrei

cache??
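The cache question is usually an argument *for* the minimum: the first run pays cold-cache and warm-up costs, and repeated runs converge toward the steady state that the minimum captures. A hypothetical sketch (Python; `touch` is a stand-in for the code under test):

```python
import time

DATA = list(range(1_000_000))  # stand-in working set

def touch():
    # Stand-in for the measured operation; walks the working set.
    return sum(DATA)

times = []
for _ in range(20):
    t0 = time.perf_counter()
    touch()
    t1 = time.perf_counter()
    times.append(t1 - t0)

# The first run typically includes cold-cache/warm-up effects; the
# minimum over all runs reflects the warm, steady-state cost.
print(f"first run: {times[0]:.6f} s, minimum: {min(times):.6f} s")
```

If you instead want to measure the *cold-cache* cost, neither the minimum nor the mean of a hot loop is the right tool — that needs a separate experiment with the cache deliberately evicted between runs.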
June 01, 2014
Am Sat, 31 May 2014 17:44:23 +0000
schrieb "Thomas" <t.leichner@arcor.de>:

> Thank you for the help. Which OS is running on your notebook? I compiled your source code with your settings using the GCC compiler; the run took 3.1xxxx nanoseconds per step. With the DMD compiler the run took 5.xxxx nanoseconds. So I think the problem could be specific to the Linux versions of the GCC and DMD compilers.
> 
> 
> Thomas

Gentoo Linux 64-bit. Aside from the 64-bit maybe, I can't make
out a good reason why the runtime should depend on the OS so
much.
Are you sure you aren't running on a PC from 2000, and did you
use the compiler flags I gave at the top of my post? Did you
disable CPU power saving, and was no other process running at
the same time?
By the way I get very similar results when using the LDC
compiler.

-- 
Marco

June 01, 2014
On Sat, 2014-05-31 at 14:45 -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
[…]
> Found it: http://en.wikipedia.org/wiki/Taking_the_piss. Not sure how to take it in context; I am being serious, and basing myself on measurements taken while designing and implementing https://github.com/facebook/folly/blob/master/folly/docs/Benchmark.md.

My apologies for being abrupt and ill-considered, and hence potentially rude. Long story.

I'll cogitate on the ideas this morning and see what I can chip in constructively to take things along.

I will also ask Aleksey Shipilëv what underpinnings he is using for JMH
to see if there is some useful cross-fertilization.

-- 
Russel. ============================================================================= Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net 41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

June 02, 2014
On Sunday, 1 June 2014 at 03:33:36 UTC, Marco Leise wrote:
> Am Sat, 31 May 2014 17:44:23 +0000
> schrieb "Thomas" <t.leichner@arcor.de>:
>
>> Thank you for the help. Which OS is running on your notebook? I compiled your source code with your settings using the GCC compiler; the run took 3.1xxxx nanoseconds per step. With the DMD compiler the run took 5.xxxx nanoseconds. So I think the problem could be specific to the Linux versions of the GCC and DMD compilers.
>> 
>> 
>> Thomas
>
> Gentoo Linux 64-bit. Aside from the 64-bit maybe, I can't make
> out a good reason why the runtime should depend on the OS so
> much.
> Are you sure you aren't running on a PC from 2000, and did you
> use the compiler flags I gave at the top of my post? Did you
> disable CPU power saving, and was no other process running at
> the same time?
> By the way I get very similar results when using the LDC
> compiler.

My PC is 5 years old. Of course I used your flags. Besides, I am not an idiot; I have been programming for 20 years and have used 6 different programming languages. I didn't post this just for fun: I am evaluating D as a language for numerical programming.

Thomas