March 08, 2009 Re: D compiler benchmarks
Posted in reply to Bill Baxter

Bill Baxter wrote:
> On Mon, Mar 9, 2009 at 3:15 AM, Georg Wrede <georg.wrede@iki.fi> wrote:
>> Robert Clipsham wrote:
>>> Georg Wrede wrote:
>>>> Robert Clipsham wrote:
>>>>> Hi all,
>>>>>
>>>>> I have set up some benchmarks for dmd, ldc and gdc at
>>>>> http://dbench.octarineparrot.com/.
>>>>>
>>>>> There are currently only 6 tests all from
>>>>> http://shootout.alioth.debian.org/gp4/d.php. My knowledge of phobos is not
>>>>> great enough to port the others to tango (I've chosen tango as ldc does not
>>>>> support phobos currently, so it makes sense to choose tango as all compilers
>>>>> support it). If you would like to contribute new tests or improve on the
>>>>> current ones, let me know and I'll include them next time I run them.
>>>>>
>>>>> All source code can be found at
>>>>> http://hg.octarineparrot.com/dbench/file/tip.
>>>>>
>>>>> Let me know if you have any ideas for how I can improve the benchmarks,
>>>>> I currently plan to add compile times, size of the final executable and
>>>>> memory usage (if anyone knows an easy way to get the memory usage of a
>>>>> process in D, let me know :D).
>>>> The first run should not be included in the average.
>>> Could you explain your reasoning for this? I can't see why it shouldn't be
>>> included personally.
>> Suppose you have run the same program very recently before the test. Then
>> the executable will be in memory already, and any other files it may want to
>> access are in memory too.
>>
>> This makes execution much faster than if it were the first time ever this
>> program is run.
>>
>> If things were deterministic, then you wouldn't run several times and
>> average the results, right?
>
> Also I think standard practice for benchmarks is not to average but to
> take the minimum time.
> To the extent that things are not deterministic it is generally
> because of factors outside of your program's control -- a virtual memory
> page fault kicking in, some other process stealing cycles, etc. Or
> put another way, there is no way for the measured run time of your
> program to come out artificially too low, but there are lots of ways
> it could come out too high. The reason you average measurements in
> other scenarios is that you expect the measurements to
> form a normal distribution around the true value. That is not the
> case for measurements of computer program running times. Measurements
> will basically always be higher than the true intrinsic run-time for
> your program.
>
> --bb
By minimum time, do you mean the fastest time or the slowest time?
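To make Bill's suggestion concrete: a minimal timing-harness sketch in D that runs the benchmark repeatedly, discards the cold first run, and keeps the minimum. It is written against present-day Phobos rather than the Tango of this thread, and the ./binarytrees binary and run count are hypothetical placeholders.

```d
// Run a benchmark several times, discard the cold first run, and
// report the minimum wall-clock time over the remaining runs.
import std.algorithm : min;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.process : execute;
import std.stdio : writefln;

void main()
{
    enum runs = 10;       // hypothetical run count
    long best = long.max; // fastest observed time, in milliseconds

    foreach (i; 0 .. runs)
    {
        auto sw = StopWatch(AutoStart.yes);
        execute(["./binarytrees", "16"]); // hypothetical benchmark binary
        sw.stop();

        if (i == 0)
            continue;     // the first run pays for cold caches
        best = min(best, sw.peek.total!"msecs");
    }

    writefln("best of %s warm runs: %s ms", runs - 1, best);
}
```

By Bill's argument, the minimum is the fastest time: interference from outside the program can only inflate a measurement, never deflate it, so the fastest run is the one closest to the program's intrinsic run time.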
March 08, 2009 Re: D compiler benchmarks
Posted in reply to Georg Wrede

Georg Wrede wrote:
> Suppose you have run the same program very recently before the test. Then the executable will be in memory already, and any other files it may want to access are in memory too.
>
> This makes execution much faster than if it were the first time ever this program is run.
>
> If things were deterministic, then you wouldn't run several times and average the results, right?
Ok, I will rerun the tests later today and disregard the first run. I may also take the minimum value rather than an average (thanks to Bill Baxter for this idea).
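On the memory-usage measurement mentioned in the original post: one POSIX-only approach is to wait() on the benchmarked child process and then ask the kernel for its accounting via getrusage. A minimal sketch, again with a hypothetical benchmark binary:

```d
// Measure a benchmarked child process's peak resident set size.
// POSIX-only; on Linux, ru_maxrss is reported in kilobytes.
// Note: RUSAGE_CHILDREN aggregates all waited-for children, which is
// fine when the harness runs one benchmark at a time.
import core.sys.posix.sys.resource : getrusage, rusage, RUSAGE_CHILDREN;
import std.process : spawnProcess, wait;
import std.stdio : writefln;

void main()
{
    auto pid = spawnProcess(["./binarytrees", "16"]); // hypothetical binary
    wait(pid);

    rusage usage;
    getrusage(RUSAGE_CHILDREN, &usage);
    writefln("child peak RSS: %s kB", usage.ru_maxrss);
}
```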
March 08, 2009 Re: D compiler benchmarks
Posted in reply to Robert Clipsham

Robert Clipsham wrote:
> Georg Wrede wrote:
> > Suppose you have run the same program very recently before the test. Then the executable will be in memory already, and any other files it may want to access are in memory too.
> >
> > This makes execution much faster than if it were the first time ever this program is run.
> >
> > If things were deterministic, then you wouldn't run several times and average the results, right?
>
> Ok, I will rerun the tests later today and disregard the first run. I may also take the minimum value rather than an average (thanks to Bill Baxter for this idea).
As you're re-inventing functionality that's in the benchmarks game measurement scripts, let me suggest that there are 2 phases involved:
1) record measurements
2) analyze measurements
As long as you keep the measurements in the order they were made, and keep each configuration's measurements in its own file, you can apply different selections to those measurements at some later date.
You can throw away the first measurement or not, you can take the fastest or the median, you can ... without doing new measurements.
As you are only trying to measure a couple of language implementations, measure them across a dozen different input values rather than one or two - leaving the computer churning overnight will help keep your home warm :-)
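A minimal sketch of that record-then-analyze split in D; the file-naming scheme and the fastest/median selections are only illustrative assumptions:

```d
// Phase 1 appends raw timings, one per line and in run order, to a
// per-configuration file. Phase 2 re-reads them and applies whatever
// selection you like, so changing the analysis never requires
// re-running the measurements.
import std.algorithm : map, minElement, sort;
import std.array : array;
import std.conv : to;
import std.stdio : File, writefln;

void record(string config, long elapsedMs)
{
    // e.g. config = "dmd-binarytrees-16" (hypothetical naming scheme)
    File(config ~ ".times", "a").writeln(elapsedMs);
}

void analyze(string path)
{
    auto times = File(path).byLineCopy.map!(to!long).array;

    // skip the cold first run (assumes at least two recorded runs)
    immutable fastest = times[1 .. $].minElement;
    times.sort();
    immutable median = times[$ / 2];

    writefln("%s: fastest (warm) %s ms, median %s ms", path, fastest, median);
}
```

Because the raw log is kept intact and in run order, throwing away the first run, switching from the fastest to the median, or applying any other selection is a change to analyze() alone.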