April 25, 2012
On Wed, Apr 25, 2012 at 10:34:38AM +0200, SomeDude wrote: [...]
> import std.stdio, std.container, std.range;
> 
> void main() {
>     enum int range = 100;
>     enum int n = 1_000_000;
> 
>     auto t = redBlackTree!int(0);
> 
>     for (int i = 0; i < n; i++) {
>         if (i > range)
>             t.removeFront();
>         t.insert(i);
>     }
> 
>     writeln(walkLength(t[]));
>     //writeln(t[]);
> }
> 
> runs in about 1793 ms.
> The strange thing is, if I comment out the writeln line, runtimes
> are in average *slower* by about 20 ms, with timings varying a
> little bit more than when the writeln is included.
[...]

First of all, differences as small as 20ms really should be considered as background noise. The exact measurements depend on a lot of system-specific and environment-specific factors, such as OS memory usage, CPU cache behaviour, disk activity & speed, the exact points of context switches, etc.. If you really want to check for substantial performance differences, you need to magnify your test case so that differences are measured >5 seconds.

Second, on my AMD hexacore 64-bit Linux system, the running time consistently measures between 0.57 or 0.58 seconds for both cases. The exact figure changes between runs, and as far as I can tell, there's no discernible difference between the two.

Third, to make any timing differences stand out from the background noise, I increased n to 20_000_000, and both versions of the program consistently runs in about 11 seconds each time. There's no discernible difference between the two.

What all this means is that a single call to writeln does not make enough difference to be measurable compared to the rest of the program. It doesn't mean that the version with writeln is "faster", just that the difference is too small and you're probably just seeing background noise. If you put the writeln inside a loop, on the other hand, you'll see a big difference, because now its cost is magnified by the number of times the loop runs. (Say if you put it inside a foreach(i;0..1000) at the end of the program, you'll see the difference when you comment it out.)

So I'd chalk it up to inherent measurement inaccuracies.


T

-- 
He who sacrifices functionality for ease of use, loses both and deserves neither. -- Slashdotter
April 27, 2012
On 2012-04-25, SomeDude <lovelydear@mailmetrash.com> wrote:
> On Wednesday, 25 April 2012 at 15:35:44 UTC, Steven Schveighoffer wrote:
>> On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude <lovelydear@mailmetrash.com> wrote:
>>
>>> On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:
>>>
>>> Noone reproduces this ?
>>
>> On my linux box, it runs in about 580ms, with or without the writeln.
>>
>> This is what I would expect.
>>
>> But things can be strange when dealing with GC timings.  Have you tried profiling the code to see where the time is being added?
>>
>> -Steve
>
> Did you compile with  dmd -O -inline -release ?

I get fairly consistent result on Linux with dmd, but one oddity with gdc.

  With the first writeln in and the second commented out:
  gdmd -O slow.d
  time ./slow
   ... 0.342 total
  gdmd -O -release slow.d
  time ./slow
   ... 2.866 total

If I put both writelns in, I don't see the fast version.

% gdc --version
gdc (Debian 4.6.2-4) 4.6.2

David
April 27, 2012
Am Fri, 27 Apr 2012 04:18:47 +0000 (UTC)
schrieb David Brown <davidb@davidb.org>:

> On 2012-04-25, SomeDude <lovelydear@mailmetrash.com> wrote:
> > On Wednesday, 25 April 2012 at 15:35:44 UTC, Steven Schveighoffer wrote:
> >> On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude <lovelydear@mailmetrash.com> wrote:
> >>
> >>> On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:
> >>>
> >>> Noone reproduces this ?
> >>
> >> On my linux box, it runs in about 580ms, with or without the writeln.
> >>
> >> This is what I would expect.
> >>
> >> But things can be strange when dealing with GC timings.  Have you tried profiling the code to see where the time is being added?
> >>
> >> -Steve
> >
> > Did you compile with  dmd -O -inline -release ?
> 
> I get fairly consistent result on Linux with dmd, but one oddity with gdc.
> 
>   With the first writeln in and the second commented out:
>   gdmd -O slow.d
>   time ./slow
>    ... 0.342 total
>   gdmd -O -release slow.d
>   time ./slow
>    ... 2.866 total
> 
> If I put both writelns in, I don't see the fast version.
> 
> % gdc --version
> gdc (Debian 4.6.2-4) 4.6.2
> 
> David

I cannot reproduce this with the same GCC version and the latest GitHub sources for GDC. Is that behavior consistent after clean recompiles? I get the ~3 seconds result no matter how I compile.

-- 
Marco

April 27, 2012
On Wednesday, 25 April 2012 at 17:37:33 UTC, H. S. Teoh wrote:
>
> First of all, differences as small as 20ms really should be considered
> as background noise. The exact measurements depend on a lot of
> system-specific and environment-specific factors, such as OS memory
> usage, CPU cache behaviour, disk activity & speed, the exact points of
> context switches, etc.. If you really want to check for substantial
> performance differences, you need to magnify your test case so that
> differences are measured >5 seconds.
>
> Second, on my AMD hexacore 64-bit Linux system, the running time
> consistently measures between 0.57 or 0.58 seconds for both cases. The
> exact figure changes between runs, and as far as I can tell, there's no
> discernible difference between the two.
>
> Third, to make any timing differences stand out from the background
> noise, I increased n to 20_000_000, and both versions of the program
> consistently runs in about 11 seconds each time. There's no discernible
> difference between the two.
>
> What all this means is that a single call to writeln does not make
> enough difference to be measurable compared to the rest of the program.
> It doesn't mean that the version with writeln is "faster", just that the
> difference is too small and you're probably just seeing background
> noise. If you put the writeln inside a loop, on the other hand, you'll
> see a big difference, because now its cost is magnified by the number of
> times the loop runs. (Say if you put it inside a foreach(i;0..1000) at
> the end of the program, you'll see the difference when you comment it
> out.)
>
> So I'd chalk it up to inherent measurement inaccuracies.
>
>
> T

Thanks, indeed, for n = 5_000_000, I observe the expected result, i.e the writeln version being almost 1 second slower than without.
Below 2_000_000, though, I consistently see the opposite on my machine, i.e the version with writeln being slightly faster. If it was background noise, it would be about equal, no ?

April 27, 2012
On Fri, 27 Apr 2012 10:09:18 -0400, SomeDude <lovelydear@mailmetrash.com> wrote:

> On Wednesday, 25 April 2012 at 17:37:33 UTC, H. S. Teoh wrote:
>>
>> First of all, differences as small as 20ms really should be considered
>> as background noise. The exact measurements depend on a lot of
>> system-specific and environment-specific factors, such as OS memory
>> usage, CPU cache behaviour, disk activity & speed, the exact points of
>> context switches, etc.. If you really want to check for substantial
>> performance differences, you need to magnify your test case so that
>> differences are measured >5 seconds.
>>
>> Second, on my AMD hexacore 64-bit Linux system, the running time
>> consistently measures between 0.57 or 0.58 seconds for both cases. The
>> exact figure changes between runs, and as far as I can tell, there's no
>> discernible difference between the two.
>>
>> Third, to make any timing differences stand out from the background
>> noise, I increased n to 20_000_000, and both versions of the program
>> consistently runs in about 11 seconds each time. There's no discernible
>> difference between the two.
>>
>> What all this means is that a single call to writeln does not make
>> enough difference to be measurable compared to the rest of the program.
>> It doesn't mean that the version with writeln is "faster", just that the
>> difference is too small and you're probably just seeing background
>> noise. If you put the writeln inside a loop, on the other hand, you'll
>> see a big difference, because now its cost is magnified by the number of
>> times the loop runs. (Say if you put it inside a foreach(i;0..1000) at
>> the end of the program, you'll see the difference when you comment it
>> out.)
>>
>> So I'd chalk it up to inherent measurement inaccuracies.
>>
>>
>> T
>
> Thanks, indeed, for n = 5_000_000, I observe the expected result, i.e the writeln version being almost 1 second slower than without.
> Below 2_000_000, though, I consistently see the opposite on my machine, i.e the version with writeln being slightly faster. If it was background noise, it would be about equal, no ?

writeln should not make a difference.  Even walkLength should not, Especially over 100 nodes.

What you are likely seeing is some weird issue with GC cleanup.

Have you tried measuring the code timings just inside main instead of the full execution of the program including runtime startup and shutdown?

-Steve
April 27, 2012
On Friday, 27 April 2012 at 14:14:37 UTC, Steven Schveighoffer wrote:
> Have you tried measuring the code timings just inside main instead of the full execution of the program including runtime startup and shutdown?
>
> -Steve

OK, it seems you are right. It turns out using Measure-Command{...} was a rather poor way of measuring runtimes because of startup/shutdown.

Using StopWatch, I now get consistent results, i.e nearly the same times with or without the writeln, but always slightly faster without. So all is sorted now.
Thx again.
1 2
Next ›   Last »