performance cost of sample conversion
September 07, 2017
If I have a non-double buffer and temporarily convert to double, then convert back, do I save many cycles rather than just using a double buffer? I know it will be a lot more memory, but I'm specifically talking about the cycles spent converting to and from double vs. no conversion.

Using a double for everything gives the highest precision and makes things much easier, but is that the way to go, or does it cost quite a bit in performance?
September 06, 2017
On 09/06/2017 07:06 PM, Psychological Cleanup wrote:
> if I have a non-double buffer and temporarily convert to double then
> convert back, do I save many cycles rather than just using a double
> buffer? I know it will be a lot more memory, but I'm specifically talking
> about the cycles in converting to and from vs no conversion.
>
> Using a double for everything gives the highest precision and makes
> things much easier but is that the way to go or does it cost quite a
> bit in performance?

You have to measure. Here's a start:

import std.conv;
import std.range;
import std.datetime.stopwatch; // StopWatch and AutoStart live here in current Phobos
import std.stdio;

// The per-sample computation, always done in double precision.
double workWithDouble(double d) {
    return d * d / 7;
}

// float buffer: widen to double, compute, narrow back to float.
void workWithFloats(float[] floats) {
    foreach (ref f; floats) {
        f = workWithDouble(f).to!float;
    }
}

// double buffer: no conversion needed.
void workWithDoubles(double[] doubles) {
    foreach (ref d; doubles) {
        d = workWithDouble(d);
    }
}

void main() {
    foreach (n; [ 1_000, 1_000_000, 10_000_000 ]) {
        const beg = -1f;
        const end = 1f;
        const step = (end - beg) / n;
        auto floats = iota(beg, end, step).array;
        auto doubles = iota(double(beg), end, step).array;
        {
            auto sw = StopWatch(AutoStart.yes);
            workWithDoubles(doubles);
            writefln("%10s no   conversion: %10s usecs", n, sw.peek.total!"usecs");
        }
        {
            auto sw = StopWatch(AutoStart.yes);
            workWithFloats(floats);
            writefln("%10s with conversion: %10s usecs", n, sw.peek.total!"usecs");
        }
    }
}

Conversion seems to be more costly:

      1000 no   conversion:         27 usecs
      1000 with conversion:         40 usecs
   1000000 no   conversion:       1715 usecs
   1000000 with conversion:       5412 usecs
  10000000 no   conversion:      16280 usecs
  10000000 with conversion:      47190 usecs

Ali

September 07, 2017
Thanks. My results:

dmd x86 debug
    asserts on the line `auto floats = iota(beg, end, step).array;`


dmd x64 debug
      1000 no   conversion:         15 usecs
      1000 with conversion:          5 usecs
   1000000 no   conversion:       2824 usecs
   1000000 with conversion:       5689 usecs
  10000000 no   conversion:      24148 usecs
  10000000 with conversion:      56335 usecs

dmd release x86
      1000 no   conversion:          1 usecs
      1000 with conversion:          1 usecs
   1000000 no   conversion:       1903 usecs
   1000000 with conversion:       1262 usecs
  10000000 no   conversion:      19156 usecs
  10000000 with conversion:      12831 usecs

dmd release x64
      1000 no   conversion:          4 usecs
      1000 with conversion:         17 usecs
   1000000 no   conversion:       4531 usecs
   1000000 with conversion:       4516 usecs
  10000000 no   conversion:      45928 usecs
  10000000 with conversion:      46080 usecs

ldc x86 debug
      1000 no   conversion:          3 usecs
      1000 with conversion:         32 usecs
   1000000 no   conversion:       3563 usecs
   1000000 with conversion:      19240 usecs
  10000000 no   conversion:      35986 usecs
  10000000 with conversion:     192025 usecs

ldc x64 debug
      1000 no   conversion:          2 usecs
      1000 with conversion:         10 usecs
   1000000 no   conversion:       2855 usecs
   1000000 with conversion:      10309 usecs
  10000000 no   conversion:      28254 usecs
  10000000 with conversion:     101380 usecs

ldc x86 release
      1000 no   conversion:          0 usecs
      1000 with conversion:          0 usecs
   1000000 no   conversion:       1280 usecs
   1000000 with conversion:        532 usecs
  10000000 no   conversion:      10403 usecs
  10000000 with conversion:       5752 usecs

ldc x64 release
      1000 no   conversion:          0 usecs
      1000 with conversion:          1 usecs
   1000000 no   conversion:        887 usecs
   1000000 with conversion:        550 usecs
  10000000 no   conversion:      10730 usecs
  10000000 with conversion:       5482 usecs

The results are strange; sometimes the conversion wins.

September 07, 2017
On Thursday, 7 September 2017 at 05:45:58 UTC, Ali Çehreli wrote:
>
> You have to measure.

Indeed.

> Here's a start:

The program has way too many things pre-defined, and the semantics are such that workWithDoubles can be completely eliminated... So you are not measuring what you want to be measuring.
Make stuff depend on argc, and print the result of calculations or do something else such that the calculation must be performed. When measuring without LTO, probably attaching @weak onto the workWith* functions will work too. (pragma(inline, false) does not prevent reasoning about the function)

-Johan
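
A minimal sketch of what that suggestion could look like, assuming a current Phobos with std.datetime.stopwatch. The helpers makeBuffer and checksum are hypothetical names introduced here for illustration; they are not from the original program. The element count comes from the command line so it is not a compile-time constant, and a checksum of each buffer is printed so the calculation cannot be dropped as dead code. The buffers are filled from integer indices rather than a floating-point iota. This is not a definitive benchmark: each case is timed only once, and the LDC-specific @weak attribute mentioned above is not used.

```d
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Same per-sample computation as in the program being discussed.
double workWithDouble(double d) {
    return d * d / 7;
}

// float buffer: widen to double, compute, narrow back to float.
void workWithFloats(float[] floats) {
    foreach (ref f; floats) {
        f = cast(float) workWithDouble(f);
    }
}

// double buffer: no conversion.
void workWithDoubles(double[] doubles) {
    foreach (ref d; doubles) {
        d = workWithDouble(d);
    }
}

// Hypothetical helper: n evenly spaced values in [-1, 1),
// computed from integer indices.
T[] makeBuffer(T)(size_t n) {
    auto buf = new T[n];
    foreach (i, ref x; buf) {
        x = cast(T)(-1.0 + 2.0 * i / n);
    }
    return buf;
}

// Hypothetical helper: sums a buffer so that its contents are actually used.
double checksum(T)(const(T)[] values) {
    double total = 0;
    foreach (v; values) {
        total += v;
    }
    return total;
}

void main(string[] args) {
    // Runtime input: the optimizer cannot assume a fixed element count.
    const n = args.length > 1 ? args[1].to!size_t : 1_000_000;

    auto doubles = makeBuffer!double(n);
    auto floats = makeBuffer!float(n);

    auto sw = StopWatch(AutoStart.yes);
    workWithDoubles(doubles);
    const noConversion = sw.peek.total!"usecs";

    sw.reset(); // restarts from zero; the stopwatch keeps running
    workWithFloats(floats);
    const withConversion = sw.peek.total!"usecs";

    // Printing the checksums makes the results observable.
    writefln("%10s no   conversion: %10s usecs (checksum %s)",
             n, noConversion, checksum(doubles));
    writefln("%10s with conversion: %10s usecs (checksum %s)",
             n, withConversion, checksum(floats));
}
```

Run it with the element count as the first argument, for example 10000000. If the two checksums differ noticeably, that difference is the precision lost by storing the samples as float.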