July 15, 2020
On Wednesday, 15 July 2020 at 05:57:56 UTC, tastyminerals wrote:
> [snip]
>
> Here is a (WIP) project as of now.
> Line 160 in https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d
>
> std of [60, 60] matrix 0.0389492 (> 0.001727)
> std of [300, 300] matrix 1.03592 (> 0.043452)
> std of [600, 600] matrix 4.2875 (> 0.182177)
> std of [800, 800] matrix 7.9415 (> 0.345367)

I changed the dflags-ldc to "-mcpu=native -O" and compiled with `dub run --compiler=ldc2`. I got results similar to yours for both functions in the initial run.
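
For reference, the relevant configuration fragment might look like this (a hypothetical dub.json excerpt; the layout in the actual repo may differ):

{
    "dflags-ldc": ["-mcpu=native", "-O"]
}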

I changed sd to

import mir.math.common : fmamath, sqrt;
import mir.math.stat : mean;
import mir.math.sum : sum;
import mir.ndslice.slice : Slice;
import mir.ndslice.topology : map;

@fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;
    double n = cast(double) flatMatrix.length;
    double mu = flatMatrix.mean;
    // population variance with compensated ("precise") summation
    return (flatMatrix.map!(a => (a - mu) ^^ 2)
            .sum!"precise" / n).sqrt;
}

and got

std of [10, 10] matrix 0.0016321
std of [20, 20] matrix 0.0069788
std of [300, 300] matrix 2.42063
std of [60, 60] matrix 0.0828711
std of [600, 600] matrix 9.72251
std of [800, 800] matrix 18.1356

The biggest change by far came from using `sum!"precise"` instead of `sum!"fast"`.
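
The difference is easy to reproduce in isolation. A minimal sketch with contrived values showing what "fast" loses and "precise" recovers:

import mir.math.sum : sum;
import std.stdio : writeln;

void main()
{
    // catastrophic cancellation: the two 1.0s vanish in naive/fast order
    auto data = [1.0, 1e100, 1.0, -1e100];
    writeln(data.sum!"fast");    // typically 0
    writeln(data.sum!"precise"); // 2, recovered by compensated summation
}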

When I ran your benchStd function with
ans = matrix.flattened.standardDeviation!(double, "online", "fast");
I got
std of [10, 10] matrix 1e-07
std of [20, 20] matrix 0
std of [300, 300] matrix 0
std of [60, 60] matrix 1e-07
std of [600, 600] matrix 0
std of [800, 800] matrix 0

I got the same result with Summation.naive. That almost seems too low.

The default is Summation.appropriate, which resolves to Summation.pairwise in this case. It is faster than Summation.precise but still slower than Summation.naive or Summation.fast. Your welfordSD should line up with Summation.naive.
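
That pairing can be requested explicitly. A self-contained sketch (the constant-filled matrix is just a hypothetical stand-in for real benchmark data):

import mir.math.stat : standardDeviation, VarianceAlgo;
import mir.math.sum : Summation;
import mir.ndslice : flattened, slice;
import std.stdio : writeln;

void main()
{
    // Welford-style online variance with plain (naive) summation,
    // which is what welfordSD should correspond to
    auto matrix = slice!double([600, 600], 1.0);
    auto ans = matrix.flattened
        .standardDeviation!(VarianceAlgo.online, Summation.naive);
    writeln(ans); // 0 for a constant matrix
}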

When I change that to
ans = matrix.flattened.standardDeviation!(double, "online", "precise");
I get
Running .\mir_benchmarks_2.exe
std of [10, 10] matrix 0.0031737
std of [20, 20] matrix 0.0153603
std of [300, 300] matrix 4.15738
std of [60, 60] matrix 0.171211
std of [600, 600] matrix 17.7443
std of [800, 800] matrix 34.2592

I also tried changing your welfordSD function based on the stuff I mentioned above, but it did not make a large difference.
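
For readers without the repo open: welfordSD presumably implements Welford's online algorithm, something like this (a hypothetical reconstruction; the actual code lives at the link above):

import std.math : sqrt;

double welfordSD(const double[] xs)
{
    double mean = 0.0, m2 = 0.0;
    size_t n = 0;
    foreach (x; xs)
    {
        ++n;
        immutable delta = x - mean;
        mean += delta / n;        // update running mean
        m2 += delta * (x - mean); // update sum of squared deviations
    }
    // population standard deviation, matching the sd() above
    return n > 0 ? sqrt(m2 / n) : 0.0;
}
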
July 15, 2020
On Wednesday, 15 July 2020 at 11:23:00 UTC, jmh530 wrote:
> On Wednesday, 15 July 2020 at 05:57:56 UTC, tastyminerals wrote:
>> [snip]
>>
>> Here is a (WIP) project as of now.
>> Line 160 in https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d
>>
>> std of [60, 60] matrix 0.0389492 (> 0.001727)
>> std of [300, 300] matrix 1.03592 (> 0.043452)
>> std of [600, 600] matrix 4.2875 (> 0.182177)
>> std of [800, 800] matrix 7.9415 (> 0.345367)
>
> I changed the dflags-ldc to "-mcpu=native -O" and compiled with `dub run --compiler=ldc2`. I got results similar to yours for both functions in the initial run.
>
> I changed sd to
>
> @fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)

@fastmath violates all summation algorithms except `"fast"`.
The same bug is in the original author's post.
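
The reason, roughly: @fastmath (LDC's ldc.attributes.fastmath) permits floating-point reassociation, which lets the optimizer fold the correction term of a compensated summation to zero. An illustrative sketch with a hand-written Kahan loop (not mir's internals):

import ldc.attributes : fastmath;

@fastmath double kahanSum(const double[] xs)
{
    double sum = 0.0, c = 0.0; // c carries the lost low-order bits
    foreach (x; xs)
    {
        immutable y = x - c;
        immutable t = sum + y;
        c = (t - sum) - y; // fast-math may simplify this to 0,
        sum = t;           // silently degrading to naive summation
    }
    return sum;
}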

July 15, 2020
On Wednesday, 15 July 2020 at 11:26:19 UTC, 9il wrote:
> [snip]
>>
>> @fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)
>
> @fastmath violates all summation algorithms except `"fast"`.
> The same bug is in the original author's post.

I hadn't realized that @fmamath was the problem rather than @fastmath overall. @fmamath is used on many mir.math.stat functions, though admittedly not in the accumulators.
July 15, 2020
On Wednesday, 15 July 2020 at 11:37:23 UTC, jmh530 wrote:
> On Wednesday, 15 July 2020 at 11:26:19 UTC, 9il wrote:
>> [snip]
>>>
>>> @fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)
>>
>> @fastmath violates all summation algorithms except `"fast"`.
>> The same bug is in the original author's post.
>
> I hadn't realized that @fmamath was the problem rather than @fastmath overall. @fmamath is used on many mir.math.stat functions, though admittedly not in the accumulators.

Ah, no, my bad! You wrote @fmamath, but I read it as @fastmath. @fmamath is OK here.
July 15, 2020
On Wednesday, 15 July 2020 at 11:41:35 UTC, 9il wrote:
> [snip]
>
> Ah, no, my bad! You wrote @fmamath, but I read it as @fastmath. @fmamath is OK here.

I've mixed up @fastmath and @fmamath as well. No worries.
July 16, 2020
On Wednesday, 15 July 2020 at 07:51:31 UTC, 9il wrote:
> On Wednesday, 15 July 2020 at 07:34:59 UTC, tastyminerals wrote:
>> On Wednesday, 15 July 2020 at 06:57:21 UTC, 9il wrote:
>>> On Wednesday, 15 July 2020 at 06:55:51 UTC, 9il wrote:
>>>> On Wednesday, 15 July 2020 at 06:00:46 UTC, tastyminerals wrote:
>>>>> On Wednesday, 15 July 2020 at 02:08:48 UTC, 9il wrote:
>>>>>> [...]
>>>>>
>>>>> Good to know. So, it's fine to use it with sum!"fast", but better to avoid it for general purposes.
>>>>
>>>> They both are more precise by default.
>>>
>>> This was a reply to your other post in the thread, sorry. Mir algorithms are more precise by default than the algorithms you have provided.
>>
>> Right. Is this why standardDeviation is significantly slower?
>
> Yes. It allows you to pick a summation option; you can try options other than the default in your benchmarks.

Indeed, I played around with the VarianceAlgo and Summation options, and they have a large impact on the timings.

ans = matrix.flattened.standardDeviation!(VarianceAlgo.naive, Summation.appropriate);
std of [300, 300] matrix 0.375903
std of [60, 60] matrix 0.0156448
std of [600, 600] matrix 1.54429
std of [800, 800] matrix 3.03954

ans = matrix.flattened.standardDeviation!(VarianceAlgo.online, Summation.appropriate);
std of [300, 300] matrix 1.12404
std of [60, 60] matrix 0.041968
std of [600, 600] matrix 5.01617
std of [800, 800] matrix 8.64363


Summation.fast behaves strangely, though. I wonder what happened here?

ans = matrix.flattened.standardDeviation!(VarianceAlgo.naive, Summation.fast);
std of [300, 300] matrix 1e-06
std of [60, 60] matrix 9e-07
std of [600, 600] matrix 1.2e-06
std of [800, 800] matrix 9e-07

ans = matrix.flattened.standardDeviation!(VarianceAlgo.online, Summation.fast);
std of [300, 300] matrix 9e-07
std of [60, 60] matrix 9e-07
std of [600, 600] matrix 1.1e-06
std of [800, 800] matrix 1e-06