D-Ers,
I have been getting counterintuitive results on avx/no-avx timing
experiments. Storyline to date (notes at end):
Experiment #1) Real float data type (i.e. non-complex numbers),
speed comparison.
a) moving from non-avx --> avx shows an unrealistic speedup of 15-25X.
b) this is weird, but the story continues ...
Experiment #2) Real double data type (non-complex numbers),
a) moving from non-avx --> avx again shows amazing gains, but the
gains are about half of those seen in Experiment #1, so maybe
this looks plausible?
Experiment #3) Complex!float datatypes:
a) now going from non-avx to avx shows a serious performance LOSS
of 40% to breaking even at best. What is happening here?
Experiment #4) Complex!double:
a) non-avx --> avx shows performance gains again, of about 2X (so the
gains appear to be reasonable).
The main question I have is:
"What is going on with the Complex!float performance?" One might expect
floats to have better performance than doubles, as we saw with the
real-valued data (because of vector packing, memory bandwidth, etc).
But Complex!float shows MUCH WORSE avx performance than Complex!double
(by a factor of almost 4).
// Table of Computation Times
//
//        self math              std math
//   explicit  no-explicit   explicit  no-explicit
//    align      align        align      align
      0.12       0.21         0.15       0.21    ; # Float with AVX
      3.23       3.24         3.30       3.22    ; # Float without AVX
      0.31       0.42         0.31       0.42    ; # Double with AVX
      3.25       3.24         3.24       3.27    ; # Double without AVX
      6.42       6.62         6.61       6.59    ; # Complex!float with AVX
      4.04       4.17         6.68       5.82    ; # Complex!float without AVX
      1.67       1.69         1.73       1.71    ; # Complex!double with AVX
      3.34       3.42         3.28       3.31      # Complex!double without AVX
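For anyone who wants the ratios rather than the raw times, here is a small
sketch that recomputes them from the "self math, explicit align" column
only. The function and array names are just my labels for this post and are
not taken from the benchmark code. With these inputs the speedups come out
at roughly 27X, 10X, 0.63X and 2X, and Complex!float with avx is about 3.8
times slower than Complex!double with avx.

```d
import std.stdio : writefln;

// Speedup = time without AVX / time with AVX.
// Values above 1 mean the AVX build is faster.
double speedup(double noAvx, double avx)
{
    return noAvx / avx;
}

void main()
{
    // "self math, explicit align" column of the table, in table order.
    immutable string[4] names =
        ["float", "double", "Complex!float", "Complex!double"];
    immutable double[4] withAvx    = [0.12, 0.31, 6.42, 1.67];
    immutable double[4] withoutAvx = [3.23, 3.25, 4.04, 3.34];

    foreach (i, name; names)
        writefln("%-15s %6.2fx", name, speedup(withoutAvx[i], withAvx[i]));

    // Complex!float vs Complex!double, both built with AVX.
    writefln("Complex!float / Complex!double (AVX): %.2f",
             withAvx[2] / withAvx[3]);
}
```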
Notes:
- Based on forum hints from the ldc experts, I got good guidance
  on enabling avx (i.e. compiling modules on the command line, using
  --fast-math and -mcpu=haswell).
- From the Mir-glas experts I received hints to try implementing my own
  version of the complex math (this is what the "self math" column
  refers to).
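To make "self math" concrete, here is a minimal sketch of the idea: a
hand-written complex type with its own multiply, checked against
std.complex. This is NOT the benchmark code; the name MyComplex and the
test values are made up for illustration only.

```d
import std.complex : Complex;
import std.stdio : writeln;

// Hypothetical hand-written complex type (illustration only,
// not the type used in the timing experiments).
struct MyComplex(T)
{
    T re;
    T im;

    // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    MyComplex opBinary(string op)(MyComplex rhs) const
        if (op == "*")
    {
        return MyComplex(re * rhs.re - im * rhs.im,
                         re * rhs.im + im * rhs.re);
    }
}

void main()
{
    // "self math": (1 + 2i) * (3 - i) = 5 + 5i
    auto a = MyComplex!float(1.0f, 2.0f);
    auto b = MyComplex!float(3.0f, -1.0f);
    auto p = a * b;

    // "std math": the same product using std.complex
    auto q = Complex!float(1.0f, 2.0f) * Complex!float(3.0f, -1.0f);

    writeln(p.re, " ", p.im);
    writeln(q.re, " ", q.im);

    // Small integer-valued inputs, so both products are exact.
    assert(p.re == q.re && p.im == q.im);
}
```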
I understand that the details of the computations are not included here (I
can do that if there is interest, and if I figure out an effective way to present
it in a forum).
But I thought I might begin with a simple question: "Is there some well-known
issue that I am missing here?" Have others been down this road as well?
Thanks for any and all input.
Best Regards,
James
PS Sorry for the inelegant table ... I do not believe there is a way
to include beautiful bar charts on this forum. (Please correct me
if there is a way...)