AVX for math code ... avx instructions later disappearing ? - D Programming Language Discussion Forum

September 26, 2021

AVX for math code ... avx instructions later disappearing ?

Posted by james.p.leblanc

Dear D-ers,

I enjoyed reading some details of incorporating AVX into math code
from Johan Engelen's programming blog post:

http://johanengelen.github.io/ldc/2016/10/11/Math-performance-LDC.html

Basically, one can get the ldc compiler to emit AVX instructions, nice!

In playing with some variants of his example code, I realize
that there are issues I do not understand. For example, the following
code successfully incorporates the avx instructions:

// File here is called dotFirst.d
import ldc.attributes : fastmath;
@fastmath

double dot( double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length) {
        s += a[i] * b[i];
    }
    return s;
}

double[8] x =[0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, ];
double[8] y =[0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, ];

void main()
{
    double z = 0.0;
    z = dot(x, y);
}

If we run:

ldc2 -c -output-s -O3 -release dotFirst.d -mcpu=haswell
echo "Results of grep ymm dotFirst.s:"
grep ymm dotFirst.s

The "grep" shows a number of vector instructions, such as:

vfmadd132pd 160(%rcx,%rdi,8), %ymm5, %ymm1

However, subtle changes in the code (such as moving the dot product
function to a module, or even moving the array declarations to before
the dot product function), and the avx instructions will disappear!

import ldc.attributes : fastmath;
@fastmath

double[8] x =[0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, ];
double[8] y =[0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, ];

double dot( double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length) {
...

Now a grep will not find a single ymm.

It is understood that ldc needs proper alignment to be able to do the vector
instructions...

But my question is: how is proper alignment guaranteed? (Most importantly,
how is it guaranteed among code using modules?) (There are related stack alignment
issues -- 16?)

Best Regards,
James

PS I have come across scattered bits of (sometimes contradictory) information on
avx/simd for dlang. Is there a canonical source for vector info?

September 26, 2021

Re: AVX for math code ... avx instructions later disappearing ?

Posted by kinke
in reply to james.p.leblanc

On Sunday, 26 September 2021 at 18:08:46 UTC, james.p.leblanc wrote:

> or even moving the array declarations to before
> the dot product function, and the avx instructions will disappear!

That's because the @fastmath UDA applies to the next declaration only, which is the x array in your 2nd example (where it obviously has no effect). Either use @fastmath: with the colon to apply it to the entire scope, or use -ffast-math in the LDC cmdline.
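
A minimal sketch of the colon form, assuming LDC (ldc.attributes is LDC-specific) and reusing the arrays and dot function from the question. With the colon, the attribute should reach dot no matter where the arrays are declared:

```d
// Sketch only: the second example from the question with the attribute fixed.
// Assumes LDC; ldc.attributes is not available in DMD.
import ldc.attributes : fastmath;

// The trailing colon applies @fastmath to every declaration that follows
// in this scope, so the order of the declarations below no longer matters.
// (Alternatively, write `@fastmath` directly in front of `double dot(...)`.)
@fastmath:

double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

double dot(double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length)
        s += a[i] * b[i];
    return s;
}

void main()
{
    // x[] and y[] slice the static arrays into double[] arguments.
    double z = dot(x[], y[]);
}
```

Compiling this with the same ldc2 command line as in the question should bring the ymm instructions back in the grep output. Leaving the source as it was and adding -ffast-math to that command line is the other route mentioned above.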

Similarly, when moving the function to another module and you don't include that module in the cmdline, it's only imported and not compiled and won't show up in the resulting assembly.
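
As a rough illustration of the module case, assuming hypothetical file names dotmod.d and main.d (neither appears in the thread), the function could live in its own module like this, with the command line noted in the comments:

```d
// File: dotmod.d (hypothetical module name, for illustration only).
// Assumes LDC; ldc.attributes is not available in DMD.
module dotmod;

import ldc.attributes : fastmath;

// The attribute sits directly on the function, so it cannot be
// picked up by some other declaration by accident.
@fastmath
double dot(double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length)
        s += a[i] * b[i];
    return s;
}

// A hypothetical main.d would contain `import dotmod;` and call dot().
// Importing alone does not compile dotmod.d. To get assembly for dot,
// pass both files to the compiler, for example:
//
//   ldc2 -c -output-s -O3 -release -mcpu=haswell main.d dotmod.d
//   grep ymm *.s
```

The grep runs over every .s file in the directory because the exact output file names depend on how LDC is invoked.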

Wrt. stack alignment, there aren't any issues with LDC AFAIK (not limited to 16 or whatever like DMD).

September 26, 2021

Re: AVX for math code ... avx instructions later disappearing ?

Posted by james.p.leblanc
in reply to kinke

Kinke,

Thanks very much for your response. There were many issues that I
had been misunderstanding in my attempts. The provided explanation
helped me understand the broader scope of what is happening.

(I never even thought about the @fastmath UDA aspect! ... a bit
embarrassing for me!) Using -ffast-math in the LDC
cmdline seems to be the most elegant solution.

Much appreciated!
Regards,
James

Copyright © 1999-2021 by the D Language Foundation