Thread overview
SSE and AVX with D
May 15, 2012
Pavel Umnikov
May 15, 2012
Pavel Umnikov
May 15, 2012
jerro
May 15, 2012
Walter Bright
May 15, 2012
jerro
May 15, 2012
Hello everyone,

I recently switched from C++ to D and want to rewrite my current engine from scratch in it. My math and physics libraries use many SSE functions (and AVX when the CPU supports it). Can I use SSE/AVX code in D, either through direct SSE/AVX intrinsics or through inline assembler?

Thanks!
May 15, 2012
On 15-05-2012 16:27, Pavel Umnikov wrote:
> Hello everyone,
>
> I recently switched from C++ to D and want to rewrite my current engine
> from scratch in it. My math and physics libraries use many SSE functions
> (and AVX when the CPU supports it). Can I use SSE/AVX code in D, either
> through direct SSE/AVX intrinsics or through inline assembler?
>
> Thanks!

Have a look at these:

* http://dlang.org/phobos/core_cpuid.html
* http://dlang.org/iasm.html
* http://dlang.org/simd.html
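
To illustrate the first link, here is a minimal sketch of a runtime check with core.cpuid. The function name bestSimdLevel and the returned strings are made up for this example; only the sse, sse2 and avx properties come from core.cpuid.

```d
import core.cpuid : avx, sse, sse2;

/// Names the widest instruction set this sketch would dispatch to.
/// (Hypothetical helper, not part of any library.)
string bestSimdLevel()
{
    if (avx)  return "avx";   // CPU reports AVX
    if (sse2) return "sse2";  // CPU reports SSE2
    if (sse)  return "sse";   // CPU reports SSE
    return "scalar";          // fall back to plain floating-point code
}

unittest
{
    // The result depends on the machine, so only check that it is one of the known names.
    auto level = bestSimdLevel();
    assert(level == "avx" || level == "sse2" || level == "sse" || level == "scalar");
}
```

An engine could call such a function once at startup and pick the matching code path.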

-- 
- Alex
May 15, 2012
On Tuesday, 15 May 2012 at 14:28:51 UTC, Alex Rønne Petersen wrote:
> On 15-05-2012 16:27, Pavel Umnikov wrote:
>> Hello everyone,
>>
>> I recently switched from C++ to D and want to rewrite my current engine
>> from scratch in it. My math and physics libraries use many SSE functions
>> (and AVX when the CPU supports it). Can I use SSE/AVX code in D, either
>> through direct SSE/AVX intrinsics or through inline assembler?
>>
>> Thanks!
>
> Have a look at these:
>
> * http://dlang.org/phobos/core_cpuid.html
> * http://dlang.org/iasm.html
> * http://dlang.org/simd.html

Thank you, Alex!
May 15, 2012
On Tuesday, 15 May 2012 at 14:32:20 UTC, Pavel Umnikov wrote:
> On Tuesday, 15 May 2012 at 14:28:51 UTC, Alex Rønne Petersen wrote:
>> On 15-05-2012 16:27, Pavel Umnikov wrote:
>>> Hello everyone,
>>>
>>> I recently switched from C++ to D and want to rewrite my current engine
>>> from scratch in it. My math and physics libraries use many SSE functions
>>> (and AVX when the CPU supports it). Can I use SSE/AVX code in D, either
>>> through direct SSE/AVX intrinsics or through inline assembler?
>>>
>>> Thanks!
>>
>> Have a look at these:
>>
>> * http://dlang.org/phobos/core_cpuid.html
>> * http://dlang.org/iasm.html
>> * http://dlang.org/simd.html
>
> Thank you, Alex!

Note that core.simd currently only defines SSE intrinsics for
instructions of the form

INSTRUCTION xmm1, xmm2/m128

which means that instructions such as shufps are not supported.
You could take a look at GDC, which provides the GCC builtins
through the module gcc.builtins. To find the builtin names, you can
look at GCC's implementation of xmmintrin.h. GDC also
produces faster code than DMD, especially for floating-point
code. It does not support AVX yet, though.
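
As a rough illustration of the "INSTRUCTION xmm1, xmm2/m128" form, the sketch below adds two float4 vectors. It assumes an x86-64 target where core.simd defines float4. The function name addLanes is hypothetical, and the static if is only there so the same code falls back to the vector operator on compilers that do not provide __simd.

```d
import core.simd;

/// Adds two vectors of four floats lane by lane.
float4 addLanes(float4 a, float4 b)
{
    static if (__traits(compiles, __simd(XMM.ADDPS, a, b)))
    {
        // Explicit intrinsic of the form ADDPS xmm1, xmm2/m128.
        return cast(float4) __simd(XMM.ADDPS, a, b);
    }
    else
    {
        // No __simd available: use the vector operator instead.
        return a + b;
    }
}

unittest
{
    float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
    float4 b = [10.0f, 20.0f, 30.0f, 40.0f];
    float4 c = addLanes(a, b);
    assert(c.array == [11.0f, 22.0f, 33.0f, 44.0f]);
}
```

A shuffle such as shufps needs an extra immediate operand, which is why it does not fit this two-operand form.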

If you want to use AVX for operations that don't have an
operator, currently your only choice (AFAIK) is to use LDC
and an ugly workaround that I used at
https://github.com/jerro/pfft. You write your "intrinsics"
in C and use clang to compile them to .bc (or write a .ll
file manually if you know the LLVM assembly language). Then
you compile your D code with LDC using the flags -output-bc
and -singleobj. You merge the resulting .bc file with the
"intrinsics" file using llvm-link, then optimize the result using
opt and convert it to assembly using llc. Here is an
example:

https://github.com/jerro/pfft/blob/master/build-ldc2.sh

I have only tried this on Linux.
May 15, 2012
On 5/15/2012 9:39 AM, jerro wrote:
> Note that core.simd currently only defines SSE intrinsics for
> instructions of the form
>
> INSTRUCTION xmm1, xmm2/m128
>
> which means that instructions such as shufps are not supported.
> You could take a look at GDC, which provides the GCC builtins
> through the module gcc.builtins. To find the builtin names, you can
> look at GCC's implementation of xmmintrin.h. GDC also
> produces faster code than DMD, especially for floating-point
> code. It does not support AVX yet, though.
>
> If you want to use AVX for operations that don't have an
> operator, currently your only choice (AFAIK) is to use LDC
> and an ugly workaround that I used at
> https://github.com/jerro/pfft. You write your "intrinsics"
> in C and use clang to compile them to .bc (or write a .ll
> file manually if you know the LLVM assembly language). Then
> you compile your D code with LDC using the flags -output-bc
> and -singleobj. You merge the resulting .bc file with the
> "intrinsics" file using llvm-link, then optimize the result using
> opt and convert it to assembly using llc. Here is an
> example:
>
> https://github.com/jerro/pfft/blob/master/build-ldc2.sh
>
> I have only tried this on Linux.

You can use the inline assembler for shufps, also for AVX.
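
For reference, a minimal sketch of shufps through the inline assembler could look like the following. The function name reverseLanes and the choice of the 0x1B immediate are assumptions made for this example. The asm branch is only compiled where D_InlineAsm_X86_64 is defined, and a plain D fallback is used elsewhere.

```d
/// Writes the four floats at src to dst in reverse order.
void reverseLanes(const(float)* src, float* dst)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, src;            // load the two pointers into registers
            mov RCX, dst;
            movups XMM0, [RAX];      // unaligned load of 4 floats
            shufps XMM0, XMM0, 0x1B; // 0x1B picks source lanes 3, 2, 1, 0
            movups [RCX], XMM0;      // unaligned store
        }
    }
    else
    {
        // Fallback without inline assembler; copy first so src and dst may alias.
        float[4] tmp = src[0 .. 4];
        foreach (i; 0 .. 4)
            dst[i] = tmp[3 - i];
    }
}

unittest
{
    float[4] src = [1.0f, 2.0f, 3.0f, 4.0f];
    float[4] dst;
    reverseLanes(src.ptr, dst.ptr);
    assert(dst == [4.0f, 3.0f, 2.0f, 1.0f]);
}
```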
May 15, 2012
> You can use the inline assembler for shufps, also for AVX.

Of course you can; I forgot to mention that. I do that in
parts of pfft when it is compiled with DMD (but only for
SSE). But because of the overhead of copying values from the
stack to registers and back to the stack, or of calling a function,
it only makes sense when the chunk of code you are
replacing with inline assembly takes longer than a few cycles.
This forces you to write larger chunks of code in inline
assembly, which is not always practical.
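
To make the point about larger chunks concrete, here is a hedged sketch that keeps a whole loop inside one asm block, so the load/store overhead is paid once per four floats rather than once per operation. The function name scaleInPlace is hypothetical, and the code assumes that n is a multiple of 4. It is not taken from pfft.

```d
/// Multiplies n floats by factor in place; n is assumed to be a multiple of 4.
void scaleInPlace(float* data, size_t n, float factor)
{
    assert(n % 4 == 0, "this sketch only handles multiples of 4");
    version (D_InlineAsm_X86_64)
    {
        if (n == 0)
            return;
        float f = factor;         // local copy that the asm block refers to by name
        asm
        {
            mov RAX, data;        // current position in the array
            mov RCX, n;           // floats left to process
            movss XMM1, f;        // factor in lane 0
            shufps XMM1, XMM1, 0; // broadcast lane 0 to all four lanes
        Lloop:
            movups XMM0, [RAX];   // load 4 floats
            mulps XMM0, XMM1;     // multiply them by the factor
            movups [RAX], XMM0;   // store them back
            add RAX, 16;          // advance by 4 floats
            sub RCX, 4;
            jnz Lloop;            // repeat until no floats are left
        }
    }
    else
    {
        // Fallback without inline assembler.
        foreach (i; 0 .. n)
            data[i] *= factor;
    }
}

unittest
{
    float[8] v = [1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f];
    scaleInPlace(v.ptr, v.length, 2.0f);
    assert(v == [2.0f, 4.0f, 6.0f, 8.0f, 10.0f, 12.0f, 14.0f, 16.0f]);
}
```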