July 18, 2013 Simd instructions

As a final step to compute the product of two complex numbers I perform a SIMD operation on double2:

    x3 = [x3.array[1] - x2.array[1], x3.array[0] + x2.array[0]];

But ldc2 compiles that quite badly (I don't know who's to blame; if necessary I will open an LLVM bug report), so I have tried to use the addsubpd instruction. To do that I imported ldc.gccbuiltins_x86 and then used:

    x3 = __builtin_ia32_addsubpd(x3, x2);

but ldc2 gives me:

    LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse3.addsub.pd
    Stack dump:
    0. Running pass 'X86 DAG->DAG Instruction Selection' on function '@"\01__D12complex_mul217__T8compMul6Vk12Z8compMul6FNaNbNfKG12NhG2dKG12NhG2dKG12NhG2dZv"'

Can you help me?

Bye,
bearophile

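For readers unfamiliar with the instruction: SSE3 ADDSUBPD subtracts in the low lane and adds in the high lane. The sketch below is a plain scalar model of that behaviour in D, not the poster's code. The names `addsub` and `complexMul` and the `[re, im]` layout are assumptions made for illustration. Note that the scalar expression in the post reads its operands from swapped lanes, so the surrounding shuffles presumably differ between the two versions.

```d
import std.stdio : writeln;

// Hypothetical scalar model of the ADDSUBPD lane semantics:
// lane 0 is subtracted, lane 1 is added.
double[2] addsub(const double[2] a, const double[2] b)
{
    return [a[0] - b[0], a[1] + b[1]];
}

// Complex product x * y with an assumed [re, im] layout.
// The last step is exactly the subtract/add pattern modelled above.
double[2] complexMul(const double[2] x, const double[2] y)
{
    const double[2] t1 = [x[0] * y[0], x[0] * y[1]]; // [xr*yr, xr*yi]
    const double[2] t2 = [x[1] * y[1], x[1] * y[0]]; // [xi*yi, xi*yr]
    return addsub(t1, t2); // [xr*yr - xi*yi, xr*yi + xi*yr]
}

void main()
{
    const double[2] a = [1.0, 2.0]; // 1 + 2i
    const double[2] b = [3.0, 4.0]; // 3 + 4i
    // Mathematically, (1 + 2i)(3 + 4i) = -5 + 10i.
    writeln(complexMul(a, b));
}
```

A scalar model like this can serve as a reference when checking a SIMD version built on `__builtin_ia32_addsubpd`.
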
July 18, 2013 Re: Simd instructions

Posted in reply to bearophile

Try adding flag -mattr=sse3.

July 18, 2013 Re: Simd instructions

Posted in reply to jerro

jerro:
> Try adding flag -mattr=sse3.

Now it's accepted, thank you. So is LDC2 assuming a very old CPU? :-)

Bye,
bearophile

July 23, 2013 Re: Simd instructions

Posted in reply to bearophile

On Thursday, 18 July 2013 at 21:30:12 UTC, bearophile wrote:
> jerro:
>
>> Try adding flag -mattr=sse3.
>
> Now it's accepted, thank you. So is LDC2 assuming a very old CPU? :-)
>
> Bye,
> bearophile

Hi,

the behaviour was changed because you can't create a generic package if you optimize for your CPU. But the change created other problems, see issue #414 (https://github.com/ldc-developers/ldc/issues/414).

With LLVM 3.3, the auto vectorizer is not enabled. You have to specify -vectorize on the command line. Maybe you want to try that with your original code.

Kai