On Tuesday, 27 June 2023 at 09:31:50 UTC, Sergey wrote:
>On Monday, 26 June 2023 at 20:22:24 UTC, Cecil Ward wrote:
>Anyone up for it?
Hi. Every problem from this set of benchmarks has a D solution: https://programming-language-benchmarks.vercel.app/
...
Nice link.
Also, GDC and LDC each have their own special features for vectorization. Maybe Bruce Carneal could share some examples/benchmarks of them.
The TL;DR from my auto-vectorization explorations is that GDC and LDC are very close in performance. GDC has a slight edge on some patterned "gathers" (like pulling elements from an RGGB Bayer pattern) and in exploiting the per-lane capabilities of later AVX-512, although I expect LDC will match it there pretty soon. Anecdotally, it appears that they'll both do very well on SVE2 and RVV platforms.
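To make the "patterned gather" case concrete, here is a minimal sketch of the kind of loop I mean. The function name extractRed, the row-major layout, and the assumption that red samples sit at even rows and even columns of the RGGB mosaic are mine, for illustration only. Whether a given compiler actually vectorizes it depends on version, flags, and target, so check the generated assembly rather than taking my word for it.

```d
/// Hypothetical helper: copy the red samples out of a row-major RGGB mosaic.
/// Assumes red lives at (even row, even column), so the read is stride-2.
void extractRed(const(ushort)[] mosaic, size_t width, size_t height, ushort[] red)
{
    const halfW = width / 2;
    foreach (y; 0 .. height / 2)
        foreach (x; 0 .. halfW)
            red[y * halfW + x] = mosaic[(2 * y) * width + 2 * x];
}

void main()
{
    // 4x4 mosaic whose sample values equal their linear index.
    ushort[16] mosaic = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    ushort[4] red;
    extractRed(mosaic[], 4, 4, red[]);

    ushort[4] expected = [0, 2, 8, 10];
    assert(red == expected);
}
```

The inner loop writes with unit stride and reads with stride 2, which is the pattern the compiler has to recognize.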
Once you get things into unit-stride form, LDC and GDC are both very good. They both handle ternary-expression-style conditionals well, for example. You may wish to use @restrict pointers to signal independence (that is, that the pointers do not alias).
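A minimal sketch of that shape: a unit-stride loop with a ternary conditional and @restrict on both pointers. The function name relu is made up for illustration. I'm assuming @restrict comes from ldc.attributes on LDC and from gcc.attributes on a reasonably recent GDC; the empty struct is only a no-op stand-in so the code still compiles elsewhere.

```d
version (LDC)
{
    import ldc.attributes : restrict;
}
else version (GNU)
{
    import gcc.attributes : restrict; // assumes a GDC release that ships @restrict
}
else
{
    struct restrict {} // no-op stand-in: lets @restrict compile, has no effect
}

/// Hypothetical kernel: clamp negative values to zero.
/// @restrict promises the compiler that dst and src never overlap.
void relu(@restrict float* dst, @restrict const(float)* src, size_t n)
{
    foreach (i; 0 .. n)
        dst[i] = src[i] > 0 ? src[i] : 0;
}

void main()
{
    float[8] input = [-2, -1, 0, 1, 2, -3, 4, -5];
    float[8] output;
    relu(output.ptr, input.ptr, input.length);
    assert(output == [0f, 0, 0, 1, 2, 0, 4, 0]);
}
```

To see what each compiler makes of it, try something like ldc2 -O3 -mcpu=native or gdc -O3 -march=native and compare the loop bodies in the assembly output.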
Also, if you're serious about programmer-friendly data parallelism on CPUs, definitely take a look at mir. dcompute (LDC only) is worth a hard look for CUDA/OpenCL deployments.
Godbolt is your friend here. Have fun!