A look at Chapel, D, and Julia using kernel matrix calculations
Hi,

this article grew out of a Dlang Learn thread (https://forum.dlang.org/thread/motdqixwsqmabzkdoslp@forum.dlang.org). It looks at Kernel Matrix Calculations in Chapel, D, and Julia and has a more general discussion of all three languages. Comments welcome.

https://github.com/dataPulverizer/KernelMatrixBenchmark

Thanks
May 22
On Friday, 22 May 2020 at 01:58:07 UTC, data pulverizer wrote:
> Hi,
>
> this article grew out of a Dlang Learn thread (https://forum.dlang.org/thread/motdqixwsqmabzkdoslp@forum.dlang.org). It looks at Kernel Matrix Calculations in Chapel, D, and Julia and has a more general discussion of all three languages. Comments welcome.
>
> https://github.com/dataPulverizer/KernelMatrixBenchmark
>
> Thanks

Very well done, an interesting read. I like the comment you made about the lack of examples for how to link to C code. Mike Parker had some excellent tutorials on this, but I couldn't find them after a quick search.
May 22
On 22/05/2020 3:12 PM, CraigDillabaugh wrote:
> Very well done, an interesting read. I like the comment you made about the lack of examples for how to link to C code. Mike Parker had some excellent tutorials on this, but I couldn't find them after a quick search.

https://dlang.org/blog/category/d-and-c/
May 22
On Friday, 22 May 2020 at 01:58:07 UTC, data pulverizer wrote:
> Hi,
>
> this article grew out of a Dlang Learn thread (https://forum.dlang.org/thread/motdqixwsqmabzkdoslp@forum.dlang.org). It looks at Kernel Matrix Calculations in Chapel, D, and Julia and has a more general discussion of all three languages. Comments welcome.
>
> https://github.com/dataPulverizer/KernelMatrixBenchmark
>
> Thanks

Nice post. You said "adding SIMD support could easily put D ahead or on par with Julia at the larger data size". It's not clear precisely what you mean. Does this package help?

https://code.dlang.org/packages/intel-intrinsics
May 22
On Friday, 22 May 2020 at 13:46:21 UTC, bachmeier wrote:
> On Friday, 22 May 2020 at 01:58:07 UTC, data pulverizer wrote:
>> https://github.com/dataPulverizer/KernelMatrixBenchmark
>
> Nice post. You said "adding SIMD support could easily put D ahead or on par with Julia at the larger data size". It's not clear precisely what you mean. Does this package help?
>
> https://code.dlang.org/packages/intel-intrinsics

Sorry it wasn't clear; I have amended the statement. I meant that adding SIMD support to my matrix object could put D's performance at the largest data set on par with or ahead of Julia's. Julia edges D out on that data set and has SIMD support, whereas my matrix object does not, so I'm betting that SIMD is the "x-factor" in Julia's performance at that scale. I've removed "easily" because it's too strong a word; this is more of an educated speculation. It's probably something to look at next. I need to do some reading on SIMD. Thanks for the link, it's code that will get me started.
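For context on what I have in mind, here's a minimal sketch (untested; it assumes AVX is available for `double4` from the built-in core.simd module, and `dotSimd` is a hypothetical name) of the kind of vectorised inner loop the kernel functions would use:

```d
import core.simd : double4;

// Hypothetical SIMD dot product over two equal-length arrays.
// Assumes the length is a multiple of 4, the data is 32-byte
// aligned, and AVX is available for double4; a real version would
// handle the remainder and use unaligned loads where needed.
double dotSimd(const(double)[] x, const(double)[] y)
{
    double4 acc = 0.0;
    for (size_t i = 0; i < x.length; i += 4)
    {
        double4 a = *cast(const(double4)*) &x[i];
        double4 b = *cast(const(double4)*) &y[i];
        acc += a * b;  // four multiply-adds per iteration
    }
    // horizontal sum of the four lanes
    return acc.array[0] + acc.array[1] + acc.array[2] + acc.array[3];
}
```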
May 22
On Friday, 22 May 2020 at 01:58:07 UTC, data pulverizer wrote:
> Comments welcome.

Thx for the article. - You mention the lack of multi-dim array support in Phobos; AFAIK, that's fully intentional, and the de-facto solution is http://docs.algorithm.dlang.io/latest/mir_ndslice.html.
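For reference, a minimal ndslice sketch (assuming the mir-algorithm dub package) looks something like:

```d
/+ dub.sdl:
    dependency "mir-algorithm" version="~>3.0"
+/
import mir.ndslice;
import std.stdio : writeln;

void main()
{
    // view a flat array as a 3 x 4 matrix without copying
    auto m = new double[12].sliced(3, 4);
    m[1, 2] = 42.0;
    writeln(m.shape); // shape is [3, 4]
}
```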

As you suspect SIMD potential being left on the table by LDC, you can firstly use -mcpu=native to enable advanced instructions supported by your CPU, and secondly use -fsave-optimization-record to inspect LLVM's optimization remarks (e.g., why a loop isn't auto-vectorized etc.). -O5 is identical to -O3, which is identical to -O.
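For example (the output file name is a guess based on LLVM's convention):

```sh
# enable all instruction set extensions of the host CPU and save
# LLVM's optimization remarks alongside the object file
ldc2 -O3 -release -mcpu=native -fsave-optimization-record kernel.d

# the remarks (including why a loop wasn't auto-vectorized) land in
# a YAML file, e.g. kernel.opt.yaml, which can be inspected directly
```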
May 22
Nice article. It shows that you put a lot of work into it. The one thing I want to point out is that --O5 is the same as --O3.

From ldc2 --help

  Setting the optimization level:
      -O    - Equivalent to -O3
      --O0  - No optimizations (default)
      --O1  - Simple optimizations
      --O2  - Good optimizations
      --O3  - Aggressive optimizations
      --O4  - Equivalent to -O3
      --O5  - Equivalent to -O3
      --Os  - Like -O2 with extra optimizations for size
      --Oz  - Like -Os but reduces code size further

May 23
On Friday, 22 May 2020 at 14:13:50 UTC, kinke wrote:
> On Friday, 22 May 2020 at 01:58:07 UTC, data pulverizer wrote:
>> Comments welcome.
>
> Thx for the article. - You mention the lack of multi-dim array support in Phobos; AFAIK, that's fully intentional, and the de-facto solution is http://docs.algorithm.dlang.io/latest/mir_ndslice.html.
>
> As you suspect SIMD potential being left on the table by LDC, you can firstly use -mcpu=native to enable advanced instructions supported by your CPU, and secondly use -fsave-optimization-record to inspect LLVM's optimization remarks (e.g., why a loop isn't auto-vectorized etc.). -O5 is identical to -O3, which is identical to -O.

docs.algorithm.dlang.io is outdated.

The new docs home is
http://mir-algorithm.libmir.org/mir_ndslice.html
May 24
On Friday, 22 May 2020 at 14:13:50 UTC, kinke wrote:
> On Friday, 22 May 2020 at 01:58:07 UTC, data pulverizer wrote:
>> Comments welcome.
>
> Thx for the article. - You mention the lack of multi-dim array support in Phobos; AFAIK, that's fully intentional, and the de-facto solution is http://docs.algorithm.dlang.io/latest/mir_ndslice.html.

I've now updated the blog with this information (the new docs home: http://mir-algorithm.libmir.org/mir_ndslice.html).

> As you suspect SIMD potential being left on the table by LDC, you can firstly use -mcpu=native to enable advanced instructions supported by your CPU, ...

If special instructions are enabled does the compiler automatically take advantage of these or does the programmer need to do anything?

> ... and secondly use -fsave-optimization-record to inspect LLVM's optimization remarks (e.g., why a loop isn't auto-vectorized etc.).

By this are you saying that SIMD happens automatically with `-mcpu=native` flag?

> ... -O5 is identical to -O3, which is identical to -O.

Yes, I saw that when I was writing the code; I tried it and found it to be true, but there's something psychologically comforting about using -O5 rather than -O. I've updated the article to reflect your comments. I'm in the process of updating the D code and will change the flags once I'm done. Thanks
May 24
On Saturday, 23 May 2020 at 14:18:54 UTC, 9il wrote:
> The new docs home is
> http://mir-algorithm.libmir.org/mir_ndslice.html

I'm hoping to keep writing articles like this, and I hope I can get round to doing one or more on Mir's modules. By now the library is probably quite mature.
