MIR vs. Numpy

Nov 18, 2020

Tobias Schmidt

Nov 18, 2020

Bastiaan Veelo

Nov 20, 2020

Dear all,

to compare MIR and Numpy in the HPC context, we implemented a multigrid solver in Python using Numpy and in D using Mir and performed some benchmarks with them.

You can find our code and results here:
https://github.com/typohnebild/numpy-vs-mir

Feedback is very welcome. Please feel free to open issues, pull requests or simply post your thoughts below.

Kind regards,
Tobias

On Wednesday, 18 November 2020 at 10:05:06 UTC, Tobias Schmidt wrote:
> Dear all,
>
> to compare MIR and Numpy in the HPC context, we implemented a multigrid solver in Python using Numpy and in D using Mir and performed some benchmarks with them.
>
> You can find our code and results here:
> https://github.com/typohnebild/numpy-vs-mir

Nice numbers. I’m not a Python guy but I was under the impression that Numpy actually is written in C, so that when you benchmark Numpy you’re mostly benchmarking C, not Python. Therefore I had expected the Numpy performance to be much closer to D’s. An important factor I think, which I’m not sure you have discussed (didn’t look too closely), is the compiler backend that was used to compile D and Numpy. Then again, as a user one is mostly interested in the out-of-the-box performance, which this seems to be a good measure of.

— Bastiaan.

On Wednesday, 18 November 2020 at 10:05:06 UTC, Tobias Schmidt wrote:
> Dear all,
>
> to compare MIR and Numpy in the HPC context, we implemented a multigrid solver in Python using Numpy and in D using Mir and performed some benchmarks with them.
>
> You can find our code and results here:
> https://github.com/typohnebild/numpy-vs-mir
>
> Feedback is very welcome. Please feel free to open issues, pull requests or simply post your thoughts below.
>
> Kind regards,
> Tobias

Very nice write-up. It's been a while since I've used numba, so I was a little confused by the numba 1 and numba 8 runs.

It also looks like you are compiling with ldc using -mcpu=native --boundscheck=off. Why not -O as well?

November 18, 2020

Re: MIR vs. Numpy

Posted by John Colvin
in reply to Bastiaan Veelo

On Wednesday, 18 November 2020 at 13:01:42 UTC, Bastiaan Veelo wrote:
> On Wednesday, 18 November 2020 at 10:05:06 UTC, Tobias Schmidt wrote:
>> Dear all,
>>
>> to compare MIR and Numpy in the HPC context, we implemented a multigrid solver in Python using Numpy and in D using Mir and performed some benchmarks with them.
>>
>> You can find our code and results here:
>> https://github.com/typohnebild/numpy-vs-mir
>
> Nice numbers. I’m not a Python guy but I was under the impression that Numpy actually is written in C, so that when you benchmark Numpy you’re mostly benchmarking C, not Python. Therefore I had expected the Numpy performance to be much closer to D’s. An important factor I think, which I’m not sure you have discussed (didn’t look too closely), is the compiler backend that was used to compile D and Numpy. Then again, as a user one is mostly interested in the out-of-the-box performance, which this seems to be a good measure of.
>
> — Bastiaan.

A lot of numpy is written in C, C++, Fortran, asm, etc.

But when you chain a bunch of operations together, you go through Python for each one. The language boundary (and Python being slow) means that iteration has to happen inside native code to get any performance. That in turn forces eager allocation of intermediate results so that operations can be composed from Python, which then hurts performance. Numpy makes a very good effort, but is always constrained by this. Clever schemes with laziness, where operations in Python just compose operations for execution later or on demand, can work as an alternative, but a) that's hard and b) even if you can completely avoid calling back into Python during iteration, you would still need a JIT to really unlock the performance.

Julia fixes this by having all or most of the code in one language, which is JIT-compiled.

D can do the same ahead of time (AOT) with templates, like C++/Eigen does, but with more flexibility and less terrifying code. That's (one part of) what Mir provides.
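To make the eager-versus-lazy distinction above concrete, here is a minimal sketch in plain Python (no Numpy, so the mechanics stay visible). The names `Leaf`, `BinOp`, `evaluate`, `eager_add` and `eager_mul` are hypothetical and invented for this illustration; they are not part of Numpy or Mir. The eager version materialises a temporary list for `a + b` before multiplying, while the lazy version only builds an expression tree and computes each output element in a single pass.

```python
import operator


class Expr:
    """Base class: arithmetic builds a tree instead of computing anything."""

    def __add__(self, other):
        return BinOp(operator.add, self, other)

    def __mul__(self, other):
        return BinOp(operator.mul, self, other)


class Leaf(Expr):
    """Wraps existing data; reading an element is a plain index."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def at(self, i):
        return self.data[i]


class BinOp(Expr):
    """Deferred element-wise operation on two sub-expressions."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def __len__(self):
        return len(self.left)

    def at(self, i):
        # Evaluate just element i of both operands; no temporary array.
        return self.op(self.left.at(i), self.right.at(i))


def evaluate(expr):
    """Single pass over the output; the only allocation is the result."""
    return [expr.at(i) for i in range(len(expr))]


def eager_add(x, y):
    return [p + q for p, q in zip(x, y)]  # allocates a full temporary


def eager_mul(x, y):
    return [p * q for p, q in zip(x, y)]  # allocates again


a, b, c = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [2.0, 2.0, 2.0]

eager = eager_mul(eager_add(a, b), c)           # two passes, one temporary
lazy = evaluate((Leaf(a) + Leaf(b)) * Leaf(c))  # one pass, no temporary
print(eager, lazy)
```

Note that every `at` call in the lazy version still runs in the interpreter, which is point b) above: laziness removes the temporaries, but without a JIT (or ahead-of-time templates, as in D) the per-element dispatch stays slow.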

On Wednesday, 18 November 2020 at 13:14:37 UTC, jmh530 wrote:
> On Wednesday, 18 November 2020 at 10:05:06 UTC, Tobias Schmidt wrote:
>
> It also looks like you are compiling on ldc with -mcpu=native --boundscheck=off. Why not -O as well?

-O is added by DUB.

On Wednesday, 18 November 2020 at 10:05:06 UTC, Tobias Schmidt wrote:
> Dear all,
>
> to compare MIR and Numpy in the HPC context, we implemented a multigrid solver in Python using Numpy and in D using Mir and performed some benchmarks with them.
>
> You can find our code and results here:
> https://github.com/typohnebild/numpy-vs-mir
>
> Feedback is very welcome. Please feel free to open issues, pull requests or simply post your thoughts below.
>
> Kind regards,
> Tobias

Thanks a lot! It is a huge benefit for Mir and D to have such high-quality benchmarks.

Python's sweep_3D accesses memory only once per element computation, while the old D sweep_slice accesses it 7 times. I have opened a PR [1] with a new version of sweep_slice, which I expect to be at least twice as fast. The new sweep_slice uses a more D-ish approach and a single memory access per computed element.

[1] https://github.com/typohnebild/numpy-vs-mir/pull/1

Cheers,
Ilya
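For readers who have not looked at the repository, the following is a rough sketch of what a smoother sweep in a 3D multigrid solver typically does. It is not the code from the repository or from the PR: the function name `red_black_sweep`, the nested-list grid layout, and the choice of a red-black Gauss-Seidel update for the Poisson equation (discretised as Δu = f with grid spacing `h`) are all assumptions made for illustration.

```python
def red_black_sweep(u, f, h):
    """One in-place red-black Gauss-Seidel sweep with a 7-point stencil.

    Assumes u and f are cubic n*n*n nested lists and that the outermost
    layer of u holds fixed boundary values (hypothetical layout).
    """
    n = len(u)
    for color in (0, 1):  # update one colour of the checkerboard, then the other
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                for k in range(1, n - 1):
                    if (i + j + k) % 2 != color:
                        continue
                    # Six neighbour reads plus one right-hand-side read.
                    neighbours = (
                        u[i - 1][j][k] + u[i + 1][j][k]
                        + u[i][j - 1][k] + u[i][j + 1][k]
                        + u[i][j][k - 1] + u[i][j][k + 1]
                    )
                    u[i][j][k] = (neighbours - h * h * f[i][j][k]) / 6.0
    return u


# Tiny example: boundary fixed at 1.0, zero right-hand side, one interior point.
n = 3
u = [[[1.0] * n for _ in range(n)] for _ in range(n)]
u[1][1][1] = 0.0
f = [[[0.0] * n for _ in range(n)] for _ in range(n)]
red_black_sweep(u, f, h=0.5)
print(u[1][1][1])
```

In this sketch each update reads six neighbours and one right-hand-side value, so how those reads are expressed (several separate array views versus one pass over a windowed view) plausibly matters a lot for memory traffic.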

On Wednesday, 18 November 2020 at 15:20:19 UTC, 9il wrote:
> On Wednesday, 18 November 2020 at 13:14:37 UTC, jmh530 wrote:
>> On Wednesday, 18 November 2020 at 10:05:06 UTC, Tobias Schmidt wrote:
>>
>> It also looks like you are compiling on ldc with -mcpu=native --boundscheck=off. Why not -O as well?
>
> -O is added by DUB

Just -O? LDC is quite impressive with LTO and cross-module inlining turned on.

Thanks for all of your feedback!

On Wednesday, 18 November 2020 at 13:14:37 UTC, jmh530 wrote:
> It's been a while since I've used numba, so I was a little confused on the numba 1 and numba 8 runs.

The number is the number of threads used in the run. The prefix indicates whether numba was used ('numba') or not ('nonumba'). We have added a section to clarify this. Thanks for the hint.
