How to sum multidimensional arrays? - D Programming Language Discussion Forum

February 27, 2020

How to sum multidimensional arrays?

Posted by p.shkadzko

I'd like to sum 2D arrays. Let's create 2 random 2D arrays and sum them.

```
import std.random : Xorshift, unpredictableSeed, uniform;
import std.range : generate, take, chunks;
import std.array : array;

static T[][] rndMatrix(T)(T max, in int rows, in int cols)
{
    Xorshift rnd;
    rnd.seed(unpredictableSeed);
    const amount = rows * cols;
    return generate(() => uniform(0, max, rnd)).take(amount).array.chunks(cols).array;
}

void main() {
    int[][] m1 = rndMatrix(10, 2, 3);
    int[][] m2 = rndMatrix(10, 2, 3);

    auto c = m1[] + m2[];
}
```

This doesn't compile: the compiler reports "Error: array operation m1[] + m2[] without destination memory not allowed". After looking at https://forum.dlang.org/thread/wnjepbggivhutgbyjagm@forum.dlang.org, I modified the code to:


```
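void main() {
    int[][] m1 = rndMatrix(10, 2, 3);
    int[][] m2 = rndMatrix(10, 2, 3);

    // Allocate a destination with the same shape as m1.
    int[][] c;
    c.length = m1.length;
    foreach (ref row; c)
        row.length = m1[0].length;

    c[] = m1[] + m2[]; // still rejected by the compiler, see the error below
}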
```
Well, now I get:

"/dlang/dmd/linux/bin64/../../src/druntime/import/core/internal/array/operations.d(165): Error: static assert:  "Binary + not supported for types int[] and int[]."

Right, so next I tried the following:

```
void main() {
    int[][] m1 = rndMatrix(10, 2, 3);
    int[][] m2 = rndMatrix(10, 2, 3);

    auto c = zip(m1[], m2[]).map!((a, b) => a + b);

}
```

Doesn't work either because

"Error: template D main.__lambda1 cannot deduce function from argument types !()(Tuple!(int[], int[])), candidates are:
onlineapp.d(21):        __lambda1
(...)


So, I have to flatten first, then zip + sum and then reshape back to the original.

```
auto c = zip(m1.joiner, m2.joiner).map!(t => t[0] + t[1]).array.chunks(3).array;
```

This works, but it does not look very efficient, considering that we flatten and then call array twice. It will get even worse with 3D arrays. Is there a better way without relying on mir.ndslice?
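
One way to avoid the flatten/reshape round trip with plain Phobos might be to zip the rows and apply the 1D array operation to each pair of rows, since array operations work on one-dimensional slices of scalars. This is only a minimal sketch, assuming both matrices have the same shape; `addRow` and `addRows` are made-up helper names, not library functions.

```d
import std.algorithm : map;
import std.array : array;
import std.range : zip;

/// Hypothetical helper: element-wise sum of two equally long rows.
int[] addRow(const int[] x, const int[] y)
{
    assert(x.length == y.length, "column count mismatch");
    auto row = new int[](x.length);
    row[] = x[] + y[]; // 1D array operation with an explicit destination
    return row;
}

/// Hypothetical helper: element-wise sum of two equally shaped int[][] values.
int[][] addRows(int[][] a, int[][] b)
{
    assert(a.length == b.length, "row count mismatch");
    return zip(a, b).map!(p => addRow(p[0], p[1])).array;
}

unittest
{
    auto m1 = [[1, 2, 3], [4, 5, 6]];
    auto m2 = [[10, 20, 30], [40, 50, 60]];
    assert(addRows(m1, m2) == [[11, 22, 33], [44, 55, 66]]);
}
```

The unittest runs with `dmd -unittest -main -run file.d`. Each result row is still a separate allocation, so this removes the reshape step but not the jagged layout.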

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by p.shkadzko
in reply to p.shkadzko

On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
> This works, but it does not look very efficient, considering that we flatten and then call array twice. It will get even worse with 3D arrays.

And yes, benchmarks show that summing 2D arrays like in the example above is significantly slower than in numpy. But that is to be expected... I guess.

D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
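
The thread does not show how these timings were taken. As a rough sketch of one way to time only the summation step (excluding random generation), `std.datetime.stopwatch` can be used; `timeIt` is a hypothetical helper name, and the compiler and optimization flags are assumed to matter a lot for numbers like these.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;

/// Hypothetical helper: run `work` once and return the elapsed wall-clock time.
Duration timeIt(scope void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes);
    work();
    sw.stop();
    return sw.peek;
}

unittest
{
    auto a = new int[](1_000);
    auto b = new int[](1_000);
    auto c = new int[](1_000);
    a[] = 1;
    b[] = 2;

    // Only the addition is inside the timed region.
    auto elapsed = timeIt(() { c[] = a[] + b[]; });

    assert(c[0] == 3 && c[$ - 1] == 3);
    assert(elapsed >= Duration.zero);
}
```

A single run is noisy; repeating the measurement and taking the minimum or median is a common precaution.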

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by bachmeier
in reply to p.shkadzko

On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:

[...]

> This works, but it does not look very efficient, considering that we flatten and then call array twice. It will get even worse with 3D arrays. Is there a better way without relying on mir.ndslice?

Is there a reason you can't create a struct around a double[] like this?

```
struct Matrix {
  double[] data;
}
```

Then to add Matrix A to Matrix B, you use A.data[] + B.data[]. But since I'm not sure what exactly you're doing, maybe that won't work.
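
To make that concrete, here is a minimal sketch of such a struct with an overloaded `+`. The `rows` and `cols` fields and the `opBinary` overload are illustrative additions, not part of the suggestion above. Note that the array operation needs a destination slice on the left-hand side.

```d
/// Sketch of a flat-storage matrix; `rows`, `cols` and `opBinary` are
/// illustrative additions.
struct Matrix
{
    double[] data;      // row-major storage, length == rows * cols
    size_t rows, cols;

    Matrix opBinary(string op : "+")(const Matrix rhs) const
    {
        assert(rows == rhs.rows && cols == rhs.cols, "shape mismatch");
        auto result = Matrix(new double[](data.length), rows, cols);
        // Array operations need an explicit destination.
        result.data[] = data[] + rhs.data[];
        return result;
    }
}

unittest
{
    auto a = Matrix([1.0, 2, 3, 4, 5, 6], 2, 3);
    auto b = Matrix([10.0, 20, 30, 40, 50, 60], 2, 3);
    auto c = a + b;
    assert(c.data == [11.0, 22, 33, 44, 55, 66]);
    assert(c.rows == 2 && c.cols == 3);
}
```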

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by 9il
in reply to p.shkadzko

On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
> Is there a better way without relying on mir.ndslice?

ndslice Poker Face

```
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.17"
dependency "mir-random" version="~>2.2.10"
+/
import mir.ndslice;
import mir.random: threadLocal;
import mir.random.variable: uniformVar;
import mir.random.algorithm: randomSlice;
import mir.random.engine.xorshift;

void main() {
    Slice!(int*, 2) m1 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
    Slice!(int*, 2) m2 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
    Slice!(int*, 2) c = slice(m1 + m2);
}
```

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by jmh530
in reply to p.shkadzko

On Thursday, 27 February 2020 at 15:28:01 UTC, p.shkadzko wrote:
> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>> This works, but it does not look very efficient, considering that we flatten and then call array twice. It will get even worse with 3D arrays.
>
> And yes, benchmarks show that summing 2D arrays like in the example above is significantly slower than in numpy. But that is to be expected... I guess.
>
> D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
> numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.

What's the performance of mir like?

The code below seems to work without issue.

```
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.17"
dependency "mir-random" version="~>2.2.10"
+/
import std.stdio : writeln;
import mir.random : Random, unpredictableSeed;
import mir.random.variable: UniformVariable;
import mir.random.algorithm: randomSlice;

auto rndMatrix(T)(T max, in int rows, in int cols)
{
    auto gen = Random(unpredictableSeed);
    auto rv = UniformVariable!T(0.0, max);
    return randomSlice(gen, rv, rows, cols);
}

void main() {
    auto m1 = rndMatrix(10.0, 2, 3);
    auto m2 = rndMatrix(10.0, 2, 3);
    auto m3 = m1 + m2;

    writeln(m1);
    writeln(m2);
    writeln(m3);
}
```

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by 9il
in reply to jmh530

On Thursday, 27 February 2020 at 16:31:49 UTC, jmh530 wrote:
> On Thursday, 27 February 2020 at 15:28:01 UTC, p.shkadzko wrote:
>> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>>> This works, but it does not look very efficient, considering that we flatten and then call array twice. It will get even worse with 3D arrays.
>>
>> And yes, benchmarks show that summing 2D arrays like in the example above is significantly slower than in numpy. But that is to be expected... I guess.
>>
>> D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
>> numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
>
> What's the performance of mir like?
>
> The code below seems to work without issue.
>
> /+dub.sdl:
> dependency "mir-algorithm" version="~>3.7.17"
> dependency "mir-random" version="~>2.2.10"
> +/
> import std.stdio : writeln;
> import mir.random : Random, unpredictableSeed;
> import mir.random.variable: UniformVariable;
> import mir.random.algorithm: randomSlice;
>
> auto rndMatrix(T)(T max, in int rows, in int cols)
> {
>     auto gen = Random(unpredictableSeed);
>     auto rv = UniformVariable!T(0.0, max);
>     return randomSlice(gen, rv, rows, cols);
> }
>
> void main() {
>     auto m1 = rndMatrix(10.0, 2, 3);
>     auto m2 = rndMatrix(10.0, 2, 3);
>     auto m3 = m1 + m2;
>
>     writeln(m1);
>     writeln(m2);
>     writeln(m3);
> }

The same as numpy for large matrices, because the cost is memory access. Mir+LDC will be faster for small matrices because it will flatten the inner loop and use SIMD instructions.

A few performance nitpicks about your example, to keep the benchmark fair against the test:
1. Random (the default) is slower than Xorshift.
2. double is twice as large as int and needs twice as much memory, so it would be about twice as slow as int for large matrices.

Check the previous post; we posted at almost the same time ;)
https://forum.dlang.org/post/izoflhyerkiladngyrov@forum.dlang.org
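
To put rough numbers on the memory argument, here is a small sketch of the arithmetic for the 5000 x 6000 case benchmarked earlier in the thread. `matrixBytes` is a hypothetical helper, and the figures assume dense storage with no per-row overhead.

```d
/// Bytes needed for one dense rows x cols matrix of element type T.
size_t matrixBytes(T)(size_t rows, size_t cols)
{
    return rows * cols * T.sizeof;
}

unittest
{
    // 5000 * 6000 = 30 million elements per matrix.
    assert(matrixBytes!int(5000, 6000) == 120_000_000);    // 120 MB
    assert(matrixBytes!double(5000, 6000) == 240_000_000); // 240 MB

    // c = a + b reads two matrices and writes one.
    assert(3 * matrixBytes!int(5000, 6000) == 360_000_000);
}
```

So one int addition touches about 360 MB of memory and the double version about 720 MB, which fits the point that memory traffic rather than arithmetic sets the cost at this size.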

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by Andrea Fontana
in reply to p.shkadzko

On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
> void main() {
>     int[][] m1 = rndMatrix(10, 2, 3);
>     int[][] m2 = rndMatrix(10, 2, 3);
>
>     auto c = m1[] + m2[];
> }


I think you're trying to do this:

```
int[][] m1 = rndMatrix(10, 2, 3);
int[][] m2 = rndMatrix(10, 2, 3);
int[][] m3;

m3.length = m1.length;
foreach(i; 0..m1.length)
{
    m3[i].length = m1[i].length;
    m3[i][] = m1[i][] + m2[i][];
}
```

But of course that's not the best solution :)
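
Since the original question also worries about 3D arrays, the same loop can be generalized to any nesting depth with a recursive template. This is only a sketch, assuming both arguments have identical shapes; `addNested` is a made-up name.

```d
import std.traits : isDynamicArray;

/// Hypothetical helper: element-wise sum of two equally shaped nested arrays.
T addNested(T)(T a, T b) if (isDynamicArray!T)
{
    assert(a.length == b.length, "shape mismatch");
    alias E = typeof(a[0]);
    auto result = new E[](a.length);
    static if (isDynamicArray!E)
    {
        // Elements are arrays themselves: recurse one level down.
        foreach (i; 0 .. a.length)
            result[i] = addNested(a[i], b[i]);
    }
    else
    {
        // Innermost dimension: a plain 1D array operation.
        result[] = a[] + b[];
    }
    return result;
}

unittest
{
    assert(addNested([1, 2], [3, 4]) == [4, 6]);
    assert(addNested([[1, 2], [3, 4]], [[10, 20], [30, 40]]) == [[11, 22], [33, 44]]);
    assert(addNested([[[1, 2]], [[3, 4]]], [[[10, 20]], [[30, 40]]])
        == [[[11, 22]], [[33, 44]]]);
}
```

It still allocates one array per innermost row, so it is a convenience rather than a performance fix.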

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by jmh530
in reply to 9il

On Thursday, 27 February 2020 at 16:39:15 UTC, 9il wrote:
> [snip]
> A few performance nitpicks about your example, to keep the benchmark fair against the test:
> 1. Random (the default) is slower than Xorshift.
> 2. double is twice as large as int and needs twice as much memory, so it would be about twice as slow as int for large matrices.
>
> Check the previous post; we posted at almost the same time ;)
> https://forum.dlang.org/post/izoflhyerkiladngyrov@forum.dlang.org

Those differences largely came from a lack of attention to detail. I didn't notice the Xorshift until after I posted. I used double because it's such a force of habit for me to use continuous distributions.

I came across this in the documentation.
UniformVariable!T uniformVariable(T = double)(in T a, in T b)
    if(isIntegral!T)
and did a double-take until I read the note associated with it in the source.

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by p.shkadzko
in reply to 9il

On Thursday, 27 February 2020 at 16:31:07 UTC, 9il wrote:
> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>> Is there a better way without relying on mir.ndslice?
>
> ndslice Poker Face
>
> /+dub.sdl:
> dependency "mir-algorithm" version="~>3.7.17"
> dependency "mir-random" version="~>2.2.10"
> +/
> import mir.ndslice;
> import mir.random: threadLocal;
> import mir.random.variable: uniformVar;
> import mir.random.algorithm: randomSlice;
> import mir.random.engine.xorshift;
>
> void main() {
>     Slice!(int*, 2) m1 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
>     Slice!(int*, 2) m2 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
>     Slice!(int*, 2) c = slice(m1 + m2);
> }

Yes, mir.ndslice is a straightforward choice for multidimensional arrays. I shall do some benchmarks with it next. But first, I'll try to do it with standard D operations and see roughly how they compare with numpy's C implementation.

February 27, 2020

Re: How to sum multidimensional arrays?

Posted by p.shkadzko
in reply to bachmeier

On Thursday, 27 February 2020 at 15:48:53 UTC, bachmeier wrote:
> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>
> [...]
>
>> This works, but it does not look very efficient, considering that we flatten and then call array twice. It will get even worse with 3D arrays. Is there a better way without relying on mir.ndslice?
>
> Is there a reason you can't create a struct around a double[] like this?
>
> struct Matrix {
>   double[] data;
> }
>
> Then to add Matrix A to Matrix B, you use A.data[] + B.data[]. But since I'm not sure what exactly you're doing, maybe that won't work.

Right! Ok, here is how I do it.

```
struct Matrix(T)
{
    T[] elems;
    int cols;

    T[][] to2D()
    {
        return elems.chunks(cols).array;
    }
}
```

and here are the matrix summing and random array generator functions:

```
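import std.array : array;
import std.random : Xorshift, unpredictableSeed, uniform;
import std.range : generate, take;

auto matrixSum(Matrix!int m1, Matrix!int m2)
{
    // Element-wise addition only makes sense for equal shapes.
    assert(m1.cols == m2.cols && m1.elems.length == m2.elems.length);
    Matrix!int m3;
    m3.cols = m1.cols;
    m3.elems.length = m1.elems.length;
    // One array operation over the flat storage, with m3.elems as destination.
    m3.elems[] = m1.elems[] + m2.elems[];
    return m3.to2D;
}

T[] rndArr(T)(in T max, in int elems)
{
    Xorshift rnd;
    rnd.seed(unpredictableSeed); // without seeding, every call yields the same sequence
    return generate(() => uniform(0, max, rnd)).take(elems).array;
}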
Then we do the following

```
auto m1 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
auto m2 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
auto m3 = matrixSum(m1, m2);
```

And it works effortlessly!
The sum of two 5000 x 6000 int arrays takes just 0.105 sec! (That was on a Windows machine, though, and with a weaker CPU.)

I bet using mir.ndslice instead of D arrays would be even faster.
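
If the 2D view is only needed for element access, the flat storage can also be indexed directly with the row-major formula `r * cols + c`, without building a `T[][]` at all. A minimal sketch under that assumption follows; `FlatMatrix`, its `rows` field and both methods are illustrative names, not code from this thread.

```d
/// Sketch: flat matrix with row-major indexing.
struct FlatMatrix(T)
{
    T[] elems;          // row-major storage, length == rows * cols
    size_t rows, cols;

    /// Element at row r, column c.
    ref T opIndex(size_t r, size_t c)
    {
        assert(r < rows && c < cols, "index out of range");
        return elems[r * cols + c];
    }

    /// Slice of row r; it shares memory with `elems`, nothing is copied.
    T[] row(size_t r)
    {
        assert(r < rows, "row out of range");
        return elems[r * cols .. (r + 1) * cols];
    }
}

unittest
{
    auto m = FlatMatrix!int([1, 2, 3, 4, 5, 6], 2, 3);
    assert(m[1, 0] == 4);
    assert(m.row(1) == [4, 5, 6]);

    m[0, 2] = 30; // writes through the ref returned by opIndex
    assert(m.elems[2] == 30);
}
```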
