February 27, 2020 How to sum multidimensional arrays?
I'd like to sum 2D arrays. Let's create 2 random 2D arrays and sum them.
```
import std.random : Xorshift, unpredictableSeed, uniform;
import std.range : generate, take, chunks;
import std.array : array;

static T[][] rndMatrix(T)(T max, in int rows, in int cols)
{
    Xorshift rnd;
    rnd.seed(unpredictableSeed);
    const amount = rows * cols;
    return generate(() => uniform(0, max, rnd)).take(amount).array.chunks(cols).array;
}

void main() {
    int[][] m1 = rndMatrix(10, 2, 3);
    int[][] m2 = rndMatrix(10, 2, 3);

    auto c = m1[] + m2[];
}
```
This won't work because the compiler will throw "Error: array operation m[] + m2[] without destination memory not allowed". Looking at https://forum.dlang.org/thread/wnjepbggivhutgbyjagm@forum.dlang.org, I modified the code to:
```
void main() {
    int[][] m1 = rndMatrix(10, 2, 3);
    int[][] m2 = rndMatrix(10, 2, 3);

    int[][] c;
    c.length = m[0].length;
    c[1].length = m[1].length;

    c[] = m1[] + m2[];
}
```
Well then I am getting
"/dlang/dmd/linux/bin64/../../src/druntime/import/core/internal/array/operations.d(165): Error: static assert: "Binary + not supported for types int[] and int[]."
Right, then I am trying the following
```
void main() {
    int[][] m1 = rndMatrix(10, 2, 3);
    int[][] m2 = rndMatrix(10, 2, 3);

    auto c = zip(m[], m2[]).map!((a, b) => a + b);
}
```
Doesn't work either because
"Error: template D main.__lambda1 cannot deduce function from argument types !()(Tuple!(int[], int[])), candidates are:
onlineapp.d(21): __lambda1 (...)
So, I have to flatten first, then zip + sum and then reshape back to the original.
```
auto c = zip(m.joiner, m2.joiner).map!(t => t[0] + t[1]).array.chunks(3).array;
```
This works but it does not look very efficient considering we flatten and then calling array twice. It will get even worse with 3D arrays. Is there a better way without relying on mir.ndslice?
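The lambda in the `zip` attempt is rejected because `zip` yields a single `Tuple` per step, so the lambda receives one argument rather than two. A minimal sketch that stays within Phobos and avoids the flatten and reshape round trip follows; `addNested` is a hypothetical helper name, and the sketch assumes both inputs have the same shape.

```d
import std.algorithm : map;
import std.array : array;
import std.range : zip;

/// Element-wise sum of two jagged arrays (hypothetical helper).
/// Assumes `a` and `b` have the same number of rows and equal row lengths.
int[][] addNested(int[][] a, int[][] b)
{
    return zip(a, b)                          // pair up matching rows
        .map!(rows => zip(rows[0], rows[1])   // pair up matching elements
            .map!(p => p[0] + p[1])
            .array)                           // materialize one summed row
        .array;                               // materialize the array of rows
}

void main()
{
    auto c = addNested([[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]);
    assert(c == [[11, 22, 33], [44, 55, 66]]);
}
```

This still allocates once per row, so it is likely no faster than a plain loop; it mainly shows how the tuple has to be unpacked.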
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to p.shkadzko
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
> This works but it does not look very efficient considering we flatten and then calling array twice. It will get even worse with 3D arrays.
And yes, benchmarks show that summing 2D arrays like in the example above is significantly slower than in numpy. But that is to be expected... I guess.
D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to p.shkadzko
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
[...]
> This works but it does not look very efficient considering we flatten and then calling array twice. It will get even worse with 3D arrays. Is there a better way without relying on mir.ndslice?
Is there a reason you can't create a struct around a double[] like this?
struct Matrix {
    double[] data;
}
Then to add Matrix A to Matrix B, you use A.data[] + B.data[]. But since I'm not sure what exactly you're doing, maybe that won't work.
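As a rough sketch of that idea: an array operation such as `A.data[] + B.data[]` needs destination memory, so the sum has to be assigned into a preallocated slice. The `rows` and `cols` fields, the `opBinary` overload, and the `opIndex` accessor below are assumptions added for illustration, not part of the suggestion above.

```d
struct Matrix
{
    double[] data; // row-major storage, rows * cols elements
    size_t rows;   // assumed field, not in the original suggestion
    size_t cols;   // assumed field, not in the original suggestion

    /// Element-wise sum; allocates the destination, then uses an array operation.
    Matrix opBinary(string op : "+")(const Matrix rhs) const
    {
        assert(rows == rhs.rows && cols == rhs.cols, "shape mismatch");
        auto total = new double[](data.length);
        total[] = data[] + rhs.data[];
        return Matrix(total, rows, cols);
    }

    /// Element at row r, column c.
    double opIndex(size_t r, size_t c) const
    {
        return data[r * cols + c];
    }
}

void main()
{
    auto a = Matrix([1.0, 2, 3, 4, 5, 6], 2, 3);
    auto b = Matrix([10.0, 20, 30, 40, 50, 60], 2, 3);
    auto c = a + b;
    assert(c.data == [11.0, 22, 33, 44, 55, 66]);
    assert(c[1, 2] == 66);
}
```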
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to p.shkadzko
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
> Is there a better way without relying on mir.ndslice?
ndslice Poker Face
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.17"
dependency "mir-random" version="~>2.2.10"
+/
import mir.ndslice;
import mir.random: threadLocal;
import mir.random.variable: uniformVar;
import mir.random.algorithm: randomSlice;
import mir.random.engine.xorshift;
void main() {
    Slice!(int*, 2) m1 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
    Slice!(int*, 2) m2 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
    Slice!(int*, 2) c = slice(m1 + m2);
}
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to p.shkadzko
On Thursday, 27 February 2020 at 15:28:01 UTC, p.shkadzko wrote:
> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>> This works but it does not look very efficient considering we flatten and then calling array twice. It will get even worse with 3D arrays.
>
> And yes, benchmarks show that summing 2D arrays like in the example above is significantly slower than in numpy. But that is to be expected... I guess.
>
> D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
> numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
What's the performance of mir like?
The code below seems to work without issue.
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.17"
dependency "mir-random" version="~>2.2.10"
+/
import std.stdio : writeln;
import mir.random : Random, unpredictableSeed;
import mir.random.variable: UniformVariable;
import mir.random.algorithm: randomSlice;
auto rndMatrix(T)(T max, in int rows, in int cols)
{
    auto gen = Random(unpredictableSeed);
    auto rv = UniformVariable!T(0.0, max);
    return randomSlice(gen, rv, rows, cols);
}
void main() {
    auto m1 = rndMatrix(10.0, 2, 3);
    auto m2 = rndMatrix(10.0, 2, 3);
    auto m3 = m1 + m2;
    writeln(m1);
    writeln(m2);
    writeln(m3);
}
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to jmh530
On Thursday, 27 February 2020 at 16:31:49 UTC, jmh530 wrote:
> On Thursday, 27 February 2020 at 15:28:01 UTC, p.shkadzko wrote:
>> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>>> This works but it does not look very efficient considering we flatten and then calling array twice. It will get even worse with 3D arrays.
>>
>> And yes, benchmarks show that summing 2D arrays like in the example above is significantly slower than in numpy. But that is to be expected... I guess.
>>
>> D -- sum of two 5000 x 6000 2D arrays: 3.4 sec.
>> numpy -- sum of two 5000 x 6000 2D arrays: 0.0367800739913946 sec.
>
> What's the performance of mir like?
>
> The code below seems to work without issue.
>
> /+dub.sdl:
> dependency "mir-algorithm" version="~>3.7.17"
> dependency "mir-random" version="~>2.2.10"
> +/
> import std.stdio : writeln;
> import mir.random : Random, unpredictableSeed;
> import mir.random.variable: UniformVariable;
> import mir.random.algorithm: randomSlice;
>
> auto rndMatrix(T)(T max, in int rows, in int cols)
> {
>     auto gen = Random(unpredictableSeed);
>     auto rv = UniformVariable!T(0.0, max);
>     return randomSlice(gen, rv, rows, cols);
> }
>
> void main() {
>     auto m1 = rndMatrix(10.0, 2, 3);
>     auto m2 = rndMatrix(10.0, 2, 3);
>     auto m3 = m1 + m2;
>
>     writeln(m1);
>     writeln(m2);
>     writeln(m3);
> }
The same as numpy for large matrixes, because the cost is dominated by memory access. Mir+LDC will be faster for small matrixes because it will flatten the inner loop and use SIMD instructions.
A few performance nitpicks about your example, to make the benchmark fair against the test:
1. Random (default) is slower than Xorshift.
2. double is twice as large as int and requires twice as much memory, so it would be twice as slow as int for large matrixes.
Check the previous post; we posted at almost the same time ;)
https://forum.dlang.org/post/izoflhyerkiladngyrov@forum.dlang.org
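To put numbers on the memory argument, here is a small worked calculation for the 5000 x 6000 case from the benchmark. `matrixBytes` is a hypothetical helper, and the sketch assumes dense storage with no per-row overhead.

```d
import std.stdio : writefln;

/// Bytes needed to store a dense rows x cols matrix of element type T
/// (hypothetical helper; ignores any per-row or allocator overhead).
size_t matrixBytes(T)(size_t rows, size_t cols)
{
    return rows * cols * T.sizeof;
}

void main()
{
    // In D, int is 4 bytes and double is 8 bytes.
    static assert(double.sizeof == 2 * int.sizeof);

    writefln("int:    %s bytes", matrixBytes!int(5000, 6000));
    writefln("double: %s bytes", matrixBytes!double(5000, 6000));
}
```

That is 120,000,000 bytes per int matrix against 240,000,000 bytes per double matrix, so a memory-bound sum has to move twice as much data in the double case.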
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to p.shkadzko
On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
> void main() {
> int[][] m1 = rndMatrix(10, 2, 3);
> int[][] m2 = rndMatrix(10, 2, 3);
>
> auto c = m1[] + m2[];
> }
I think you're trying to do this:
int[][] m1 = rndMatrix(10, 2, 3);
int[][] m2 = rndMatrix(10, 2, 3);
int[][] m3;
m3.length = m1.length;
foreach(i; 0..m1.length)
{
    m3[i].length = m1[i].length;
    m3[i][] = m1[i][] + m2[i][];
}
But of course that's not the best solution :)
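One possible refinement of the loop above, offered as a sketch rather than a recommendation: allocate a single backing buffer and let every result row be a slice of it, so the sum needs two allocations instead of one per row. `addRows` is a hypothetical name, and the sketch assumes both inputs have matching shapes.

```d
/// Row-wise sum of two jagged arrays with matching shapes (hypothetical helper).
/// All result rows are slices of one backing allocation.
int[][] addRows(int[][] a, int[][] b)
{
    assert(a.length == b.length, "row count mismatch");

    // Count the elements so the backing buffer can be allocated once.
    size_t total;
    foreach (row; a)
        total += row.length;

    auto backing = new int[](total);
    auto result = new int[][](a.length);
    size_t offset;
    foreach (i, row; a)
    {
        assert(row.length == b[i].length, "row length mismatch");
        result[i] = backing[offset .. offset + row.length];
        result[i][] = row[] + b[i][]; // array operation with destination memory
        offset += row.length;
    }
    return result;
}

void main()
{
    auto c = addRows([[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]);
    assert(c == [[11, 22, 33], [44, 55, 66]]);
}
```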
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to 9il
On Thursday, 27 February 2020 at 16:39:15 UTC, 9il wrote:
> [snip]
> A few performance nitpicks about your example, to make the benchmark fair against the test:
> 1. Random (default) is slower than Xorshift.
> 2. double is twice as large as int and requires twice as much memory, so it would be twice as slow as int for large matrixes.
>
> Check the prev. post, we have posted almost in the same time ;)
> https://forum.dlang.org/post/izoflhyerkiladngyrov@forum.dlang.org
Those differences largely came from a lack of attention to detail. I didn't notice the Xorshift until after I posted. I used double because it's such a force of habit for me to use continuous distributions.
I came across this in the documentation.
UniformVariable!T uniformVariable(T = double)(in T a, in T b)
if(isIntegral!T)
and did a double-take until I read the note associated with it in the source.
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to 9il
On Thursday, 27 February 2020 at 16:31:07 UTC, 9il wrote:
> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>> Is there a better way without relying on mir.ndslice?
>
> ndslice Poker Face
>
> /+dub.sdl:
> dependency "mir-algorithm" version="~>3.7.17"
> dependency "mir-random" version="~>2.2.10"
> +/
> import mir.ndslice;
> import mir.random: threadLocal;
> import mir.random.variable: uniformVar;
> import mir.random.algorithm: randomSlice;
> import mir.random.engine.xorshift;
>
> void main() {
> Slice!(int*, 2) m1 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
> Slice!(int*, 2) m2 = threadLocal!Xorshift.randomSlice(uniformVar!int(0, 10), [2, 3]);
> Slice!(int*, 2) c = slice(m1 + m2);
> }
Yes, mir.ndslice is a straightforward choice for multidimensional arrays. I shall do some benchmarks with it next. But first, I'll try to do it with standard D operations and see roughly how it compares against numpy's C implementation.
February 27, 2020 Re: How to sum multidimensional arrays?
Posted in reply to bachmeier
On Thursday, 27 February 2020 at 15:48:53 UTC, bachmeier wrote:
> On Thursday, 27 February 2020 at 14:15:26 UTC, p.shkadzko wrote:
>
> [...]
>
>> This works but it does not look very efficient considering we flatten and then calling array twice. It will get even worse with 3D arrays. Is there a better way without relying on mir.ndslice?
>
> Is there a reason you can't create a struct around a double[] like this?
>
> struct Matrix {
> double[] data;
> }
>
> Then to add Matrix A to Matrix B, you use A.data[] + B.data[]. But since I'm not sure what exactly you're doing, maybe that won't work.
Right! Ok, here is how I do it.
```
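import std.array : array;
import std.range : chunks;

struct Matrix(T)
{
    T[] elems; // flat storage, one row after another
    int cols;  // number of elements per row

    // Builds a T[][] whose rows are slices of elems (elements are not copied).
    T[][] to2D()
    {
        return elems.chunks(cols).array;
    }
}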
```
and Matrix summing and random array generator functions
```
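import std.array : array;
import std.random : Xorshift, uniform;
import std.range : generate, take;

auto matrixSum(Matrix!int m1, Matrix!int m2)
{
    Matrix!int m3;
    m3.cols = m1.cols;
    m3.elems.length = m1.elems.length;
    // Array operation with explicit destination memory.
    m3.elems[] = m1.elems[] + m2.elems[];
    return m3.to2D;
}

static T[] rndArr(T)(in T max, in int elems)
{
    Xorshift rnd; // default-seeded: yields the same sequence on every run
    return generate(() => uniform(0, max, rnd)).take(elems).array;
}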
```
Then we do the following
```
auto m1 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
auto m2 = Matrix!int(rndArr!int(10, 5000 * 6000), 6000);
auto m3 = matrixSum(m1, m2);
```
And it works effortlessly!
Sum of two 5000 x 6000 int arrays takes just 0.105 sec! (Measured on a Windows machine with a weaker CPU, though.)
I bet using mir.ndslice instead of D arrays would be even faster.
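For anyone who wants to reproduce a timing like this, here is a minimal sketch using `std.datetime.stopwatch`. The helper names `randomInts` and `addFlat` are hypothetical, the matrix size is deliberately smaller than the 5000 x 6000 used above, and a single unrepeated measurement like this is only a rough indication.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : Xorshift, uniform, unpredictableSeed;
import std.stdio : writefln;

/// Fills a flat array with n random ints in [0, max) (hypothetical helper).
int[] randomInts(int max, size_t n, ref Xorshift rnd)
{
    auto result = new int[](n);
    foreach (ref x; result)
        x = uniform(0, max, rnd);
    return result;
}

/// Element-wise sum of two equally long flat arrays (hypothetical helper).
int[] addFlat(const int[] a, const int[] b)
{
    assert(a.length == b.length, "length mismatch");
    auto result = new int[](a.length);
    result[] = a[] + b[];
    return result;
}

void main()
{
    // Assumed sizes for a quick run; scale up to match the thread's benchmark.
    enum rows = 500;
    enum cols = 600;

    auto rnd = Xorshift(unpredictableSeed);
    auto a = randomInts(10, rows * cols, rnd);
    auto b = randomInts(10, rows * cols, rnd);

    // Time only the summation, not the random data generation.
    auto sw = StopWatch(AutoStart.yes);
    auto c = addFlat(a, b);
    sw.stop();

    writefln("summed %s elements in %s usecs", c.length, sw.peek.total!"usecs");
}
```

Keeping the data generation outside the timed region matters here, since filling the arrays with random numbers can easily cost more than the sum itself.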
Copyright © 1999-2021 by the D Language Foundation