Jump to page: 1 2
Thread overview
Parallelization of a large array
Mar 10, 2015
Dennis Ritchie
Mar 10, 2015
safety0ff
Mar 10, 2015
Dennis Ritchie
Mar 10, 2015
Dennis Ritchie
Mar 10, 2015
Dennis Ritchie
Mar 10, 2015
Meta
Mar 10, 2015
Ali Çehreli
Mar 11, 2015
Meta
Mar 10, 2015
anonymous
Mar 10, 2015
Ali Çehreli
Mar 10, 2015
Dennis Ritchie
Mar 10, 2015
Ali Çehreli
Mar 10, 2015
Dennis Ritchie
Mar 11, 2015
Russel Winder
Mar 11, 2015
Ali Çehreli
Mar 11, 2015
Russel Winder
March 10, 2015
Hi.
How to parallelize a large array to check for the presence of an element matching the value with the data?

std.stdio;
std.algorithm;
std.parallelism;

void main() {

	int[] a = new int[1000000];

	foreach (i, ref elem; a)
		elem = i;

	/*if (find(parallel(a), 895639).length != 0)
		writeln("Yes");
	else
		writeln("No");*/
}
March 10, 2015
On Tuesday, 10 March 2015 at 20:41:14 UTC, Dennis Ritchie wrote:
> Hi.
> How to parallelize a large array to check for the presence of an element matching the value with the data?

Here's a simple method (warning: has pitfalls):

import std.stdio;
import std.parallelism;

void main()
{
    int[] a = new int[1000000];

    foreach (i, ref elem; a)
        elem = cast(int)i;

    bool found;
    foreach (elem; a.parallel)
        if (elem == 895639)
            found = true;

    if (found)
        writeln("Yes");
    else
        writeln("No");
}
March 10, 2015
On Tuesday, 10 March 2015 at 20:41:14 UTC, Dennis Ritchie wrote:
> Hi.
> How to parallelize a large array to check for the presence of an element matching the value with the data?
>
> std.stdio;
> std.algorithm;
> std.parallelism;

You forgot a couple "import"s here.

>
> void main() {
>
> 	int[] a = new int[1000000];
>
> 	foreach (i, ref elem; a)
> 		elem = i;

Type mismatch here. i is a size_t, but elem is an int.

>
> 	/*if (find(parallel(a), 895639).length != 0)
> 		writeln("Yes");
> 	else
> 		writeln("No");*/
> }

I guess you'd have to write your own "find". Since std.algorithm.find is just linear search, that shouldn't be hard. Something like this:

bool found = false;
foreach(x; parallel(a))
{
    if(x == 895639)
    {
        found = true;
        /* Maybe figure out how to break all parallel loops. */
    }
}

std.algorithm.find would work on mere input ranges, and it would return the tail of the range and not just a bool. Both of those don't make sense with a parallel search, though.

Also, with a trivial predicate like integer equality, parallelization might not buy you anything.
March 10, 2015
On Tuesday, 10 March 2015 at 21:15:17 UTC, safety0ff wrote:
> On Tuesday, 10 March 2015 at 20:41:14 UTC, Dennis Ritchie wrote:
>> Hi.
>> How to parallelize a large array to check for the presence of an element matching the value with the data?
>
> Here's a simple method (warning: has pitfalls):
>
> import std.stdio;
> import std.parallelism;
>
> void main()
> {
>     int[] a = new int[1000000];
>
>     foreach (i, ref elem; a)
>         elem = cast(int)i;
>
>     bool found;
>     foreach (elem; a.parallel)
>         if (elem == 895639)
>             found = true;
>
>     if (found)
>         writeln("Yes");
>     else
>         writeln("No");
> }

Thanks.
March 10, 2015
On Tuesday, 10 March 2015 at 21:27:42 UTC, Dennis Ritchie wrote:
> Thanks.

No, it does not suit me, because of the parallel array in a foreach loop there is no break.

import std.stdio;
import std.algorithm;
import std.parallelism;

void main() {

	int b = 2;

	auto a = [1, 2, 2, 3];

	if (find(a, b).length != 0)
		writeln("Yes_");

	foreach (elem; a.parallel)
		if (elem == b)
			writeln("Yes");		// prints "Yes" twice
}
March 10, 2015
On Tuesday, 10 March 2015 at 22:11:57 UTC, Dennis Ritchie wrote:
> No, it does not suit me, because of the parallel array in a foreach loop there is no break.

I already understood everything: found = true;
March 10, 2015
On Tuesday, 10 March 2015 at 22:11:57 UTC, Dennis Ritchie wrote:
> On Tuesday, 10 March 2015 at 21:27:42 UTC, Dennis Ritchie wrote:
>> Thanks.
>
> No, it does not suit me, because of the parallel array in a foreach loop there is no break.
>
> import std.stdio;
> import std.algorithm;
> import std.parallelism;
>
> void main() {
>
> 	int b = 2;
>
> 	auto a = [1, 2, 2, 3];
>
> 	if (find(a, b).length != 0)
> 		writeln("Yes_");
>
> 	foreach (elem; a.parallel)
> 		if (elem == b)
> 			writeln("Yes");		// prints "Yes" twice
> }

Just add a condition variable.

import std.stdio;
import std.algorithm;
import std.parallelism;

void main() {

	int b = 2;

	auto a = [1, 2, 2, 3];

	if (find(a, b).length != 0)
		writeln("Yes_");
	
	auto found = false;

	foreach (elem; a.parallel)
		if (!found && elem == b)
		{
			writeln("Yes");
			found = true;
		}
}

This program pints:

Yes_
Yes
March 10, 2015
On 03/10/2015 01:41 PM, Dennis Ritchie wrote:
> Hi.
> How to parallelize a large array to check for the presence of an element
> matching the value with the data?
>
> std.stdio;
> std.algorithm;
> std.parallelism;
>
> void main() {
>
>      int[] a = new int[1000000];
>
>      foreach (i, ref elem; a)
>          elem = i;
>
>      /*if (find(parallel(a), 895639).length != 0)
>          writeln("Yes");
>      else
>          writeln("No");*/
> }

It is possible by accessing the actual range by chunks:

import std.stdio;
import std.algorithm;
import std.parallelism;
import std.range;
import std.conv;

void main()
{
    const size_t elementCount = 895640;
    int[] a = iota(elementCount)
              .map!(i => i.to!int)
              .array;

    const chunkSize = a.length / taskPool.size;
    auto chunks = a.chunks(chunkSize);
    bool[] results = chunks.taskPool.map!(chunk => chunk.canFind(895639));

    writeln(results.canFind(true) ? "Yes" : "No");
}

We can hope to make it simpler by taking advantage of parallel map but it requires a static local function or a global function (a lambda cannot be used):

    static bool canFindIt(Range)(Range range)
    {
        return range.canFind(895639);
    }

    auto results = taskPool.map!canFindIt(chunks);

Ali

March 10, 2015
On 03/10/2015 03:16 PM, Meta wrote:

> Just add a condition variable.
>
> import std.stdio;
> import std.algorithm;
> import std.parallelism;
>
> void main() {
>
>      int b = 2;
>
>      auto a = [1, 2, 2, 3];
>
>      if (find(a, b).length != 0)
>          writeln("Yes_");
>
>      auto found = false;
>
>      foreach (elem; a.parallel)
>          if (!found && elem == b)
>          {
>              writeln("Yes");
>              found = true;

I thought about the same solution but then realized that it's a race condition, which needs to be taken care of. It's true that the value always changes from false to true but still...

>          }
> }

Ali

March 10, 2015
On Tuesday, 10 March 2015 at 22:34:34 UTC, Ali Çehreli wrote:
>
> It is possible by accessing the actual range by chunks:
>
> import std.stdio;
> import std.algorithm;
> import std.parallelism;
> import std.range;
> import std.conv;
>
> void main()
> {
>     const size_t elementCount = 895640;
>     int[] a = iota(elementCount)
>               .map!(i => i.to!int)
>               .array;
>
>     const chunkSize = a.length / taskPool.size;
>     auto chunks = a.chunks(chunkSize);
>     bool[] results = chunks.taskPool.map!(chunk => chunk.canFind(895639));
>
>     writeln(results.canFind(true) ? "Yes" : "No");
> }
>
> We can hope to make it simpler by taking advantage of parallel map but it requires a static local function or a global function (a lambda cannot be used):
>
>     static bool canFindIt(Range)(Range range)
>     {
>         return range.canFind(895639);
>     }
>
>     auto results = taskPool.map!canFindIt(chunks);

Thanks.
« First   ‹ Prev
1 2