June 18, 2020 Parallel array append using std.parallelism?

I have an array of input data that I'm looping over, and, based on some condition, generate new items that are appended onto a target array (which may already contain data). Since the creation of new items is quite expensive, I'm thinking to parallelize it with parallel foreach.
To avoid data races, my thought is for each generated item to be appended to thread-specific temporary arrays, that after the parallel foreach get sequentially appended to the target array. Something like this:
    Item[] targetArray = ...; // already contains data
    Item[][nThreads] tmp;
    foreach (elem; input.parallel) {
        if (condition(elem)) {
            auto output = expensiveComputation(elem);
            tmp[threadId] ~= output;
        }
    }
    foreach (a; tmp)
        targetArray ~= a;
Is there an easy way to achieve this with std.parallelism? I looked over the API but there doesn't seem to be any obvious way for a task to know which thread it's running in, in order to know which tmp array it should append to. If possible I'd like to avoid having to manually assign tasks to threads.
T
--
Questions are the beginning of intelligence, but the fear of God is the beginning of wisdom.

June 19, 2020 Re: Parallel array append using std.parallelism?

Posted in reply to H. S. Teoh

On Thursday, 18 June 2020 at 14:43:54 UTC, H. S. Teoh wrote:
> I have an array of input data that I'm looping over, and, based on some condition, generate new items that are appended onto a target array (which may already contain data). Since the creation of new items is quite expensive, I'm thinking to parallelize it with parallel foreach.
>
> To avoid data races, my thought is for each generated item to be appended to thread-specific temporary arrays, that after the parallel foreach get sequentially appended to the target array. Something like this:
>
>     Item[] targetArray = ...; // already contains data
>     Item[][nThreads] tmp;
>     foreach (elem; input.parallel) {
>         if (condition(elem)) {
>             auto output = expensiveComputation(elem);
>             tmp[threadId] ~= output;
>         }
>     }
>     foreach (a; tmp)
>         targetArray ~= a;
>
> Is there an easy way to achieve this with std.parallelism? I looked over the API but there doesn't seem to be any obvious way for a task to know which thread it's running in, in order to know which tmp array it should append to. If possible I'd like to avoid having to manually assign tasks to threads.

There's an example of exactly this in std.parallelism: https://dlang.org/phobos/std_parallelism.html#.TaskPool.workerIndex

In short:

    Item[] targetArray = ...; // already contains data
    // Get thread count from taskPool
    Item[][] tmp = new Item[][taskPool.size+1];
    foreach (elem; input.parallel) {
        if (condition(elem)) {
            auto output = expensiveComputation(elem);
            // Use workerIndex as index
            tmp[taskPool.workerIndex] ~= output;
        }
    }
    foreach (a; tmp)
        targetArray ~= a;

--
Simen
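
For anyone who wants to try the workerIndex approach end to end, here is a minimal self-contained sketch. The `Item` alias, `condition`, `expensiveComputation`, and the `appendParallel` wrapper are hypothetical stand-ins invented for illustration; they are not from the thread. The `+ 1` on the bucket count is there because, per the std.parallelism documentation, `workerIndex` is 0 for any thread outside the pool (such as the main thread, which also runs loop iterations) and 1 through `size` for the pool's workers.

```d
import std.algorithm : sort;
import std.array : array;
import std.parallelism : parallel, taskPool;
import std.range : iota;

// Hypothetical stand-ins for the thread's Item, condition and
// expensiveComputation; replace with the real ones.
alias Item = long;

bool condition(int elem) { return elem % 3 == 0; }

Item expensiveComputation(int elem) { return cast(Item) elem * elem; }

// Hypothetical wrapper: appends the generated items to targetArray
// and returns the (possibly reallocated) result.
Item[] appendParallel(Item[] targetArray, int[] input)
{
    // One bucket per possible workerIndex value: 0 for threads that
    // are not pool workers, 1 .. taskPool.size for the workers.
    auto tmp = new Item[][](taskPool.size + 1);

    foreach (elem; input.parallel)
    {
        if (condition(elem))
        {
            // Each thread only ever touches its own bucket,
            // so no explicit locking is needed here.
            tmp[taskPool.workerIndex] ~= expensiveComputation(elem);
        }
    }

    // Sequential merge after the parallel loop has finished.
    foreach (a; tmp)
        targetArray ~= a;
    return targetArray;
}

void main()
{
    Item[] target = [-1, -2]; // already contains data
    auto input = iota(0, 1000).array;

    auto result = appendParallel(target, input);

    // The existing data stays at the front.
    Item[] head = [-1, -2];
    assert(result[0 .. 2] == head);

    // The appended items match a sequential run once sorted.
    Item[] expected;
    foreach (e; input)
        if (condition(e))
            expected ~= expensiveComputation(e);

    auto tail = result[2 .. $].dup;
    sort(tail);
    assert(tail == expected);
}
```

Note that the order of the appended items depends on how iterations happen to be distributed across threads, so it can differ from run to run. If the original input order matters, the results would need to be sorted or indexed afterwards.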