On 4/3/23 6:02 PM, Paul wrote:
>On Sunday, 2 April 2023 at 15:32:05 UTC, Steven Schveighoffer wrote:
> >It's important to note that parallel doesn't iterate the range in parallel, it just runs the body in parallel limited by your CPU count.
>?!?
So for example, if you have:
foreach(i; iota(0, 2_000_000).parallel)
{
    runExpensiveTask(i);
}
The foreach is run on the main thread, gets a 0, then hands off to a task thread runExpensiveTask(0). Then it gets a 1, and hands off to a task thread runExpensiveTask(1), etc. The iteration is not expensive, and is not done in parallel.
On the other hand, what you shouldn't do is:
foreach(i; iota(0, 2_000_000).map!(x => runExpensiveTask(x)).parallel)
{
}
as this will run the expensive task as part of the iteration, before any work is handed off to a task thread.
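If the expensive call really does belong in a map, std.parallelism has its own parallel map, taskPool.amap, which applies the function on the pool and returns an array. A minimal sketch, assuming runExpensiveTask is a plain function that returns a value (the body below is a made-up stand-in, and the range is smaller than the one above):

```d
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical stand-in for the expensive work.
ulong runExpensiveTask(int i)
{
    ulong acc = 0;
    foreach (k; 0 .. 1_000)
        acc += (cast(ulong) i * k) % 7;
    return acc;
}

void main()
{
    // amap evaluates the function in parallel and eagerly fills an array.
    auto results = taskPool.amap!runExpensiveTask(iota(0, 100_000));
    assert(results.length == 100_000);
    assert(results[0] == 0); // i == 0 makes every term zero
}
```

amap needs a random-access range with a length, which iota provides.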
> >If your foreach body takes a global lock (like writeln(i);), then it's not going to run any faster (probably slower actually).
>Ok I did have some debug writelns I commented out.
And did it help? Another thing that takes a global lock is memory allocation.
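To make the point about locks concrete: a body that only computes and writes into its own slot of a preallocated array takes no lock, and any debug output can be printed afterwards from the main thread. A sketch, with runExpensiveTask again a hypothetical placeholder:

```d
import std.parallelism : parallel;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical stand-in; does no allocation and no I/O.
ulong runExpensiveTask(int i)
{
    ulong acc = 0;
    foreach (k; 0 .. 1_000)
        acc += (cast(ulong) i * k) % 7;
    return acc;
}

void main()
{
    enum n = 100_000;
    auto results = new ulong[n]; // one allocation, before the loop

    foreach (i; iota(0, n).parallel)
    {
        // No writeln, no `new`, no appending in here:
        // each iteration touches only results[i].
        results[i] = runExpensiveTask(i);
    }

    // Debug output happens once, after the parallel part is done.
    writeln("first: ", results[0], ", last: ", results[$ - 1]);
}
```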
> >Also make sure you have more than one logical CPU.
>I have 8.
It's dependent on the work being done, but you should see a roughly 8x speedup as long as the overhead of distributing tasks is not significant compared to the work being done.
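A rough way to check this on your own machine is to time the same loop with and without parallel. This is only a sketch: runExpensiveTask is a hypothetical stand-in, the sizes are arbitrary, and a single run like this is a sanity check rather than a proper benchmark.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : parallel, totalCPUs;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical stand-in for the expensive work.
ulong runExpensiveTask(int i)
{
    ulong acc = 0;
    foreach (k; 0 .. 1_000)
        acc += (cast(ulong) i * k) % 7;
    return acc;
}

void main()
{
    enum n = 200_000;
    auto results = new ulong[n];

    writeln("logical CPUs: ", totalCPUs);

    // Serial baseline.
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; iota(0, n))
        results[i] = runExpensiveTask(i);
    immutable serial = sw.peek;

    // Same work, bodies distributed over the default task pool.
    sw.reset();
    foreach (i; iota(0, n).parallel)
        results[i] = runExpensiveTask(i);
    immutable par = sw.peek;

    writeln("serial:   ", serial.total!"msecs", " ms");
    writeln("parallel: ", par.total!"msecs", " ms");
}
```

If the body is too cheap, the parallel number will be dominated by the cost of distributing tasks, which is the caveat above.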
-Steve