foreach (i; taskPool.parallel(0..2_000_000)
April 01, 2023

Thanks in advance for any assistance.

As the subject line suggests, can I do something like this?

foreach (i; taskPool.parallel(0..2_000_000))

Obviously this exact syntax doesn't work but I think it expresses the gist of my challenge.

April 01, 2023

On 4/1/23 2:25 PM, Paul wrote:

> Thanks in advance for any assistance.
>
> As the subject line suggests, can I do something like this?
>
> foreach (i; taskPool.parallel(0..2_000_000))
>
> Obviously this exact syntax doesn't work but I think it expresses the gist of my challenge.

import std.range;

foreach(i; iota(0, 2_000_000).parallel)
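
For anyone trying the line above as a complete program, here is a minimal sketch. It assumes `std.parallelism` is imported as well, since that is the module that provides `.parallel`. The function `squareRoots` and the per-index result array are hypothetical, chosen only so that each iteration writes to its own slot and needs no locking.

```d
import std.math : sqrt;
import std.parallelism : parallel; // also valid: taskPool.parallel(iota(0, n))
import std.range : iota;
import std.stdio : writeln;

// Hypothetical example workload: fill results[i] with sqrt(i).
// Each iteration touches only its own element, so there is no data race.
double[] squareRoots(int n) {
    auto results = new double[n];
    foreach (i; iota(0, n).parallel) {
        results[i] = sqrt(cast(double) i);
    }
    return results;
}

void main() {
    writeln(squareRoots(5));
}
```

The `0..2_000_000` slice syntax from the question only works directly in a plain `foreach`; `iota` turns the same bounds into a range that `parallel` can accept.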

-Steve

April 01, 2023

Thanks Steve.

April 01, 2023
> import std.range;
>
> foreach(i; iota(0, 2_000_000).parallel)
>
> -Steve

Is there a way to verify that it split up the work into tasks/threads? The example you gave me works (it compiles without errors), but the execution time is the same as the non-parallel version. They both take about 6 secs to execute. totalCPUs tells me I have 8 CPUs available.

April 01, 2023

On Saturday, 1 April 2023 at 18:30:32 UTC, Steven Schveighoffer wrote:

> On 4/1/23 2:25 PM, Paul wrote:
>
> import std.range;
>
> foreach(i; iota(0, 2_000_000).parallel)
>
> -Steve

Is there a way to tell if the parallelism actually divided up the work? Both versions of my program run in the same time ~6 secs.

April 01, 2023
On 4/1/23 15:30, Paul wrote:

> Is there a way to verify that it split up the work into tasks/threads ...?

It is hard to see the difference unless there is actual work in the loop that takes time. You can add a Thread.sleep call. (Commented-out in the following program.)

Another option is to monitor a task manager like 'top' on Unix-based systems. It should show multiple threads for the same program.

However, I will do something unspeakably wrong and take advantage of undefined behavior below. :) Since iteration count is an even number, the 'sum' variable should come out as 0 in the end. With .parallel it doesn't because multiple threads are stepping on each other's toes (values):

import std;

void main() {
    long sum;

    foreach(i; iota(0, 2_000_000).parallel) {
        // import core.thread;
        // Thread.sleep(1.msecs);

        if (i % 2) {
            ++sum;

        } else {
            --sum;
        }
    }

    if (sum == 0) {
        writeln("We highly likely worked serially.");

    } else {
        writefln!"We highly likely worked in parallel because %s != 0."(sum);
    }
}

If you remove .parallel, 'sum' will always be 0.
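
For contrast, here is a sketch of the same loop without the data race. It assumes `core.atomic` is acceptable for the update; the function name `parallelSum` is hypothetical. Because every increment and decrement is atomic, the result no longer depends on how the threads interleave, so it cannot be used to detect parallelism the way the program above does.

```d
import core.atomic : atomicLoad, atomicOp;
import std.parallelism : parallel;
import std.range : iota;
import std.stdio : writeln;

// Same +1 / -1 pattern as above, but each update is an atomic
// read-modify-write on a shared variable.
long parallelSum() {
    shared long sum;
    foreach (i; iota(0, 2_000_000).parallel) {
        if (i % 2) {
            atomicOp!"+="(sum, 1);
        } else {
            atomicOp!"-="(sum, 1);
        }
    }
    return atomicLoad(sum);
}

void main() {
    writeln(parallelSum()); // prints 0
}
```

Atomic updates on a single variable make every thread contend for it, so this is a correctness sketch rather than a fast way to sum.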

Ali

April 02, 2023

On Saturday, 1 April 2023 at 22:48:46 UTC, Ali Çehreli wrote:

> On 4/1/23 15:30, Paul wrote:
>
> > Is there a way to verify that it split up the work into tasks/threads ...?
>
> It is hard to see the difference unless there is actual work in the loop that takes time.

I always use the Rowland Sequence for such experiments. At least it's better than the Fibonacci Range:

struct RowlandSequence {
  import std.numeric : gcd;
  import std.format : format;
  import std.conv : text;

  long b, r, a = 3;
  enum empty = false;

  string[] front() {
    string result = format("%s, %s", b, r);
    return [text(a), result];
  }

  void popFront() {
    long result = 1;
    while(result == 1) {
      result = gcd(r++, b);
      b += result;
    }
    a = result;
  }
}

enum BP {
  f = 1, b = 7, r = 2, a = 1, /*
  f = 109, b = 186837516, r = 62279173, //*/
  s = 5
}

void main()
{
  RowlandSequence rs;
  long start, skip;

  with(BP) {
    rs = RowlandSequence(b, r);
    start = f;
    skip = s;
  }
  rs.popFront();

  import std.stdio, std.parallelism;
  import std.range : take;

  auto rsFirst128 = rs.take(128);
  foreach(r; rsFirst128.parallel)
  {
    if(r[0].length > skip)
    {
      start.writeln(": ", r);
    }
    start++;
  }
} /* PRINTS:

46: ["121403", "364209, 121404"]
48: ["242807", "728421, 242808"]
68: ["486041", "1458123, 486042"]
74: ["972533", "2917599, 972534"]
78: ["1945649", "5836947, 1945650"]
82: ["3891467", "11674401, 3891468"]
90: ["7783541", "23350623, 7783542"]
93: ["15567089", "46701267, 15567090"]
102: ["31139561", "93418683, 31139562"]
108: ["62279171", "186837513, 62279172"]

*/

The operation is simple: again just multiplication, addition, subtraction and modulus, so only four operations, but enough to overwhelm the CPU! I haven't seen rsFirst256 until now because I don't have a fast enough processor. Maybe you'll see it, but the first 108 are fast anyway.

PS: Decrease the value of skip to see the entire sequence. In cases where your processor power is not enough, you can create skip points. Check out BP...

SDB@79

April 02, 2023

On Sunday, 2 April 2023 at 04:34:40 UTC, Salih Dincer wrote:

> I haven't seen rsFirst256 until now...

Edit: I saw, I saw :)

I am struck with consternation! I've never seen these results before. Interesting, there is such a thing as parallel threading :)

Here are my skipPoints:

enum BP : long {
  //f, a, r, b = 7, /* <- beginning
   f = 113, r =   62279227, b =   186837678,
  // f = 146, r =  249134971, b =   747404910,
  // f = 161, r =  498270808, b =  1494812421,
  // f = 178, r = 1993083484, b =  5979250449,
  // f = 210, r = 3986167363, b = 11958502086,
  //*/
  s = 5
} /* PRINTS:
eLab@pico:~/Projeler$ ./RownlandSequence_v2
122: ["124559610, 373678827"]
128: ["249120240, 747360717"]
*/

It looks like there are 5 skipPoints in total up to 256 where it loops for a long time (while passing 1's).

SDB@79

April 02, 2023
On 4/1/23 6:32 PM, Paul wrote:
> On Saturday, 1 April 2023 at 18:30:32 UTC, Steven Schveighoffer wrote:
>> On 4/1/23 2:25 PM, Paul wrote:
>>
>> ```d
>> import std.range;
>>
>> foreach(i; iota(0, 2_000_000).parallel)
>> ```
>>
> 
> Is there a way to tell if the parallelism actually divided up the work?  Both versions of my program run in the same time ~6 secs.

It's important to note that parallel doesn't iterate the range in parallel, it just runs the body in parallel limited by your CPU count.

If your `foreach` body takes a global lock (like `writeln(i);`), then it's not going to run any faster (probably slower actually).
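
To illustrate the effect of a lock in the body, here is a rough sketch. It assumes a `synchronized` block stands in for whatever global lock the real body takes, and `Thread.sleep` stands in for real work. The names `independentBody`, `lockedBody` and the iteration count `n` are hypothetical.

```d
import core.thread : Thread;
import core.time : Duration, msecs;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : parallel;
import std.range : iota;
import std.stdio : writefln;

enum n = 200; // hypothetical iteration count, kept small so the demo is quick

// The body does independent work, so iterations on different threads overlap.
Duration independentBody() {
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; iota(0, n).parallel) {
        Thread.sleep(1.msecs);
    }
    return sw.peek;
}

// The body holds one lock for its whole duration, so iterations queue up
// and the loop takes at least n milliseconds regardless of CPU count.
Duration lockedBody() {
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; iota(0, n).parallel) {
        synchronized {
            Thread.sleep(1.msecs);
        }
    }
    return sw.peek;
}

void main() {
    writefln("independent body: %s", independentBody());
    writefln("locked body:      %s", lockedBody());
}
```

On a machine with several logical CPUs the first time should come out noticeably smaller than the second; on a single-CPU machine both would be about the same.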

If you can disclose more about what you are trying to do, it would be helpful.

Also make sure you have more than one logical CPU.
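
One way to check both things from inside the program is sketched below. It assumes the default `taskPool` is used and relies on `taskPool.workerIndex`, which is 0 for a thread outside the pool (here, the calling thread) and 1 through `taskPool.size` for pool workers. The function name `iterationsPerThread` is hypothetical.

```d
import std.parallelism : parallel, taskPool, totalCPUs;
import std.range : iota;
import std.stdio : writefln;

// Counts how many iterations each thread executed.
// Slot 0 is the calling thread; slots 1 .. taskPool.size are pool workers.
size_t[] iterationsPerThread(int n) {
    auto counts = new size_t[taskPool.size + 1];
    foreach (i; iota(0, n).parallel) {
        // Each thread increments only its own slot, so no lock is needed.
        ++counts[taskPool.workerIndex];
    }
    return counts;
}

void main() {
    auto counts = iterationsPerThread(2_000_000);
    writefln("totalCPUs = %s, pool workers = %s", totalCPUs, taskPool.size);
    foreach (idx, c; counts) {
        writefln("thread %s ran %s iterations", idx, c);
    }
}
```

If more than one slot ends up non-zero, the body was executed by more than one thread.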

-Steve

April 03, 2023

On Sunday, 2 April 2023 at 15:32:05 UTC, Steven Schveighoffer wrote:

> It's important to note that parallel doesn't iterate the range in parallel, it just runs the body in parallel limited by your CPU count.

?!?

> If your foreach body takes a global lock (like writeln(i);), then it's not going to run any faster (probably slower actually).

Ok I did have some debug writelns I commented out.

> If you can disclose more about what you are trying to do, it would be helpful.

This seems like it would be a lot of code and explaining but let me think about how to summarize.

> Also make sure you have more than one logical CPU.

I have 8.
