[Issue 4566] New: Calling functions in parallel with std.concurrency
August 02, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4566

           Summary: Calling functions in parallel with std.concurrency
           Product: D
           Version: D2
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: jmdavisProg@gmail.com


--- Comment #0 from Jonathan M Davis <jmdavisProg@gmail.com> 2010-08-01 17:03:55 PDT ---
Okay. I'm finding that it's a common pattern with my threads that I'm effectively trying to call a function, have it run in a separate thread, and then have the result returned to the calling thread (which at present is done through send/receive). Essentially, I end up with something like this in the calling thread:

auto tid1 = spawn(&func1, thisTid, /* args */);
auto tid2 = spawn(&func2, thisTid, /* args */);
auto tid3 = spawn(&func3, thisTid, /* args */);
T1 retValFromTid1;
T2 retValFromTid2;
T3 retValFromTid3;
receive(/* whatever is appropriate to receive sends from funcs */);
receive(/* whatever is appropriate to receive sends from funcs */);
receive(/* whatever is appropriate to receive sends from funcs */);
/* retValFromTidXs should now all have the values sent from the spawned threads */


and something like this in each spawned thread:

void func1(Tid parentTid, /* args */)
{
    /* obviously, in non-array cases, something other than idup has to be done
       to make the data immutable */
    send(parentTid, thisTid, callToActualFunc(/* args */).idup);
}
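
For reference, a minimal runnable sketch of the pattern described above. The worker function squares and its wrapper squaresThread are hypothetical names chosen for illustration; only spawn, send, receive, and thisTid come from std.concurrency. It shows the two copies (idup on the sending side, dup on the receiving side) that problem #2 below complains about.

```d
import std.concurrency;

// Hypothetical worker: an ordinary function returning mutable data.
int[] squares(int n)
{
    auto result = new int[](n);
    foreach (i, ref x; result)
        x = cast(int)(i * i);
    return result;
}

// Wrapper needed only so the function can run as a thread
// and pass its return value back to the parent.
void squaresThread(Tid parentTid, int n)
{
    // First copy: idup makes the result immutable so it can be sent.
    send(parentTid, thisTid, squares(n).idup);
}

void main()
{
    auto tid1 = spawn(&squaresThread, thisTid, 3);
    auto tid2 = spawn(&squaresThread, thisTid, 4);

    int[] ret1, ret2;
    foreach (_; 0 .. 2)
    {
        receive((Tid sender, immutable(int)[] data)
        {
            // Second copy: dup is needed to get mutable data back.
            if (sender == tid1)
                ret1 = data.dup;
            else
                ret2 = data.dup;
        });
    }

    assert(ret1 == [0, 1, 4]);
    assert(ret2 == [0, 1, 4, 9]);
}
```

The sender's Tid is included in each message so that the results can be matched to the right variable regardless of which thread finishes first.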


This causes 3 problems, 2 of which I think are solvable and one of which probably isn't.

1. The args passed to the spawned thread are often intended solely for that thread and never used again in the calling thread. Having to copy mutable data to make it immutable to pass it along to spawn() and then possibly having to copy it _again_ in the spawned thread to make it mutable again is not particularly efficient and definitely annoying. Ideally, there would be a way to indicate that ownership of the data is being handed over to the spawned thread and the calling thread would no longer have access to it.

2. Having to copy the data returned from the spawned thread in order to make it immutable, only to have to copy it again on the receiving end to make it mutable again, is also painfully inefficient. And unless you use something like Rebindable on the other end, you _have_ to copy it to make it mutable again - even if you could have used it as immutable - because the local variables that you're assigning it to can't be immutable, since you're setting them in receive() rather than when they're declared.

3. Having to use spawn(), receive(), and send() in this manner (not to mention possibly having to create a wrapper function just to make the function being called a thread and pass its return value back) is quite messy in comparison to just calling a function. There has got to be a cleaner way to do this. It's great if you intend to continually pass messages back and forth, but for the case where you're basically trying to call several unrelated functions in parallel and get their return values, this is messy.

With regards to #1, I'm not sure that it can be fixed, because it very quickly seems to become an issue similar to escaped references. You have to have a way to ensure that the calling thread doesn't hold onto references to the data being passed (or at least that it doesn't access them before the spawned threads have terminated). I'd love for it to be fixed, but I'm not sure that it can be.

With regards to #2 and #3, I think that they're totally fixable with a new set of functions in std.concurrency specifically for this case. That is, you have a function which you pass a function to call, the arguments for that function, and the variable which the function's return value will be assigned to - and it takes multiple such functions at a time (that may require creating a struct to hold the set of data for each function, but that wouldn't necessarily be a problem). And that function call would block until it had received the return values for each of those functions and set them to the variables which were given for that purpose. The functions being called would be normal functions with return values. Due to #1, they'd probably have to take immutable reference types - though they might be able to take const because the calling thread would be blocking until they terminated, and even if the same data were passed to each of the threads, it would have to be const for all of them, so they might be able to get away with const parameters rather than immutable. However, they would return as normal in either case, so you could generally avoid wrapper functions, and you wouldn't have to make the return value immutable (since you _know_ that the thread returning it can't alter it anymore, since it's terminating).

Essentially, you'd get something like this:

auto f1 = ThreadedFunc!(&func1, /* args */);
auto f2 = ThreadedFunc!(&func2, /* args */);
auto f3 = ThreadedFunc!(&func3, /* args */);
callThreadedFuncs(f1, f2, f3);
/* f1.retval, f2.retval, and f3.retval have the returned values */
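
One way such a blocking call could be approximated today is sketched below. This is not the proposed std.concurrency API: callThreadedFuncs here is a hypothetical helper built on core.thread, it takes delegates rather than ThreadedFunc objects, and squares and countSpaces are made-up example functions. It bypasses std.concurrency's isolation checks, which is only reasonable on the assumption that the caller does not touch the result variables until the call returns.

```d
import core.thread : Thread;

// Hypothetical helper (not part of Phobos): runs each job in its own
// thread and blocks until every one of them has finished.
void callThreadedFuncs(void delegate()[] jobs...)
{
    Thread[] threads;
    foreach (job; jobs)
    {
        auto t = new Thread(job);
        t.start();
        threads ~= t;
    }
    foreach (t; threads)
        t.join(); // by default rethrows an exception thrown by the job
}

// Ordinary functions with ordinary (mutable) return values.
int[] squares(int n)
{
    auto result = new int[](n);
    foreach (i, ref x; result)
        x = cast(int)(i * i);
    return result;
}

size_t countSpaces(string s)
{
    size_t n;
    foreach (char c; s)
        if (c == ' ')
            ++n;
    return n;
}

void main()
{
    int[] r1;
    size_t r2;

    // Each delegate writes only to its own variable, and the caller
    // reads them only after callThreadedFuncs has returned.
    callThreadedFuncs(
        { r1 = squares(4); },
        { r2 = countSpaces("a b c"); });

    assert(r1 == [0, 1, 4, 9]);
    assert(r2 == 2);
}
```

No wrapper functions are needed and nothing is copied to or from immutable, but the compiler does not enforce that the jobs avoid touching each other's data; that discipline is left to the caller.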


I don't know if that's the best way to handle this scenario, but it seems to me that it would be of great value to have a way in Phobos to simultaneously call a set of functions (waiting for all of them to finish before moving on) where the number of copies due to forced immutability is minimized if not eliminated. As it is, it's rather cumbersome and inefficient to use std.concurrency to effectively call multiple functions in parallel.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
February 03, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4566


Sean Kelly <sean@invisibleduck.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sean@invisibleduck.org


--- Comment #1 from Sean Kelly <sean@invisibleduck.org> 2011-02-03 15:42:47 PST ---
I'd like to allow Unique!T to be accepted by send(), which should solve your problem.  You could also cast to/from shared, though this is obviously somewhat nasty.  What you're trying to do seems like it may be more appropriate for a thread pool as well.  I believe there's one in std.parallelism, which is under review.
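
As a rough illustration of the thread pool approach, the sketch below uses std.parallelism as it exists in current Phobos (task, taskPool.put, and yieldForce); the module was still under review when this comment was written, so details may have differed then. The function squares is a hypothetical example.

```d
import std.parallelism : task, taskPool;

// Hypothetical worker: an ordinary function with a mutable return value.
int[] squares(int n)
{
    auto result = new int[](n);
    foreach (i, ref x; result)
        x = cast(int)(i * i);
    return result;
}

void main()
{
    // Create one task per call and hand each to the default pool.
    auto t1 = task!squares(3);
    auto t2 = task!squares(4);
    taskPool.put(t1);
    taskPool.put(t2);

    // yieldForce blocks until the task is done and returns its result.
    int[] r1 = t1.yieldForce;
    int[] r2 = t2.yieldForce;

    assert(r1 == [0, 1, 4]);
    assert(r2 == [0, 1, 4, 9]);
}
```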

September 01, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4566


Andrew Wiley <debio264@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |debio264@gmail.com


--- Comment #2 from Andrew Wiley <debio264@gmail.com> 2011-08-31 22:33:50 PDT ---
This pattern works well using the thread pool provided by std.parallelism, but the problems with immutable data going into and being returned from the function call still remain. I'm currently getting around this by casting to and from shared, but do you have any new thoughts on the Unique!T suggestion?
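
A minimal sketch of the cast-to-and-from-shared workaround with std.concurrency follows. The names worker and fetchFromWorker are hypothetical. The casts are unchecked promises: the sketch assumes the worker never touches the array after sending it, which the compiler does not verify.

```d
import std.concurrency;

// Hypothetical worker: builds mutable data and hands it to the parent.
void worker(Tid parent)
{
    int[] result = [1, 2, 3];
    // Cast instead of idup: no copy is made. The worker must not
    // use result after this point.
    send(parent, cast(shared(int)[]) result);
}

// Spawns the worker and returns its data as a plain mutable array.
int[] fetchFromWorker()
{
    spawn(&worker, thisTid);

    int[] mine;
    receive((shared(int)[] data)
    {
        // Cast back: again no copy. This is safe only because the
        // sender has given up the array.
        mine = cast(int[]) data;
    });
    return mine;
}

void main()
{
    auto data = fetchFromWorker();
    data[0] = 10; // the array is ordinary mutable data again
    assert(data == [10, 2, 3]);
}
```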
