Thread overview | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
August 29, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Since there seems to be interest in my parallelfuture library becoming std.parallelism even if it uses unchecked implicit sharing, I've cleaned up the code, removed anything that could possibly be construed as having been borrowed from Tango (which previously was just a few tiny, unoriginal ASM snippets) and improved the documentation. I'm requesting review of it. The code is available at: http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d A draft of how the documentation would look is available at: http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html BTW, how are we working the review process for new modules? Do they get reviewed here first, then in the NG, should they be posted directly to the NG, or what? --Dave |
August 29, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | I have tried it before, and based on that (unless you have made substantial changes since it's first incarnation as 'parallelfuture') I say it's good to go.
I only have two suggestions:
1. Can we call it std.parallel instead?
2. IMO, the killer feature of the module is the parallel foreach.
Therefore, I think parallel() should be moved to the top of the module,
so it's the first thing one sees upon reading the documentation.
-Lars
On Sun, 2010-08-29 at 13:56 -0400, David Simcha wrote:
> Since there seems to be interest in my parallelfuture library becoming std.parallelism even if it uses unchecked implicit sharing, I've cleaned up the code, removed anything that could possibly be construed as having been borrowed from Tango (which previously was just a few tiny, unoriginal ASM snippets) and improved the documentation. I'm requesting review of it. The code is available at:
>
> http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d
>
> A draft of how the documentation would look is available at:
>
> http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html
>
> BTW, how are we working the review process for new modules? Do they get reviewed here first, then in the NG, should they be posted directly to the NG, or what?
>
> --Dave
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
|
August 29, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to Lars Tandle Kyllingstad | This is basically the latest incarnation of ParallelFuture, but with
improved documentation. A lot has changed, though, since the first
incarnation. Here are some of the more important changes:
1. Parallel map and reduce.
2. Merge Task and Future.
3. foreach(i, elem; pool.parallel(range)) now works.
4. ref foreach now works even for non-random access ranges. (It's
implemented by incrementally copying pointers into an array.)
5. Worker-local storage.
On 8/29/2010 4:42 PM, Lars Tandle Kyllingstad wrote:
> I have tried it before, and based on that (unless you have made substantial changes since it's first incarnation as 'parallelfuture') I say it's good to go.
>
> I only have two suggestions:
>
> 1. Can we call it std.parallel instead?
>
> 2. IMO, the killer feature of the module is the parallel foreach.
> Therefore, I think parallel() should be moved to the top of the module,
> so it's the first thing one sees upon reading the documentation.
>
> -Lars
>
>
> On Sun, 2010-08-29 at 13:56 -0400, David Simcha wrote:
>> Since there seems to be interest in my parallelfuture library becoming std.parallelism even if it uses unchecked implicit sharing, I've cleaned up the code, removed anything that could possibly be construed as having been borrowed from Tango (which previously was just a few tiny, unoriginal ASM snippets) and improved the documentation. I'm requesting review of it. The code is available at:
>>
>> http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d
>>
>> A draft of how the documentation would look is available at:
>>
>> http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html
>>
>> BTW, how are we working the review process for new modules? Do they get reviewed here first, then in the NG, should they be posted directly to the NG, or what?
>>
>> --Dave
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
|
August 31, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | Point (3) is pretty cool. I just used your module for my current project at work, and the ability to get the index made the code a lot nicer.
Another question: Why have you chosen the default number of work units to be just two units per thread? In my experience, it's not uncommon that calculations are harder on some parts of the range than others, and then there is a risk of some cores running out of work to do. I'd think that having more work units, 3-4 per thread, say, would allow for better distribution of work between cores.
-Lars
On Sun, 2010-08-29 at 17:32 -0400, David Simcha wrote:
> This is basically the latest incarnation of ParallelFuture, but with improved documentation. A lot has changed, though, since the first incarnation. Here are some of the more important changes:
>
> 1. Parallel map and reduce.
> 2. Merge Task and Future.
> 3. foreach(i, elem; pool.parallel(range)) now works.
> 4. ref foreach now works even for non-random access ranges. (It's
> implemented by incrementally copying pointers into an array.)
> 5. Worker-local storage.
>
> On 8/29/2010 4:42 PM, Lars Tandle Kyllingstad wrote:
> > I have tried it before, and based on that (unless you have made substantial changes since it's first incarnation as 'parallelfuture') I say it's good to go.
> >
> > I only have two suggestions:
> >
> > 1. Can we call it std.parallel instead?
> >
> > 2. IMO, the killer feature of the module is the parallel foreach.
> > Therefore, I think parallel() should be moved to the top of the module,
> > so it's the first thing one sees upon reading the documentation.
> >
> > -Lars
> >
> >
> > On Sun, 2010-08-29 at 13:56 -0400, David Simcha wrote:
> >> Since there seems to be interest in my parallelfuture library becoming std.parallelism even if it uses unchecked implicit sharing, I've cleaned up the code, removed anything that could possibly be construed as having been borrowed from Tango (which previously was just a few tiny, unoriginal ASM snippets) and improved the documentation. I'm requesting review of it. The code is available at:
> >>
> >> http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d
> >>
> >> A draft of how the documentation would look is available at:
> >>
> >> http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html
> >>
> >> BTW, how are we working the review process for new modules? Do they get reviewed here first, then in the NG, should they be posted directly to the NG, or what?
> >>
> >> --Dave
> >> _______________________________________________
> >> phobos mailing list
> >> phobos at puremagic.com
> >> http://lists.puremagic.com/mailman/listinfo/phobos
> >
> > _______________________________________________
> > phobos mailing list
> > phobos at puremagic.com
> > http://lists.puremagic.com/mailman/listinfo/phobos
> >
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
|
August 31, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to Lars Tandle Kyllingstad | On 8/31/2010 6:22 AM, Lars Tandle Kyllingstad wrote:
> Point (3) is pretty cool. I just used your module for my current project at work, and the ability to get the index made the code a lot nicer.
>
> Another question: Why have you chosen the default number of work units to be just two units per thread? In my experience, it's not uncommon that calculations are harder on some parts of the range than others, and then there is a risk of some cores running out of work to do. I'd think that having more work units, 3-4 per thread, say, would allow for better distribution of work between cores.
>
> -Lars
Good point. I should probably change this, as the more I think about it the more I realize that I never use the default for the reason you mention. It seemed like a good idea in iteration 1, and then I just never reconsidered.
|
August 31, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | A few general comments: I'm wondering whether the use of a TaskPool should always be explicit. Perhaps a default pool should be considered to exist, and calling parallel() instead of pool.parallel() will use that? It's rare for apps to have multiple distinct pools anyway, so this may be an easy way to help obscure a tiny bit more of the implementation. Rather than making the use of 'break' inside parallel foreach undefined, why not say that it aborts the processing of the current block? Ideally, the syntax for map() and reduce() should match std.algorithm. Not sure what this means for blockSize though. Maybe it should be a template parameter? Regarding the code... There are a lot of places where you do: lock(); doSomething(); unlock(); This isn't exception safe, which isn't a big deal since the enclosed code won't throw anyway, but it's still bad form. Use this instead: synchronized (mutex) { doSomething(); } There's some black magic in the Mutex class that makes it work with the synchronized statement, so it will do exactly the same thing but be exception-safe as well. You can even do: class C { this() { mutex = new Mutex(this); } } to make the mutex the enclosing object's monitor so that "synchronized void func() {}" will lock the mutex. This may turn out to be necessary when/if shared is applied to the code (which will require core.sync to work properly with shared, etc... something I'm going to take care of shortly). When waiting on condition variables, there's a remote chance that wait() will return when there isn't actually something to process. Instead of: if (!condition) wait(); use: while (!condition) wait(); I saw the "if" approach in "AbstractTask* pop()"... I think that's the only one. In "ReturnType spinWait()" you throw "exception" if it exists but don't clear the variable. Will this be a problem? That's all I saw in a quick scan of the code. I'll trust that there are no races and play with it in a bit. |
August 31, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | On Tue, Aug 31, 2010 at 3:07 PM, Sean Kelly <sean at invisibleduck.org> wrote: > A few general comments: > > I'm wondering whether the use of a TaskPool should always be explicit. > Perhaps a default pool should be considered to exist, and calling > parallel() instead of pool.parallel() will use that? It's rare for apps to > have multiple distinct pools anyway, so this may be an easy way to help > obscure a tiny bit more of the implementation. > I could create a *default* lazily initialized Singleton for the TaskPool (a few other people I've showed this code to have requested this, too, so maybe it's a trend), but I absolutely insist on being able to manually initialize it. First of all, this gives the user the ability to manually decide how many threads to use. Secondly, I've had use cases where all threads in a pool are blocked waiting for the same really expensive computation to complete, so I fire up a new pool to do that really expensive computation in parallel. > > Rather than making the use of 'break' inside parallel foreach undefined, why not say that it aborts the processing of the current block? > Eventually I'll define it to do something, but I kept it undefined for now because I don't want to pin down what it does yet. Ideally I'd like it to not only stop the current block, but end the processing of the whole rest of the parallel foreach loop. This is more difficult to implement, though. > > Ideally, the syntax for map() and reduce() should match std.algorithm. Not > sure what this means for blockSize though. Maybe it should be a template > parameter? > It does match std.algorithm as far as I can tell, as long as you use the default block size and allow a buffer to be automatically allocated. If it doesn't, let me know, because I meant to make it match. > > > Regarding the code... > > There are a lot of places where you do: > > lock(); > doSomething(); > unlock(); > > This isn't exception safe, which isn't a big deal since the enclosed code won't throw anyway, but it's still bad form. Use this instead: > > synchronized (mutex) { > doSomething(); > } > I'm aware that that's not exception safe, but I did it anyhow because I knew that the enclosed code wouldn't throw. The reason it's like this is because I was experimenting with different locking mechanisms, such as making everything spinlock-based, at one point. Unless there are strong objections, I prefer to leave it like this so that I can easily experiment with different locking mechanisms in the future. > > There's some black magic in the Mutex class that makes it work with the synchronized statement, so it will do exactly the same thing but be exception-safe as well. You can even do: > > class C { this() { mutex = new Mutex(this); } } > > to make the mutex the enclosing object's monitor so that "synchronized void func() {}" will lock the mutex. This may turn out to be necessary when/if shared is applied to the code (which will require core.sync to work properly with shared, etc... something I'm going to take care of shortly). > > When waiting on condition variables, there's a remote chance that wait() will return when there isn't actually something to process. Instead of: > > if (!condition) > wait(); > > use: > > while (!condition) > wait(); > > I saw the "if" approach in "AbstractTask* pop()"... I think that's the only > one. > Thanks, will do. How does this happen, though? > > In "ReturnType spinWait()" you throw "exception" if it exists but don't clear the variable. Will this be a problem? > I can't see why it would be a problem. Also, if for some crazy reason the user handles the exception and then tries to get at the return value again, IMHO the correct behavior is to throw it again. If there's something I'm missing, please let me know. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100831/ea59b442/attachment.html> |
September 01, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | Previously, I wrote future module. http://bitbucket.org/repeatedly/scrap/src/tip/future.d This is not product level. This module for asynchronous calculation in my test. I think parallel features are very important (Currently, many core is default). In addition, some products support parallel-processing framework (MongoDB provides MapReduce for batch manipulation). So, I want such features in Phobos. This is a good candidate if other modules doesn't exist. At the moment, I am busy. Sorry, please wait the review. Masahiro On Mon, 30 Aug 2010 02:56:55 +0900, David Simcha <dsimcha at gmail.com> wrote: > Since there seems to be interest in my parallelfuture library becoming > std.parallelism even if it uses unchecked implicit sharing, I've cleaned > up the code, removed anything that could possibly be construed as having > been borrowed from Tango (which previously was just a few tiny, > unoriginal ASM snippets) and improved the documentation. I'm requesting > review of it. The code is available at: > > http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d > > A draft of how the documentation would look is available at: > > http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html > > BTW, how are we working the review process for new modules? Do they get reviewed here first, then in the NG, should they be posted directly to the NG, or what? > > --Dave |
August 31, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to David Simcha | On Aug 31, 2010, at 12:41 PM, David Simcha wrote: > > On Tue, Aug 31, 2010 at 3:07 PM, Sean Kelly <sean at invisibleduck.org> wrote: A few general comments: > > I'm wondering whether the use of a TaskPool should always be explicit. Perhaps a default pool should be considered to exist, and calling parallel() instead of pool.parallel() will use that? It's rare for apps to have multiple distinct pools anyway, so this may be an easy way to help obscure a tiny bit more of the implementation. > > I could create a default lazily initialized Singleton for the TaskPool (a few other people I've showed this code to have requested this, too, so maybe it's a trend), but I absolutely insist on being able to manually initialize it. First of all, this gives the user the ability to manually decide how many threads to use. Secondly, I've had use cases where all threads in a pool are blocked waiting for the same really expensive computation to complete, so I fire up a new pool to do that really expensive computation in parallel. Right. I've added some fancy behavior to the TaskPool itself so the thread count is a bit more dynamic, but that tends to overcomplicate the code. Your approach is probably better. > Rather than making the use of 'break' inside parallel foreach undefined, why not say that it aborts the processing of the current block? > > > When waiting on condition variables, there's a remote chance that wait() will return when there isn't actually something to process. Instead of: > > if (!condition) > wait(); > > use: > > while (!condition) > wait(); > > I saw the "if" approach in "AbstractTask* pop()"... I think that's the only one. > > Thanks, will do. How does this happen, though? Theoretical mumbo-jumbo, mostly. Here's the wiki entry: http://en.wikipedia.org/wiki/Spurious_wakeup For the most part, it's just good programming practice to double-check the invariant on wakeup. > > In "ReturnType spinWait()" you throw "exception" if it exists but don't clear the variable. Will this be a problem? > > I can't see why it would be a problem. Also, if for some crazy reason the user handles the exception and then tries to get at the return value again, IMHO the correct behavior is to throw it again. If there's something I'm missing, please let me know. I hadn't spent enough time with the code to know what the correct behavior should be, so I asked. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100831/7a8786f0/attachment-0001.html> |
August 31, 2010 [phobos] std.parallelism: Request for review/comment | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean Kelly | On Tue, 31 Aug 2010, Sean Kelly wrote: > On Aug 31, 2010, at 12:41 PM, David Simcha wrote: > > > > When waiting on condition variables, there's a remote chance that wait() will return when there isn't actually something to process. Instead of: > > > > if (!condition) > > wait(); > > > > use: > > > > while (!condition) > > wait(); > > > > I saw the "if" approach in "AbstractTask* pop()"... I think that's the only one. > > > > Thanks, will do. How does this happen, though? > > Theoretical mumbo-jumbo, mostly. Here's the wiki entry: http://en.wikipedia.org/wiki/Spurious_wakeup For the most part, it's just good programming practice to double-check the invariant on wakeup. Very much not theoretical. I've seen it happen more than a little. A lot of it can be blamed on using notify all rather than notify one type behavior. The other ways it can happen is signals for things like gc pause/resume, or pipe, or ... breaking in-progress sleeps. It's one of the class of issues that tends not to happen in small simple apps, so don't learn the risks until later in some large or complex app, wierd things happen. Anyway, gotta do it, it does happen. Later, Brad -------------- next part -------------- _______________________________________________ phobos mailing list phobos at puremagic.com http://lists.puremagic.com/mailman/listinfo/phobos |
Copyright © 1999-2021 by the D Language Foundation