May 15, 2015 Re: std.parallelism equivalents for posix fork and multi-machine processing

Posted in reply to Ola Fosheim Grøstad

On Thursday, 14 May 2015 at 20:56:16 UTC, Ola Fosheim Grøstad wrote:
> On Thursday, 14 May 2015 at 20:28:20 UTC, Laeeth Isharc wrote:
>> My own is a pragmatic commercial one. I have some problems which perhaps scale quite well, and rather than write it using fork directly, I would rather have a higher level wrapper along the lines of std.parallelism.
>
> Languages like Chapel and extended versions of C++ have built-in support for parallel computing that is relatively effortless and designed by experts (Cray/IBM etc) to cover common patterns in demanding batch processing for those who want something higher level than plain C++ (or in this case D which is pretty much the same thing).

Yes - I am sure that there is excellent stuff here, from which one may learn much: especially if approaching it from a more theoretical or enterprisey industrial scale perspective.

> However, you could consider combining single threaded processes in D with e.g. Python as a supervising process if the datasets allow it. You'll find lots of literature on Inter Process Communication (IPC) for Unix. Performance will be lower, but your own productivity might be higher, YMMV.

But why would one use Python when fork itself isn't hard to use in a narrow sense, and neither is the kind of interprocess communication I would like to do for the kind of tasks I have in mind? It just seems to make sense to have a light wrapper.

Just because some problems in parallel processing are hard doesn't seem to me a reason not to do some work on addressing the easier ones, where an imperfect (but real) solution may have great practical value.

Sometimes I have the sense when talking with you that the answer to any question is anything but D! ;) (But I am sure I must be mistaken!)

>> Perhaps such would be flawed and limited, but often something is better than nothing, even if not perfect. And I mention it on the forum only because usually I have found the problems I face turn out to be those faced by many others too.
>
> You need momentum in order to get from a raw state to something polished, so you essentially need a larger community that both have experience with the topic and a need for it in order to get a sensible framework that is maintained.

True. But we are not speaking of getting from a raw state to perfection but just starting to play with the problem. If Walter Bright had listened to well-intentioned advice, he wouldn't be in the compiler business, let alone have given us what became D. I am no Walter Bright, but this is an easier problem to start exploring, and this would be beyond the scope of anything I would do just by myself.

> If you can get away with it, the most common simplistic approach seems to be map-reduce. Because it is easy to distribute over many machines and there are frameworks that do the tedious bits for you.

Yes, indeed. But my question was more about the distinctions between processes and threads and the non-obvious implications for the design of such a framework.

Nice chatting.

Laeeth.
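As a rough illustration of the "light wrapper" over fork discussed in this post, here is a minimal sketch of a fork-based parallel map. It is written in Python for brevity rather than D, and it is not taken from the thread: the name `fork_map`, the round-robin split, and the use of pickle over pipes are all assumptions made for the example. It runs on POSIX systems only.

```python
import os
import pickle


def fork_map(func, items, workers=4):
    """Apply func to every item using forked child processes (POSIX only).

    Hypothetical helper: items are split round-robin across the children,
    and each child sends its pickled results back over its own pipe.
    """
    items = list(items)
    workers = max(1, min(workers, len(items)))
    children = []
    for w in range(workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute its share, write the results, then exit
            # immediately without running the parent's cleanup handlers.
            os.close(read_fd)
            status = 0
            try:
                part = [(i, func(items[i]))
                        for i in range(w, len(items), workers)]
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(part, out)
            except BaseException:
                status = 1
            os._exit(status)
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = [None] * len(items)
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as inp:
            data = inp.read()  # returns once the child closes its end
        _, status = os.waitpid(pid, 0)
        if status != 0 or not data:
            raise RuntimeError(f"worker {pid} failed")
        for i, value in pickle.loads(data):
            results[i] = value
    return results
```

For example, `fork_map(lambda x: x * x, range(5))` returns `[0, 1, 4, 9, 16]`. Because each child starts with a copy of the parent's memory, `func` can be a lambda or closure, but anything it returns has to be sent back explicitly, here as pickled data. That explicit return path is the main design difference from a thread-based wrapper.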
May 15, 2015 Re: std.parallelism equivalents for posix fork and multi-machine processing

Posted in reply to Laeeth Isharc

On Friday, 15 May 2015 at 00:07:15 UTC, Laeeth Isharc wrote:
> But why would one use Python when fork itself isn't hard to use in a narrow sense, and neither is the kind of interprocess communication I would like to do for the kind of tasks I have in mind. It just seems to make sense to have a light wrapper.

The managing process doesn't have to be fast, but should be easy to reconfigure. It is overall more effective (not efficient) to use a scripting language with a REPL for scripty tasks. Forking comes with its own set of pitfalls. The Unix way is to have a conglomerate of simple processes tied together with a script. Overall easier to debug and modify.

> Just because some problems in parallel processing are hard doesn't seem to me a reason not to do some work on addressing the easier ones that may in a practical sense have great value in having an imperfect (but real) solution for. Sometimes I have the sense when talking with you that the answer to any question is anything but D! ;) (But I am sure I must be mistaken!)

I would have said the same thing about Rust and Nim too. Overall, what other people do with a tool affects the ecosystem and maturity. If you do system level programming you are less affected by the ecosystem than when you do higher level task-oriented programming.

What is your mission, to solve a problem effectively now or to start building a new framework with a time horizon measured in years? You have to decide this first. Then you have to decide what is more expensive, your time or spending twice as much on CPU power (whether it is hardware or rented time at a datacenter).

> True. But we are not speaking of getting from a raw state to perfection but just starting to play with the problem. If Walter Bright had listened to well-intentioned advice, he wouldn't be in the compiler business, let alone have given us what became D.

He set out to build a new framework with a time horizon measured in decades. That's perfectly reasonable and what you have to expect when starting on a new language. If you want to build a framework for a specific use you need both the theoretical insights and the practical experience in order to complete it in a timely manner. You need many, many iterations to get to a state where it is better (than whatever people use today). Which is why most (sensible) engineers will pick existing solutions that are receiving polish, rather than the next big thing.

> Yes, indeed. But my question was more about the distinctions between processes and threads and the non-obvious implications for the design of such a framework.

If you want to use fork(), you might as well use threads. The main distinction is that with processes you have to be explicit about what resources to share, but after a fork() you also risk ending up in an inconsistent state if you aren't careful. With a fork-based solution you still need to deal with a different level of complexity than you get with a Unixy conglomerate of simple programs that cooperate. The Unix way is easier to debug and test, but slower than an optimized multi-threaded solution (and marginally slower than a process that forks itself).
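The point about having to be explicit about what is shared can be seen in a small experiment. The sketch below is illustrative only and not from the thread; `counter`, `bump`, `run_in_thread` and `run_in_fork` are hypothetical names. It is in Python, the supervisor language mentioned above, and runs on POSIX systems only.

```python
import os
import threading

# Module-level state: shared between threads, copied into a forked child.
counter = {"value": 0}


def bump():
    counter["value"] += 1


def run_in_thread():
    """Run bump() in a thread; the change is visible to the caller."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


def run_in_fork():
    """Run bump() in a forked child and return (parent_view, child_view)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: bump() changes only the child's private copy of counter.
        try:
            os.close(read_fd)
            bump()
            # The value must be sent back explicitly, here over a pipe.
            os.write(write_fd, str(counter["value"]).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        child_view = int(inp.read())
    os.waitpid(pid, 0)
    return counter["value"], child_view
```

Starting from `counter["value"] == 0`, `run_in_thread()` returns `1`, and a following `run_in_fork()` returns `(1, 2)`: the child saw its own increment, while the parent's copy is unchanged. One known source of the inconsistent state mentioned above is that POSIX fork() duplicates only the calling thread, so a lock held by any other thread at the moment of the fork stays locked in the child.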
Copyright © 1999-2021 by the D Language Foundation