Passing data and ownership to new thread

Sep 26, 2017

James Brister

Sep 26, 2017

Jonathan M Davis

Sep 26, 2017

jmh530

Sep 26, 2017

Moritz Maxeiner

Sep 26, 2017

Moritz Maxeiner

I'm pretty new to D, but from what I've seen there are two modes of using data across threads: (a) immutable message passing, where the new thread copies the data if it needs to modify it, and (b) shared, which assumes the data will be modified by both threads, with the limits that imposes. But why is there nothing for passing an object of some sort to another thread, with the ownership moving to the new thread? I suppose this would be hard to enforce at the language level, but wouldn't you want this when trying to pass large-ish data structures from one thread to another? (I'm thinking of a network server, such as a DHCP server, that has a thread for handling the network interface: it reads the incoming requests and then passes them off to another thread to handle.)

September 26, 2017

Re: Passing data and ownership to new thread

Posted by Jonathan M Davis
in reply to James Brister


On Tuesday, September 26, 2017 09:10:41 James Brister via Digitalmars-d wrote:
> I'm pretty new to D, but from what I've seen there are two modes of using data across threads: (a) immutable message passing, where the new thread copies the data if it needs to modify it, and (b) shared, which assumes the data will be modified by both threads, with the limits that imposes. But why is there nothing for passing an object of some sort to another thread, with the ownership moving to the new thread? I suppose this would be hard to enforce at the language level, but wouldn't you want this when trying to pass large-ish data structures from one thread to another? (I'm thinking of a network server, such as a DHCP server, that has a thread for handling the network interface: it reads the incoming requests and then passes them off to another thread to handle.)

The problem is that the type system has no concept of thread ownership or memory ownership in general (beyond knowing whether something is typed as thread-local or shared, and even that doesn't say what the data was originally, just what it's currently treated as), and the compiler has no way of determining that there are no other references to the data that you're passing between threads (at least not beyond very simple cases). The programmer can cast an object to shared or immutable on one thread, pass it, and then cast it to mutable on the other and essentially pass ownership of the object in the process, but it's up to the programmer to verify that that object is no longer referenced by anything on the thread that it came from, and it's up to the programmer to make sure that casting doesn't violate the type system. As it is, if the cast is mutable to immutable to mutable, you're technically violating the type system but in a way that will always work - so casting to and from shared is definitely better, but std.concurrency does have some bugs with regard to shared where it won't always allow a type to be passed when it should. Dealing with types that you can copy rather than really needing to pass ownership is always cleaner but not necessarily efficient.
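As a rough illustration of the cast-to-shared handoff described above, here is a minimal sketch using std.concurrency. It is not code from this thread: the names Request, worker and handOff are made up for the example, and it assumes that the sending thread really does drop its only reference after the send, which nothing in the language checks.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;

// Hypothetical payload type, for illustration only.
struct Request
{
    int id;
    ubyte[] payload;
}

void worker()
{
    // The pointer arrives typed as shared. Casting shared away is only
    // sound because the sender promises it kept no other references.
    auto incoming = receiveOnly!(shared(Request)*)();
    auto req = cast(Request*) incoming;

    req.payload[0] = 42; // mutate freely: this thread is now the only user
    send(ownerTid, req.id + req.payload[0]);
}

int handOff()
{
    auto req = new Request(7, new ubyte[](16));
    auto tid = spawn(&worker);

    // Cast to shared so that std.concurrency accepts the pointer.
    send(tid, cast(shared(Request)*) req);
    req = null; // "ownership" moves by convention: forget the reference

    return receiveOnly!int();
}

void main()
{
    assert(handOff() == 49); // 7 (id) + 42 (written by the worker)
}
```

The compiler accepts this equally well if the sender keeps using req after the send, so the correctness of the handoff rests entirely on the programmer.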

In principle, what we'd really like is the ability to safely take a mutable object that has no other references to it, pass it to another thread, and have all of that work with no casting, but D's type system simply does not have the level of information in it that it would need to make that work. It's my understanding that Rust tries to encode ownership like that into its type system but that that makes its type system considerably more complicated. D doesn't make the attempt. It just leaves it up to the programmer to get it right - which isn't ideal, but there's only so much that we can do without making the type system too restrictive or too unwieldy.

Perhaps someone will come up with a solution that will work under at least some set of circumstances without overcomplicating things, but thus far, no one has. So, we're essentially forced to have the programmer either send data that has no references, or send data that can be left as immutable or copied from immutable, or require the programmer to carefully use casts. It's not ideal, but for the most part, it works if you're careful.
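For comparison, the cast-free option of sending immutable data and copying it on the receiving side might look like the sketch below. Again, this is an illustration rather than code from the thread; worker and roundTrip are hypothetical names, and the copy made by .dup is exactly the cost that a real ownership transfer would avoid.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;

void worker()
{
    // Immutable data can be passed between threads without any casts.
    auto packet = receiveOnly!(immutable(ubyte)[])();

    // To modify it, the receiving thread makes its own mutable copy.
    ubyte[] mine = packet.dup;
    mine[0] = 0xFF;

    // The original is untouched: packet[0] is still 1 here.
    send(ownerTid, mine[0] + packet[0]);
}

int roundTrip()
{
    immutable(ubyte)[] packet = [1, 2, 3];
    auto tid = spawn(&worker);
    send(tid, packet);
    return receiveOnly!int();
}

void main()
{
    assert(roundTrip() == 256); // 255 from the copy + 1 from the original
}
```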

- Jonathan M Davis

On Tuesday, 26 September 2017 at 09:10:41 UTC, James Brister wrote:
> I'm pretty new to D, but from what I've seen there are two modes of using data across threads: (a) immutable message passing and the new thread copies the data if it needs to be modified, (b) shared, assuming the data will be modified by both threads, and the limits that imposes. But why nothing for passing an object of some sort to another thread with the ownership moving to the new thread.

If you're talking about Phobos (import std.{...}): AFAICT because no one has had a strong enough need to implement such a thing and propose it for Phobos inclusion.
If you're talking about the language: Because D doesn't have any builtin concept of ownership.

> I suppose this would be hard to enforce at the language level, but wouldn't you want this when trying to pass large-ish data structures from one thread to another (thinking of a network server, such as a DHCP server, that has a thread for handling the network interface and it reads the incoming requests and then passes off to another thread to handle).

In such server code you're probably better off distributing the request reading (and potentially even the client socket accepting) to multiple workers, e.g. having multiple threads (or processes for that matter, as that can minimize downtime when combined with process supervision) listening on their own socket with the same address:port (see SO_REUSEPORT).

If you really want to do it, though, the way I'd start going about it would be with a classic work queue / thread pool system. Below is pseudocode showing how to do that for a one-shot request scenario.

    [shared data]
        work_queue (protect methods with mutex or use a lock-free queue)

    main thread:
        loop:
            auto client_socket = accept(...);
            // Allocate request on the heap
            Request* request = client_socket.readRequest(...);
            // Send a pointer to the request to the work queue
            work_queue ~= tuple(client_socket, request);
            // Model "ownership" by forgetting about client_socket and request here

    worker thread:
        loop:
            ...
            auto job = work_queue.pop();
            scope (exit) { close(job[0]); free(job[1]); }
            auto response = job[1].handle();
            // Use the socket from the job; client_socket only exists in the main thread
            job[0].writeResponse(response);

On Tuesday, 26 September 2017 at 09:10:41 UTC, James Brister wrote:
> I'm pretty new to D, but from what I've seen there are two modes of using data across threads: (a) immutable message passing and the new thread copies the data if it needs to be modified, (b) shared, assuming the data will be modified by both threads, and the limits that imposes. But why nothing for passing an object of some sort to another thread with the ownership moving to the new thread.

If you're talking about the language: Because D doesn't have any builtin concept of ownership.
If you're talking about Phobos (import std.{...}): Because a general solution is not a trivial problem (see Jonathan's answer for more detail).

> I suppose this would be hard to enforce at the language level, but wouldn't you want this when trying to pass large-ish data structures from one thread to another (thinking of a network server, such as a DHCP server, that has a thread for handling the network interface and it reads the incoming requests and then passes off to another thread to handle).

In such server code you're probably better off distributing the request reading (and potentially even the client socket accepting) to multiple workers, e.g. having multiple threads (or processes for that matter, as that can minimize downtime when combined with process supervision) listening on their own socket with the same address:port (see SO_REUSEPORT).

If you really want to do it, though, the way I'd start going about it would be with a classic work queue / thread pool system. Below is pseudocode showing how to do that for a one-shot request scenario.

    [shared data]
        work_queue (synchronize methods e.g. with mutex or use a lock-free queue)

    main thread:
        loop:
            auto client_socket = accept(...);
            // Allocate request on the heap
            Request* request = client_socket.readRequest(...);
            // Send a pointer to the request to the work queue
            work_queue ~= tuple(client_socket, request);
            // Poor man's ownership by forgetting about client_socket and request here

    worker thread:
        loop:
            ...
            auto job = work_queue.pop();
            scope (exit) { close(job[0]); free(job[1]); }
            auto response = job[1].handle();
            job[0].writeResponse(response);
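The pseudocode above could be fleshed out roughly as follows. This is only a sketch under several assumptions: the sockets are left out, Request is a stand-in struct, WorkQueue and runDemo are hypothetical names, the queue is guarded by a mutex and a condition variable from core.sync rather than being lock-free, and the requests are GC-allocated, so there is no explicit free.

```d
import core.atomic : atomicLoad, atomicOp;
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Stand-in for a parsed request (hypothetical).
struct Request { int id; }

// Mutex-protected queue of request pointers.
final class WorkQueue
{
    private Request*[] items;
    private bool closed;
    private Mutex mutex;
    private Condition notEmpty;

    this()
    {
        mutex = new Mutex;
        notEmpty = new Condition(mutex);
    }

    void push(Request* r)
    {
        synchronized (mutex) { items ~= r; notEmpty.notify(); }
    }

    // Tell the workers that no more requests will arrive.
    void close()
    {
        synchronized (mutex) { closed = true; notEmpty.notifyAll(); }
    }

    // Blocks until a job is available; returns null once closed and drained.
    Request* pop()
    {
        synchronized (mutex)
        {
            while (items.length == 0 && !closed)
                notEmpty.wait();
            if (items.length == 0)
                return null;
            auto r = items[0];
            items = items[1 .. $];
            return r;
        }
    }
}

// Pushes requests 1 .. requestCount and returns the sum of the handled ids.
int runDemo(int requestCount, int workerCount)
{
    auto queue = new WorkQueue;
    shared int handled;

    Thread[] workers;
    foreach (w; 0 .. workerCount)
    {
        auto t = new Thread({
            // Each popped job is used by exactly one worker.
            for (auto job = queue.pop(); job !is null; job = queue.pop())
                atomicOp!"+="(handled, job.id); // "handle" the request
        });
        t.start();
        workers ~= t;
    }

    foreach (i; 1 .. requestCount + 1)
    {
        auto request = new Request(i);
        queue.push(request);
        request = null; // poor man's ownership: forget the pointer
    }

    queue.close();
    foreach (t; workers)
        t.join();
    return atomicLoad(handled);
}

void main()
{
    assert(runDemo(10, 3) == 55); // 1 + 2 + ... + 10
}
```

As in the pseudocode, nothing stops the main thread from holding on to a request after pushing it; the handoff is purely a convention.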

On Tuesday, 26 September 2017 at 09:36:39 UTC, Jonathan M Davis wrote:
> [snip] It's my understanding that Rust tries to encode ownership like that into its type system but that that makes its type system considerably more complicated. D doesn't make the attempt.

Reference capabilities in Pony are also an interesting (albeit complicated) approach.

https://tutorial.ponylang.org/capabilities/reference-capabilities.html
