[dmd-concurrency] Thread termination protocol (shutdown protocol evolved)
January 21, 2010
Here is another idea for the "shutdown protocol". I'm changing the name to better reflect what the proposal is. Also take note that I've renamed the "Shutdown" exception to "Terminated".

It includes ideas from my previous proposal as well as from how Erlang handles linked processes. Linked processes in Erlang define an error handling mechanism, much like the one I'm proposing here. I was mistaken before about how it worked and what it did. This time I've integrated the concept correctly.

Thank you for reading! This might take a while. :-)

 - - -

The thread termination protocol has two goals:

* Establish a generic way of expressing when you want a thread to terminate that can cover a majority of cases. But it's important that cases not supported by it can still be handled by user-constructed termination protocols.

* Establish a generic way to handle thrown exceptions in spawned threads.

So the thread termination protocol relies on five important points:

1. When spawning a thread, the parent thread is set as the owner of the new one.

2. The owner link with the child thread can be broken by choosing another thread as the owner. Setting the owner to the main thread means that you don't want the child to be terminated until the program itself terminates.

3. When a thread terminates, it sends a Terminated exception message to each of the threads it owns.

4. When a child thread receives a Terminated exception message, the thread can handle it and even ignore it if it wants. But in the absence of corresponding message handlers and exception handlers, the thrown exception will stop the thread.

5. When a thread terminates via an exception other than Terminated, the exception is sent back as a message to the owner thread. In the absence of corresponding message handlers and exception handlers, the thrown exception will stop the owner thread and thus again send the exception as a message to the owner's owner, until it reaches the main thread (which has no owner).

One important point: sending a Terminated exception must not prevent the thread from receiving more messages afterwards. If the child thread chooses to ignore the Terminated message, then nothing prevents it from continuing to receive messages normally. One reason for this is that it might want to postpone termination to perform a closing handshake with something else it is currently communicating with.

Also important is that you can manually send a Terminated message to a thread at any time when you want it to terminate.

And we might want to add a Tid field to the Exception class to identify the thread it originated from.
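
As a rough illustration of rules 1 to 4, here is a single-threaded toy model in D. It uses no real threads, and `Node`, `postponesTermination`, and the other names are invented for this sketch; none of them belong to the proposed API or to any library.

```d
/// Toy stand-in for a thread. All names are hypothetical.
final class Node
{
    string name;
    Node owner;                 // null only for the main thread
    Node[] owned;
    bool alive = true;
    bool postponesTermination;  // rule 4: this thread ignores Terminated for now
    int terminatedSeen;         // number of Terminated messages received

    this(string name, Node spawner = null)
    {
        this.name = name;
        setOwner(spawner);      // rule 1: the spawning thread becomes the owner
    }

    /// Rule 2: break the current owner link by choosing another owner.
    void setOwner(Node newOwner)
    {
        if (owner !is null)
        {
            Node[] kept;
            foreach (n; owner.owned)
                if (n !is this)
                    kept ~= n;
            owner.owned = kept;
        }
        owner = newOwner;
        if (newOwner !is null)
            newOwner.owned ~= this;
    }

    /// Rule 3: a terminating thread sends Terminated to each thread it owns.
    void terminate()
    {
        if (!alive)
            return;
        alive = false;
        foreach (child; owned.dup)
            child.receiveTerminated();
    }

    /// Rule 4: stop by default, unless the thread chose to ignore the message.
    void receiveTerminated()
    {
        if (!alive)
            return;
        ++terminatedSeen;
        if (!postponesTermination)
            terminate();
    }
}

unittest
{
    auto mainThread = new Node("main");
    auto reader = new Node("reader", mainThread);
    auto writer = new Node("writer", reader);
    reader.postponesTermination = true;   // still busy reading

    mainThread.terminate();
    assert(reader.alive && writer.alive); // reader ignored the message

    reader.terminate();                   // done reading
    assert(!writer.alive);                // Terminated reached the writer
}
```

In a real implementation the "ignore" decision would live in the thread's message or exception handlers rather than in a flag; the flag only keeps the model short.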

 - - -

Now, let's see how it works with various use cases. (This first case is pretty much a repetition of the one that came along with my previous shutdown protocol proposal.)

For the file copy example with an intermediate processing step, it's a simple ownership graph:

	main -owns- read thread -owns- processing thread -owns- writer thread

When main terminates, it sends Terminated to the read thread, which ignores it because it's reading from a file. When the read thread finishes reading, it terminates and sends a Terminated to the processing thread, which will receive it as its last message. When the processing thread receives Terminated, it terminates, which automatically sends a Terminated message to the writer thread. The writer thread then terminates after writing the last part. At this moment the program closes.

What happens if the writer thread throws an exception (other than Terminated)? The exception will terminate the writer thread, be sent back as a message to the processing thread, which will terminate and send the exception to the reader thread, which will terminate and send the exception to the main thread, which will terminate the program. If any of those threads in the middle of the chain is already terminated when the exception is thrown, the exception is sent directly to the owner's owner.

Of course, any thread in the graph might catch the exception, preventing it from percolating to other threads.
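
The upward path of an exception can be sketched the same way. This is a single-threaded model under stated assumptions: `Worker`, `fail`, `deliver`, and `catchesFailures` are hypothetical names, the exception is reduced to a string, and the downward Terminated messages from rule 3 are left out to keep the focus on rule 5.

```d
/// Toy stand-in for a thread. Every name here is hypothetical.
final class Worker
{
    string name;
    Worker owner;            // null for the main thread
    bool alive = true;
    bool catchesFailures;    // true if the thread handles exception messages
    string[] received;       // exception messages delivered to this thread

    this(string name, Worker owner = null)
    {
        this.name = name;
        this.owner = owner;
    }

    /// The thread dies from an uncaught exception other than Terminated.
    void fail(string what)
    {
        alive = false;
        // Owners that have already terminated are skipped, so the message
        // goes to the nearest owner that is still running.
        auto target = owner;
        while (target !is null && !target.alive)
            target = target.owner;
        if (target !is null)
            target.deliver(what);
    }

    /// An exception message arrives from a thread further down the chain.
    void deliver(string what)
    {
        received ~= what;
        if (!catchesFailures)
            fail(what);      // unhandled: this thread dies too and passes it on
    }
}

unittest
{
    auto mainThread = new Worker("main");
    auto reader = new Worker("reader", mainThread);
    auto processor = new Worker("processor", reader);
    auto writer = new Worker("writer", processor);

    writer.fail("disk full");

    // Nobody caught it, so the whole chain up to main went down.
    assert(!processor.alive && !reader.alive && !mainThread.alive);
    assert(mainThread.received == ["disk full"]);
}
```

Setting `catchesFailures` on any node in the chain stops the percolation at that node, which is the "any thread might catch the exception" case.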

So this simple case works well out of the box. That's because the graph is a simple tree. If you have a thread spawning a child thread only to then give it to another thread, then you'll probably want to decide yourself when you want to terminate it and who should handle exceptions. Here is how that should work:

1. Create your thread, setting ownership to the main thread.
2. Give the Tid to whoever you want.
3. ...
4. Send the thread a Terminated exception when you're done with it.

Here the owner thread just acts as a safeguard in case you forget to send a Terminated message manually. You can set the owner to any thread that lives longer than the spawned thread, not necessarily the main thread. When you know you want to terminate the thread, just send it a Terminated exception.

You might want to set up a special "monitoring" thread as the owner of such child threads. This thread could catch exceptions leaking from child threads and do some error handling.

 - - -

For the API, I propose this:

	spawn(function, args...)
	// creates a new thread having the spawning thread as the owner.

	spawnOwned(ownerTid, function, args...)
	// creates a new thread with a specific owner.

	tid.owner = ownerTid
	// Changes the owner of a thread.
	// Note 1: this needs to be protected against circular ownerships.

	terminate(tid);
	// Sends a Terminated exception to the thread. This only works for
	// threads listening for messages.
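
The note about circular ownership could be handled with a walk up the owner chain before accepting a new owner. The following is only a sketch: `Handle` stands in for the proposed Tid, and `wouldCycle` is a hypothetical helper, not an existing function.

```d
/// Hypothetical handle standing in for the proposed Tid.
final class Handle
{
    string name;
    private Handle owner_;   // null for the main thread

    this(string name, Handle initialOwner = null)
    {
        this.name = name;
        owner_ = initialOwner;
    }

    @property Handle owner() { return owner_; }

    /// Mirrors `tid.owner = ownerTid`, rejecting any assignment that would
    /// make a thread its own direct or indirect owner.
    @property void owner(Handle newOwner)
    {
        if (wouldCycle(this, newOwner))
            throw new Exception("circular ownership rejected for " ~ name);
        owner_ = newOwner;
    }
}

/// True if making `newOwner` the owner of `child` would close a loop,
/// i.e. if `child` already appears in the owner chain of `newOwner`.
bool wouldCycle(Handle child, Handle newOwner)
{
    for (auto h = newOwner; h !is null; h = h.owner)
        if (h is child)
            return true;
    return false;
}

unittest
{
    auto a = new Handle("a");
    auto b = new Handle("b", a);
    auto c = new Handle("c", b);

    c.owner = a;                 // fine: moves c directly under a
    assert(c.owner is a);
    assert(wouldCycle(a, b));    // a owns b, so b cannot own a
}
```

With real threads the check and the assignment would also have to be atomic with respect to other owner changes, which this model does not attempt.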

This differs from Erlang in only two notable ways:

1. You cannot have unlinked threads. This ensures that all threads receive a Terminated message eventually (if they don't terminate by themselves before that). This also makes sure that uncaught exceptions will always be propagated back to somewhere, right up to the main thread if you don't catch them.

2. Sending a Terminated exception is a standard way to tell a thread to just stop. I don't think there is such a thing in Erlang. Fortunately, you don't have to obey the Terminated message if you don't want to, but most likely you'll just want to postpone termination while you clean things up.


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



January 21, 2010
On Thu, 21 Jan 2010 13:11:17 -0500, Michel Fortin <michel.fortin at michelf.com> wrote:

> For the API, I propose this:
> 
> 	spawn(function, args...)
> 	// creates a new thread having the spawning thread as the owner.
>
> 	spawnOwned(ownerTid, function, args...)
> 	// creates a new thread with a specific owner.
>
> 	tid.owner = ownerTid
> 	// Changes the owner of a thread.
> 	// Note 1: this needs to be protected against circular ownerships.
>
> 	terminate(tid);
> 	// Sends a Terminated exception to the thread. This only works for
> 	// threads listening for messages.

Looks okay at first glance. To reduce namespace pollution: terminate(tid) -> tid.terminate. Also, making spawnOwned an overload of spawn should be considered. To clarify, exception and terminate messages are passed with the same priority as normal messages, so they only get re-thrown after prior messages are sent / received / etc. Correct?

January 21, 2010
Michel Fortin wrote:
> Here is another idea for the "shutdown protocol".

On first read, it seems to have all desired features, the right defaults, and the right amount of aggravation for the cases that do want to ignore the defaults. Thanks Michel.

Let's discuss this some more and knock it into shape. My only mild disagreement is that people can catch Terminate to continue spawning and using threads indefinitely, but I'm willing to concede that point given that the design is otherwise coherent.

But in that case I think we should make Terminate inherit Error, not Exception. That way just writing catch (Exception e) won't catch Terminate. Agree?


Andrei
January 21, 2010
On 2010-01-21 at 13:43, Andrei Alexandrescu wrote:

> Michel Fortin wrote:
>> Here is another idea for the "shutdown protocol".
> 
> On first read, it seems to have all desired features, the right defaults, and the right amount of aggravation for the cases that do want to ignore the defaults. Thanks Michel.

You're welcome.


> Let's discuss this some more and knock it into shape. My only mild disagreement is that people can catch Terminate to continue spawning and using threads indefinitely, but I'm willing to concede that point given that the design is otherwise coherent.

I knew you wouldn't like that. :-)

But I think you need that sometimes, and you shouldn't have to reinvent a separate termination system just for those cases (which includes figuring out a way to prevent automatic termination messages). The fact that the default (throwing Terminate) does stop a thread makes it unlikely that someone will


> But in that case I think we should make Terminate inherit Error, not Exception. That way just writing catch (Exception e) won't catch Terminate. Agree?

I disagree about inheriting from Error: Terminate isn't an error. You're probably right that it shouldn't be an Exception, so in that case it should inherit from Throwable, not Error.
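
To illustrate the effect of deriving from Throwable, here is a small D sketch. The `Terminated` class and `runGuarded` are hypothetical and written only for this example; the point is that a blanket catch (Exception) does not intercept a class that derives directly from Throwable.

```d
/// Hypothetical termination signal, derived directly from Throwable so
/// that it is neither an Exception nor an Error.
class Terminated : Throwable
{
    this(string msg = "thread asked to terminate")
    {
        super(msg);
    }
}

/// Runs `job` inside a blanket catch (Exception), the way careless user
/// code might, and reports what happened.
string runGuarded(void delegate() job)
{
    string outcome = "finished";
    try
    {
        try
        {
            job();
        }
        catch (Exception e)
        {
            outcome = "swallowed Exception";  // Terminated never lands here
        }
    }
    catch (Terminated t)
    {
        outcome = "terminated";               // only an explicit handler sees it
    }
    return outcome;
}

unittest
{
    assert(runGuarded(() { throw new Exception("io"); }) == "swallowed Exception");
    assert(runGuarded(() { throw new Terminated(); }) == "terminated");
}
```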


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



January 21, 2010
On 2010-01-21 at 13:36, Robert Jacques wrote:

> Looks okay at first glance. To reduce namespace pollution: terminate(tid) -> tid.terminate. Also, making spawnOwned an overload of spawn should be considered.

Overloading could work. The advantage of a different name though is that it's easier to spot while reading the code that you're giving ownership to a different thread.


> To clarify, exception and terminate messages are passed with the same priority as normal messages, so they only get re-thrown after prior messages are sent / received / etc. Correct?

That was the idea, yes.

I think the simplest case (sequential) should be the default. That said, it might be useful to add a policy to be able to receive messages by priority, and not just exception messages. But it's probably a little premature right now.
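
A minimal sketch of the sequential default, assuming a mailbox that is simply an array processed in arrival order. `Message` and `drain` are hypothetical names; the `ignoreTerminated` flag models a thread that chooses to keep receiving after the Terminated notice.

```d
/// Hypothetical mailbox entry: a normal payload or a Terminated notice.
struct Message
{
    string payload;
    bool isTerminated;
}

/// Processes `mailbox` strictly in arrival order and returns the payloads
/// that were handled. A Terminated notice gets no special priority: it
/// takes effect only once every earlier message has been handled.
string[] drain(Message[] mailbox, bool ignoreTerminated = false)
{
    string[] handled;
    foreach (msg; mailbox)
    {
        if (msg.isTerminated)
        {
            if (ignoreTerminated)
                continue;   // the thread keeps receiving messages normally
            break;          // default behaviour: the thread stops here
        }
        handled ~= msg.payload;
    }
    return handled;
}

unittest
{
    auto mailbox = [Message("a"), Message("b"), Message("", true), Message("c")];
    assert(drain(mailbox) == ["a", "b"]);
    assert(drain(mailbox, true) == ["a", "b", "c"]);
}
```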

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



January 21, 2010
Michel Fortin wrote:
> I disagree about inheriting from Error: Terminate isn't an error. You're probably right that it shouldn't be an Exception, so in that case it should inherit from Throwable, not Error.

The hierarchy as described in TDPL has Error at the root, with Exception inheriting from Error. I think it's ok to rename Error into the more descriptive Throwable. Sean?

Andrei
January 21, 2010
On Jan 21, 2010, at 10:59 AM, Steve Schveighoffer wrote:

> On 2010-01-21 at 13:43, Andrei Alexandrescu wrote:
> 
>> But in that case I think we should make Terminate inherit Error, not Exception. That way just writing catch (Exception e) won't catch Terminate. Agree?
> 
> I disagree about inheriting from Error: Terminate isn't an error. You're probably right that it shouldn't be an Exception, so in that case it should inherit from Throwable, not Error.

It's been suggested that Throwable be removed and Exception derive from Error.
January 21, 2010
Sean Kelly wrote:
> On Jan 21, 2010, at 10:59 AM, Steve Schveighoffer wrote:
> 
>> On 2010-01-21 at 13:43, Andrei Alexandrescu wrote:
>>
>>> But in that case I think we should make Terminate inherit Error, not Exception. That way just writing catch (Exception e) won't catch Terminate. Agree?
>> I disagree about inheriting from Error: Terminate isn't an error. You're probably right that it shouldn't be an Exception, so in that case it should inherit from Throwable, not Error.
> 
> It's been suggested that Throwable be removed and Exception derive from Error.

I suggested that. In the wake of this discussion, I think removing Error and having Exception inherit Throwable is better naming. I'm still mildly opposed to having all three Throwable, Error, and Exception around, unless there's evidence we do need three types.

Andrei
January 22, 2010
On 2010-01-21 at 15:11, Sean Kelly wrote:

> On Jan 21, 2010, at 10:59 AM, Steve Schveighoffer wrote:

This is misquoted, my name should be up here. (Also the subject changed, I wonder what happened.)

>> On 2010-01-21 at 13:43, Andrei Alexandrescu wrote:
>> 
>>> But in that case I think we should make Terminate inherit Error, not Exception. That way just writing catch (Exception e) won't catch Terminate. Agree?
>> 
>> I disagree about inheriting from Error: Terminate isn't an error. You're probably right that it shouldn't be an Exception, so in that case it should inherit from Throwable, not Error.
> 
> It's been suggested that Throwable be removed and Exception derive from Error.

But then the same problem arises: should someone catching errors also catch Terminate? Terminate isn't an error. That doesn't prevent Exception from deriving from Error though.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



February 09, 2010
On Jan 21, 2010, at 11:27 AM, Andrei Alexandrescu wrote:

> Michel Fortin wrote:
>> I disagree about inheriting from Error: Terminate isn't an error. You're probably right that it shouldn't be an Exception, so in that case it should inherit from Throwable, not Error.
> 
> The hierarchy as described in TDPL has Error at the root, with Exception inheriting from Error. I think it's ok to rename Error into the more descriptive Throwable. Sean?

Oops... I just noticed that this was on my "to reply" list.  The only issue with this is that specific Error objects would all be siblings of Exception--they wouldn't have a common parent to identify them as Errors.  I thought it might be handy to have a common parent in case we want to treat Errors differently at some point, though I guess we could just as easily assume that everything that isn't an Exception is an Error and exclude that way.  It just means slightly different logic:

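try { /* code that may throw */ }
catch( Exception e ) { throw e; }            // ordinary exceptions pass through unchanged
catch( Throwable e ) { /* handle error */ }  // anything that isn't an Exception is treated as an Error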