January 21, 2010
[dmd-concurrency] Shutdown protocol
----- Original Message ----
> From: Andrei Alexandrescu <andrei at erdani.com>
> 
> Steve Schveighoffer wrote:
> > ----- Original Message ----
> > 
> >> From: Andrei Alexandrescu 
> >> Steve Schveighoffer wrote:
> >>>   foreach(tid; tids) {tid.join();}
> >> Oh btw that shouldn't be coded like that, it's slow. Must be something like:
> >>
> >> joinAll(tids);
> >>
> >> so joining is initiated in parallel.
> >>
> > 
> > What does joinAll do?  I admit I am not too familiar with your proposed API.
> 
> The idea is to join on all threads in parallel. Otherwise, if you have N 
> threads, the Nth will become aware you plan to shutdown only after all 
> others have already finished.

Hm... in all other threading libraries I used, calling join did not change the state of the thread at all, it just waited for the thread to return.  So in those libraries, "joining threads in parallel" had no effect.

I thought triggering a shutdown would be done separately from join (see the call to "shutdown()" in my code sample).

In that case, you may wish to rename what you call "join" and "joinAll", since a common usage of join in other threading libraries (including pthreads) is to wait until a thread has exited, not to tell it that it should shut down.  It will cause confusion if it's named the same in D but does something different.
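
To make the distinction concrete, here is a minimal sketch using C++ `std::thread`, whose `join()` has the pthreads meaning (it only waits). This is not the proposed D API; the `Worker` class and its `requestShutdown()` method are names invented for the illustration.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical worker: asking it to stop and waiting for it are separate calls.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}

    // A std::thread must not be destroyed while joinable, so stop and wait here.
    ~Worker() {
        requestShutdown();
        join();
    }

    // Tells the thread to finish; returns immediately.
    void requestShutdown() { stop_.store(true); }

    // Only waits for the thread to exit; it sends the thread nothing.
    void join() {
        if (thread_.joinable()) thread_.join();
    }

    bool stopRequested() const { return stop_.load(); }
    int iterations() const { return iterations_.load(); }

private:
    void run() {
        // Periodically check for shutdown between short units of work.
        while (!stop_.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            iterations_.fetch_add(1);
        }
    }

    std::atomic<bool> stop_{false};
    std::atomic<int> iterations_{0};
    std::thread thread_;  // declared last so the flags exist before run() starts
};
```

Calling `join()` without `requestShutdown()` first would block indefinitely here, which is the behavior a reader coming from pthreads would expect from something named join.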

-Steve
January 21, 2010
[dmd-concurrency] Shutdown protocol
On Jan 21, 2010, at 10:28 AM, Steve Schveighoffer wrote:
> 
> Note that it's not about a few seconds: socket streams are obstinate as a mule. They'll wait for 60-90 seconds to terminate if the connection is infinitely slow.

Yeah, if you want to be sure the system terminates promptly then you can't always count on a clean teardown.  For example, here's the dtor for an IOCP connection I wrote back in the 90s:

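cp_connection::~cp_connection()
{
  close();
  // Poll until both overlapped operations (send and recv) have completed,
  // giving up after 1000 ticks of 10 ms each (roughly 10 seconds).
  for( uint ticks = 0;
       ( !HasOverlappedIoCompleted( &m_ol_send ) || !HasOverlappedIoCompleted( &m_ol_recv ) ) && ticks < 1000;
       ++ticks )
  {
      ::Sleep( 10 );
  }
}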

Pretty nasty, but I never actually had the dtor fall through the wait loop in production.
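
Stripped of the Win32 details, the same bounded-wait idea might be sketched as follows: poll a completion check until it passes or a deadline expires. `waitWithDeadline` and its parameters are hypothetical names, not part of any library discussed in this thread.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Polls `done` every `interval` until it returns true or `limit` has elapsed.
// Returns true if `done` succeeded, false if the wait fell through.
bool waitWithDeadline(const std::function<bool()>& done,
                      std::chrono::milliseconds limit,
                      std::chrono::milliseconds interval) {
    const auto deadline = std::chrono::steady_clock::now() + limit;
    while (!done()) {
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(interval);
    }
    return true;
}
```

The return value lets the caller log or count the cases where teardown gave up, which is the "fall through" case mentioned above.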
January 21, 2010
[dmd-concurrency] Shutdown protocol
On Jan 21, 2010, at 11:28 AM, Andrei Alexandrescu wrote:

> Steve Schveighoffer wrote:
>> ----- Original Message ----
>>> From: Andrei Alexandrescu <andrei at erdani.com>
>>> Steve Schveighoffer wrote:
>>>>  foreach(tid; tids) {tid.join();}
>>> Oh btw that shouldn't be coded like that, it's slow. Must be something like:
>>> 
>>> joinAll(tids);
>>> 
>>> so joining is initiated in parallel.
>>> 
>> What does joinAll do?  I admit I am not too familiar with your proposed API.
> 
> The idea is to join on all threads in parallel. Otherwise, if you have N threads, the Nth will become aware you plan to shutdown only after all others have already finished.

But joining just waits for the thread, it doesn't tell it anything.  Or am I misunderstanding what join() does in this API?
January 21, 2010
[dmd-concurrency] shutting down
Sean Kelly wrote:
> Seriously?  Is this in server or client apps?  In server apps an
> orderly shutdown is generally crucial, since data often has to be
> serialized, logging must be done, etc.

Client. I agree that server shutdown is a very different ballgame, for 
example Apache's restart sequence is very elaborate.

>> I only know what I've developed in my projects, and the most
>> effective and robust shutdown I found involved properly cleaning up
>> sockets.  Having a shutdown time of a few seconds is not terrible,
>> and allows proper cleanup.  I've never had any problems in my
>> applications of having blocking sockets hold up shutdown, because I
>> always write my threads to periodically check for shutdown.
> 
> Servers and services generally work this way.  In fact there often
> isn't a main() function at all, since it's common to build them as
> shared libraries (Windows NT Services, for example).  An event is
> triggered when a shutdown is desired, and the signal must propagate
> through the system.  In this case it might be possible to designate
> some main execution loop as the main thread, but I'm frankly
> skeptical that a clean shutdown could be handled magically via the
> method previously described.

I agree we shouldn't put too much faith in the default shutdown method. 
I just want to choose a default that's reasonable.

Let me make a counterexample: pthreads has terrible defaults, e.g. it 
tears down everything without warning when main() exits. Virtually every 
example in Butenhof's book must use calls that override that crappy 
default. Look e.g. here:

http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html

/* Wait till threads are complete before main continues. Unless we  */
/* wait we run the risk of executing an exit which will terminate   */
/* the process and all threads before the threads have completed.   */

   pthread_join( thread1, NULL);
   pthread_join( thread2, NULL);

   exit(0);

Not to mention that that Einstein calls exit(0) as the last line in main().

So I'd be happy to define a reasonable default that does allow simple, 
meaningful applications to be written, without disallowing people who 
write servers from doing what they want to do.

>>> I think it's tenuous to conceive an application that doesn't
>>> terminate when the user wants to terminate it. Even MS Money,
>>> which synchronizes online when you shut down (a weird design if
>>> you ask me) has an "Exit now" button that does what needs to be
>>> done.
> 
> Why?  Even electronics these days generally don't have an on/off
> switch that's hardwired to the power supply.  Instead they send a
> signal that's received by the system and processed in an orderly
> manner.  I still have to unplug my laptop and pull the battery to
> shut it down on occasion when it gets wedged in some particularly
> interesting manner.

I agree about the orderly part but let's not get to the point where we 
can't close an app come hell or high water.


Andrei
January 21, 2010
[dmd-concurrency] Shutdown protocol
Steve Schveighoffer wrote:
>> From: Andrei Alexandrescu <andrei at erdani.com>
>> The idea is to join on all threads in parallel. Otherwise, if you
>> have N threads, the Nth will become aware you plan to shutdown only
>> after all others have already finished.
> 
> Hm... in all other threading libraries I used, calling join did not
> change the state of the thread at all, it just waited for the thread
> to return.  So in those libraries, "joining threads in parallel" had
> no effect.

What I mean is the following. Before joining the threads and terminating 
the app, you have a means to tell them it's about time to finish. So the 
sequence looks like this:

broadcastShutdown();
joinAll();

Now say you have 24 threads with a latency of 100ms each. If you join 
them serially, it takes 2.4 seconds. If you join them simultaneously, it 
ideally takes 100ms.
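
One way the broadcast-then-join sequence could look, as a sketch in C++ rather than the proposed D API: the broadcast is a condition variable's `notify_all()`, and the join is an ordinary loop. `ShutdownSignal` and `runAndShutDown` are made-up names, and the counter merely stands in for real cleanup work.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shutdown signal shared by all workers.
class ShutdownSignal {
public:
    // Wakes every waiting worker at once.
    void broadcast() {
        {
            std::lock_guard<std::mutex> lock(m_);
            requested_ = true;
        }
        cv_.notify_all();
    }

    // Blocks the caller until broadcast() has been called.
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return requested_; });
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool requested_ = false;
};

// Starts n workers that idle until shutdown, then broadcasts and joins.
// Returns how many workers ran their cleanup step.
int runAndShutDown(int n) {
    ShutdownSignal signal;
    std::mutex countMutex;
    int cleanedUp = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&] {
            signal.wait();
            std::lock_guard<std::mutex> lock(countMutex);
            ++cleanedUp;  // stands in for closing files, flushing logs, etc.
        });
    }
    signal.broadcast();                // every worker learns of shutdown now
    for (auto& t : workers) t.join();  // then wait for each to finish
    return cleanedUp;
}
```

All workers are told to stop before the first join begins, so none of them learns about the shutdown late.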

> I thought triggering a shutdown would be done separately from join
> (see the call to "shutdown()" in my code sample).

Correct.


Andrei
January 21, 2010
[dmd-concurrency] Shutdown protocol
Sean Kelly wrote:
> On Jan 21, 2010, at 11:28 AM, Andrei Alexandrescu wrote:
> 
>> Steve Schveighoffer wrote:
>>> ----- Original Message ----
>>>> From: Andrei Alexandrescu <andrei at erdani.com>
>>>> Steve Schveighoffer wrote:
>>>>>  foreach(tid; tids) {tid.join();}
>>>> Oh btw that shouldn't be coded like that, it's slow. Must be something like:
>>>>
>>>> joinAll(tids);
>>>>
>>>> so joining is initiated in parallel.
>>>>
>>> What does joinAll do?  I admit I am not too familiar with your proposed API.
>> The idea is to join on all threads in parallel. Otherwise, if you have N threads, the Nth will become aware you plan to shutdown only after all others have already finished.
> 
> But joining just waits for the thread, it doesn't tell it anything.  Or am I misunderstanding what join() does in this API?

join() on one thread waits for the thread. join() on 100 threads waits 
for 100 threads, not for one thread 100 times.

Andrei
January 21, 2010
[dmd-concurrency] Shutdown protocol
----- Original Message ----
> From: Andrei Alexandrescu <andrei at erdani.com>
> 
> Steve Schveighoffer wrote:
> >> From: Andrei Alexandrescu 
> >> The idea is to join on all threads in parallel. Otherwise, if you
> >> have N threads, the Nth will become aware you plan to shutdown only
> >> after all others have already finished.
> > 
> > Hm... in all other threading libraries I used, calling join did not
> > change the state of the thread at all, it just waited for the thread
> > to return.  So in those libraries, "joining threads in parallel" had
> > no effect.
> 
> What I mean is the following. Before joining the threads and terminating the app, you have 
> a means to tell them it's about time to finish. So the sequence looks like this:
> 
> broadcastShutdown();
> joinAll();
> 
> Now say you have 24 threads with a latency of 100ms each. If you join them 
> serially, it takes 2.4 seconds. If you join them simultaneously, it ideally 
> takes 100ms.

If a thread has exited, it takes probably a few hundred cycles to return, probably not 100ms.  If it has not exited, no amount of parallelism is going to save you from waiting for the thread to exit.

A join sends no messages or anything, it simply waits until the thread has exited and deposited its return code, then returns the return code.  While you are joining a slow-to-exit thread, all your other threads have exited, so in essence the parallelism occurs because the broadcast of the shutdown gets all the threads ready to be joined.  I don't see any benefit to joinAll (except to avoid having to write a loop).
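
This claim can be checked with a small timing sketch, here in C++ with `std::thread`. The function name `timeSerialJoin` and the simulated cleanup delay are assumptions made for the illustration; a real thread's teardown cost will vary.

```cpp
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

// Starts `n` workers, broadcasts shutdown, then joins them one by one.
// Each worker needs `cleanup` to exit once it sees the flag.
// Returns the wall-clock time from the broadcast to the last join.
std::chrono::milliseconds timeSerialJoin(int n, std::chrono::milliseconds cleanup) {
    std::atomic<bool> stop{false};
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&stop, cleanup] {
            while (!stop.load()) std::this_thread::yield();
            std::this_thread::sleep_for(cleanup);  // simulated teardown latency
        });
    }
    const auto start = std::chrono::steady_clock::now();
    stop.store(true);                  // the broadcast
    for (auto& t : workers) t.join();  // plain serial join
    return std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
}
```

With 24 threads and a 100 ms cleanup, the result should land near 100 ms rather than 2.4 seconds, because the cleanups overlap while the first join is still waiting. The serial figure would only apply if each thread were told to stop at the moment it was joined.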

-Steve
January 21, 2010
[dmd-concurrency] Shutdown protocol
On 2010-01-21 at 14:59, Steve Schveighoffer wrote:

>>> What does joinAll do?  I admit I am not too familiar with your proposed API.
>> 
>> The idea is to join on all threads in parallel. Otherwise, if you have N 
>> threads, the Nth will become aware you plan to shutdown only after all 
>> others have already finished.
> 
> Hm... in all other threading libraries I used, calling join did not change the state of the thread at all, it just waited for the thread to return.  So in those libraries, "joining threads in parallel" had no effect.

If the thread calling join has a higher priority, it might bump the priority of the joined threads to avoid a priority inversion. See: <http://en.wikipedia.org/wiki/Priority_inversion>.

If all threads have the same priority, the effect joinAll might have is to avoid a few context switches that would otherwise be required to advance the loop. It probably won't be noticeable unless you're very tight on processing time.

But as a general rule you can say that joinAll is better since it gives more information to the scheduler about your thread dependencies.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
January 21, 2010
[dmd-concurrency] Shutdown protocol
On Jan 21, 2010, at 12:22 PM, Andrei Alexandrescu wrote:

> Sean Kelly wrote:
>> On Jan 21, 2010, at 11:28 AM, Andrei Alexandrescu wrote:
>>> Steve Schveighoffer wrote:
>>>> ----- Original Message ----
>>>>> From: Andrei Alexandrescu <andrei at erdani.com>
>>>>> Steve Schveighoffer wrote:
>>>>>> foreach(tid; tids) {tid.join();}
>>>>> Oh btw that shouldn't be coded like that, it's slow. Must be something like:
>>>>> 
>>>>> joinAll(tids);
>>>>> 
>>>>> so joining is initiated in parallel.
>>>>> 
>>>> What does joinAll do?  I admit I am not too familiar with your proposed API.
>>> The idea is to join on all threads in parallel. Otherwise, if you have N threads, the Nth will become aware you plan to shutdown only after all others have already finished.
>> But joining just waits for the thread, it doesn't tell it anything.  Or am I misunderstanding what join() does in this API?
> 
> join() on one thread waits for the thread. join() on 100 threads waits for 100 threads, not for one thread 100 times.

There's no API call that lets you join more than one thread simultaneously, is there?  So isn't it really the same thing?
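
With C++ `std::thread`, for instance, there is no call that joins several threads at once, so a `joinAll` has to be a loop. The helper below is a hypothetical sketch of that reading, not the API proposed in this thread.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: waits for every thread in `threads` to exit.
// Each join() blocks only for whatever time that thread still needs,
// so the total wait is bounded by the slowest thread, not the sum.
void joinAll(std::vector<std::thread>& threads) {
    for (auto& t : threads) {
        if (t.joinable()) t.join();
        assert(!t.joinable());  // a joined thread no longer owns a thread of execution
    }
}
```

Written this way, the helper saves the caller a loop but does not change how long the wait takes.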
January 21, 2010
[dmd-concurrency] Shutdown protocol
Steve Schveighoffer wrote:
> If a thread has exited, it takes probably a few hundred cycles to return, probably not 100ms.

The problem is if it hasn't yet exited (e.g. it's closing files etc.)

> If it has not exited, no amount of parallelism is going to save you from waiting for the thread to exit.

That doesn't dilute my point. Waiting for 100 threads must not be 
waiting for each thread in turn. I can't believe I need to argue this point.

> A join sends no messages or anything, it simply waits until the thread has exited and deposited its return code, then returns the return code.  While you are joining a slow-to-exit thread, all your other threads have exited, so in essence the parallelism occurs because the broadcast of the shutdown gets all the threads ready to be joined.  I don't see any benefit to joinAll (except to avoid having to write a loop).

Right. join() does come after a broadcast of the intent to terminate.


Andrei