October 07, 2007
On Sun, Oct 07, 2007 at 08:11:30AM +0200, downs wrote:

>Sure, on it.
>This should work .. at a glance, I wasn't able to find any cases where
>it deadlocks.
>I made an unbounded buffer, because it seemed easier ^_____^

It's half the work :-)

Congratulations, though.  It certainly isn't easy to do even unbounded
buffers in std.thread.

>  private {
>    T[] buffer;
>    bool[Thread] waiting;
>    Object waiting_sync;
>  }
>  void put(T t)
>  {
>    synchronized(this) buffer~=t;
>    while (waiting.length && buffer.length)
>      synchronized(waiting_sync)
>        foreach (thr, bogus; waiting)
>          thr.resume;
>  }
>  T get()
>  {
>    synchronized(waiting_sync) waiting[Thread.getThis]=true;
>    scope(exit) synchronized(waiting_sync) waiting.remove(Thread.getThis);
>    while (true)
>    {
>      synchronized(this)
>        if (buffer.length)
>        {
>          auto res=buffer[0];
>          buffer=buffer[1..$];
>          return res;
>        }
>      Thread.getThis.pause;
>    }
>  }

Ok everyone, let that sink in a little bit.  As far as I can tell, this is
the simplest possible synchronization using the Phobos thread primitives.
It requires two critical sections (this and waiting_sync), an associative
array of Threads, a while loop on 'put', the scope(exit) with a
synchronized blockin get().  It also isn't going to be very efficient on
any underlying implementation.  Under Linux, the implementation of pause
and resume make debugging more difficult, and prevent the user from being
able to use SIGUSR1 and SIGUSR2 for their own purposes.

This is certainly not what I would hope to ever find in a normal piece of
threading code.  Let's pretend I have condition variables (.NET style), and
see what it looks like:

  void put(T t)
  {
    synchronized (this) {
      buffer ~= t;
      conditionSignal (this);
    }
  }

  void get(T t)
  {
    synchronized (this) {
      while (buffer.length == 0)
        conditionWait (this);
      auto res=buffer[0];
      buffer = buffer[1..$];
      return res;
    }
  }

where conditionSignal and conditionWait are as defined by .NET or posix
Pthreads (with the condition variable and mutex as part of each object).
This can be implemented directly on top of both Windows and Linux, meaning
it won't hurt performance.  It's a whole lot easier to use correctly, and
doesn't obscure the code quite as much.  (Yes, the conditionWait is inside
of the synchronized, it is _required_ to be there, that's why it is so much
easier to use).  In fact, if a thread was allowed to 'pause()' itself
inside of a synchronized section and that meant it would atomically release
the critical section and pause, and then regain the critical section on
resume, the Phobos version could be made much simpler.

Other implementations using semaphores aren't much more difficult.

I still would say this is too primitive for most uses, and higher level
constructs can be made.  But, only the useful primitives need to be in the
thread library itself.

David
October 07, 2007
Stewart Gordon wrote:
> "Frits van Bommel" <fvbommel@REMwOVExCAPSs.nl> wrote in message news:fe995o$3de$1@digitalmars.com...
> <snip>
>> Also, AFAIK the compiler would be well within its rights to use the opposite order of .length and .ptr in memory; meaning that even if the C calling convention allows this the order they're passed in isn't guaranteed.
> 
> Not according to
> http://www.digitalmars.com/d/abi.html

I'm not sure if that's part of the actual spec. If conformance to that ABI is required to be an implementation of D then only DMD qualifies, and non-x86 D compilers are impossible (since the ABI is x86-specific)...

And from that page: "A D implementation that conforms to the D ABI (Application Binary Interface) will be able to [...]" which I read to imply there can also be D implementations that _don't_ conform to that ABI. (such as GDC, and probably the compilers currently in production (LLVM, dil) as well as any compiler that supports non-x86 target systems)

> But one thing that might affect whether it works is if, on a given system, the memory layout of the stack goes the other way or is even non-linear.

Certainly. But such a system wouldn't be an x86, so see above.
October 07, 2007
David Brown wrote:
> This is certainly not what I would hope to ever find in a normal piece of threading code.  Let's pretend I have condition variables (.NET style), and see what it looks like:
> 
>   void put(T t)
>   {
>     synchronized (this) {
>       buffer ~= t;
>       conditionSignal (this);
>     }
>   }
> 
>   void get(T t)
>   {
>     synchronized (this) {
>       while (buffer.length == 0)
>         conditionWait (this);
>       auto res=buffer[0];
>       buffer = buffer[1..$];
>       return res;
>     }
>   }
> 

Your implementation has, as far as I can tell, a fatal flaw. If
conditionWait is called before put, put's synchronized(this) will block
indefinitely.
Am I missing something here?
 --downs, confused
October 07, 2007
downs wrote:
> David Brown wrote:
>> This is certainly not what I would hope to ever find in a normal piece of
>> threading code.  Let's pretend I have condition variables (.NET style), and
>> see what it looks like:
>>
>>   void put(T t)
>>   {
>>     synchronized (this) {
>>       buffer ~= t;
>>       conditionSignal (this);
>>     }
>>   }
>>
>>   void get(T t)
>>   {
>>     synchronized (this) {
>>       while (buffer.length == 0)
>>         conditionWait (this);
>>       auto res=buffer[0];
>>       buffer = buffer[1..$];
>>       return res;
>>     }
>>   }
>>
> 
> Your implementation has, as far as I can tell, a fatal flaw. If
> conditionWait is called before put, put's synchronized(this) will block
> indefinitely.
> Am I missing something here?
>  --downs, confused

Yes, you seem to have missed the paragraph after the code. The conditionWait(this) atomically unlocks the synchronized(this) mutex and pauses the current thread, and then when resuming automatically relocks the mutex.
This means that when conditionWait is called before put, get() will no longer have the mutex locked when put() is called, so the synchronized(this) in put() won't block (assuming no other thread has locked that mutex in the mean time).
October 07, 2007
Frits van Bommel wrote:
> downs wrote:
>> Your implementation has, as far as I can tell, a fatal flaw. If
>> conditionWait is called before put, put's synchronized(this) will block
>> indefinitely.
>> Am I missing something here?
>>  --downs, confused
> 
> Yes, you seem to have missed the paragraph after the code. The
> conditionWait(this) atomically unlocks the synchronized(this) mutex and
> pauses the current thread, and then when resuming automatically relocks
> the mutex.
> This means that when conditionWait is called before put, get() will no
> longer have the mutex locked when put() is called, so the
> synchronized(this) in put() won't block (assuming no other thread has
> locked that mutex in the mean time).

Oh crap. Total spontaneous loss of reading comprehension. Sorry.
Thanks for pointing it out again.
 --downs
October 07, 2007
David Brown wrote:
> This is certainly not what I would hope to ever find in a normal piece of threading code.  Let's pretend I have condition variables (.NET style), and see what it looks like:
> 
>   void put(T t)
>   {
>     synchronized (this) {
>       buffer ~= t;
>       conditionSignal (this);
>     }
>   }
> 
>   void get(T t)
>   {
>     synchronized (this) {
>       while (buffer.length == 0)
>         conditionWait (this);
>       auto res=buffer[0];
>       buffer = buffer[1..$];
>       return res;
>     }
>   }
> 
> where conditionSignal and conditionWait are as defined by .NET or posix Pthreads (with the condition variable and mutex as part of each object). This can be implemented directly on top of both Windows and Linux, meaning it won't hurt performance.  It's a whole lot easier to use correctly, and doesn't obscure the code quite as much.  (Yes, the conditionWait is inside of the synchronized, it is _required_ to be there, that's why it is so much easier to use).  In fact, if a thread was allowed to 'pause()' itself inside of a synchronized section and that meant it would atomically release the critical section and pause, and then regain the critical section on resume, the Phobos version could be made much simpler.
> 
> Other implementations using semaphores aren't much more difficult.
> 
> I still would say this is too primitive for most uses, and higher level constructs can be made.  But, only the useful primitives need to be in the thread library itself.
> 
> David

Okay, let's try this again.
I'm not sure if the following code will work, seeing as I never actually
tested it. I'm fairly optimistic though ^^

module test4;

import std.thread;
class Lock
{
  private
  {
    Thread locked;
    bool[Thread] sleeping;
    bool signalling=false;
  }
  void lock(Object what)
  {
    while (locked !is Thread.getThis) synchronized(what)
      if (!locked) locked = Thread.getThis;
  }
  void unlock()
  in
  {
    assert (Thread.getThis is locked);
  }
  body
  {
    locked = null;
  }
  void synchronize(Object what, void delegate() dg)
  {
    lock(what);
    scope(exit) unlock;
    dg();
  }
  void conditionWait(Object what)
  {
    synchronized(this) sleeping[Thread.getThis]=true;
    unlock;
    // prevent accidental resume
    while (!signalling) Thread.getThis.pause;
    synchronized(this) sleeping.remove(Thread.getThis);
    lock(what);
  }
  void conditionSignal()
  {
    synchronized(this)
    {
      signalling=true;
      scope(exit) signalling=false;
      foreach (thr, bogus; sleeping) thr.resume;
    }
  }
}

class UnBoundedBuffer(T) {
  private T[] buffer;
  Lock lock; this() { lock=new Lock; }
  void put(T t)
  {
    lock.synchronize(this, {
      buffer ~= t;
      lock.conditionSignal;
    });
  }
  T get()
  {
    T res=void;
    lock.synchronize(this, {
      while (!buffer.length)
        lock.conditionWait(this);
      res = buffer[0];
      buffer=buffer[1..$];
    });
    return res;
  }
}

void main() {
  auto bb=new UnBoundedBuffer!(int);
}
October 07, 2007
downs wrote:
> Walter Bright wrote:
>> Sean Kelly wrote:
>>> For what it's worth, Tango will never choose to run exclusively on the
>>> Phobos runtime.  One of the primary reasons many people choose Tango
>>> is for its threading package, and this is a runtime feature. Therefore, dropping this package to use the Phobos version would be
>>> foolish, and I suspect that it would leave a number of Tango users
>>> (some developing professional software) in a bit of a lurch.
>> The threading package in Phobos is not very good, as it is based on my
>> very poor understanding of the problem. Drop kicking std.thread is
>> certainly open for discussion for v2. I wouldn't shed a tear for it <g>.
> 
> Maybe I'm missing something here; maybe it's because Phobos' threading
> is all I ever used, but what's wrong with it?
> I mean, it's threads .. you can start them, halt them, run stuff in them
> ... what more is needed? It's a very basic implementation, very close to
> the OS; but thanks to synchronized, I generally found it easy to add the
> primitives I needed.
> Somebody kindly explain. Thanks.

There are at least two known deadlock issues with the Phobos thread module.  One I've suggested a fix for (it's related to how object finalization occurs) and the other I've been unable to find the cause of.  The remaining issues are more subtle and probably not a concern for the average user.  Thread.getThis is an O(N) operation when it could be O(1), etc.


Sean
October 07, 2007
On Sun, Oct 07, 2007 at 09:50:09AM -0700, Sean Kelly wrote:

>> I mean, it's threads .. you can start them, halt them, run stuff in them
>> ... what more is needed? It's a very basic implementation, very close to
>> the OS; but thanks to synchronized, I generally found it easy to add the
>> primitives I needed.
>> Somebody kindly explain. Thanks.
>
> There are at least two known deadlock issues with the Phobos thread module.  One I've suggested a fix for (it's related to how object finalization occurs) and the other I've been unable to find the cause of.  The remaining issues are more subtle and probably not a concern for the average user.  Thread.getThis is an O(N) operation when it could be O(1), etc.

I have several issues with it:

  - 'pause/resume' is not a useful thread primitive.  As we've seen by
    sample code, just doing simple things is dreadfully difficult, and very
    easy to get wrong, having deadlocks, starvation, and other issues.

  - 'pause/resume' aren't very efficient ways to implement synchronization,
    at least not on Linux.  If this is the mechanism that Windows gives for
    synchronization (which at least no longer is the case), something else
    needs to be implemented on top of it.

    On Linux, pause/resume are far from "close to the OS".

    Even on windows, it seems unlikely to me that the only threading
    primitives provided require the user to implement their own wait queues
    and require waking every waiting thread to prevent deadlocks.

As I've said before, Phobos should contains a small set of easy to use
primitives that are efficiently implemented on the underlying OS.  This is
not currently the case.

As nearly everyone who develops threading primitives discovers, essential
to making them work is the importance of something that will block while a
mutex is held, and does the right thing.  This gets rid of most of the
loops and races that occur without this.

Aside from that, it would be useful to have a set of higher-level
synchronization operations, such as bounded and unbounded buffers,
implemented on top of this that are also part of a common library.

David
October 07, 2007
David Brown wrote:
>   - 'pause/resume' is not a useful thread primitive.  As we've seen by
>     sample code, just doing simple things is dreadfully difficult, and very
>     easy to get wrong, having deadlocks, starvation, and other issues.

The main reason for the existence of pause/resume is so the garbage collector can pause all the threads, do a gc sweep, then resume them.
October 07, 2007
> I have several issues with it:
> 
>   - 'pause/resume' is not a useful thread primitive.  As we've seen by
>     sample code, just doing simple things is dreadfully difficult, and very
>     easy to get wrong, having deadlocks, starvation, and other issues.
> 
>   - 'pause/resume' aren't very efficient ways to implement synchronization,
>     at least not on Linux.  If this is the mechanism that Windows gives for
>     synchronization (which at least no longer is the case), something else
>     needs to be implemented on top of it.

pause and resume have absolute nothing to do with synchronzation on no platform

windows got

events (a bad version of condition), mutexes, critical sections, atomic operations and semaphores for such stuff

http://msdn2.microsoft.com/en-us/library/ms686360.aspx