Thread overview
is core Mutex lock "fast"?
Jan 26, 2021
ludo
Jan 26, 2021
IGotD-
Jan 26, 2021
ludo
Jan 26, 2021
IGotD-
January 26, 2021
Hi guys,

still working on old D1 code, to be updated to D2. At some point the previous dev wrote a FastLock class. The top comment is from the dev himself, not me. My question is after the code.

---

class FastLock
{
	protected Mutex mutex;
	protected int lockCount;
	protected Thread owner;

	///
	this()
	{	mutex = new Mutex();
	}

	/**
	 * This works the same as Tango's Mutex's lock()/unlock except provides extra performance in the special case where
	 * a thread calls lock()/unlock() multiple times while it already has ownership from a previous call to lock().
	 * This is a common case in Yage.
	 *
	 * For convenience, lock() and unlock() calls may be nested.  Subsequent lock() calls will still maintain the lock,
	 * but unlocking will only occur after unlock() has been called an equal number of times.
	 *
	 * On Windows, Tango's lock() is always faster than D's synchronized statement.  */
	void lock()
	{	auto self = Thread.getThis();
		if (self !is owner)
		{	mutex.lock();
			owner = self;
		}
		lockCount++;
	}
	void unlock() /// ditto
	{	assert(Thread.getThis() is owner);
		lockCount--;
		if (!lockCount)
		{	owner = null;
			mutex.unlock();
		}
	}
}

---

Now if I look at the doc , in particular Class core.sync.mutex.Mutex, I see:
---
 lock () 	If this lock is not already held by the caller, the lock is acquired, then the internal counter is incremented by one.
--
Which looks exactly like the behavior of "fastLock". Is it so that the old Tango's mutex lock was not keeping count and would lock the same object several time? Do we agree that the FastLock class is obsolete considering current D core?

cheers
January 26, 2021
On 1/26/21 1:07 PM, ludo wrote:
> Hi guys,
> 
> still working on old D1 code, to be updated to D2. At some point the previous dev wrote a FastLock class. The top comment is from the dev himself, not me. My question is after the code.
> 

[snip]

> Is it so that the old Tango's mutex lock was not keeping count and would lock the same object several time? Do we agree that the FastLock class is obsolete considering current D core?

Just want to point out that druntime was based off of Tango's runtime. So I would expect D2's mutex to have at least as much performance as Tango's mutex.

If this is D1 code, it was before this happened (D1 phobos did not share a runtime with Tango).

-Steve
January 26, 2021
On Tuesday, 26 January 2021 at 18:07:06 UTC, ludo wrote:
> Hi guys,
>
> still working on old D1 code, to be updated to D2. At some point the previous dev wrote a FastLock class. The top comment is from the dev himself, not me. My question is after the code.
>
> ---
>
> class FastLock
> {
> 	protected Mutex mutex;
> 	protected int lockCount;
> 	protected Thread owner;
>
> 	///
> 	this()
> 	{	mutex = new Mutex();
> 	}
>
> 	/**
> 	 * This works the same as Tango's Mutex's lock()/unlock except provides extra performance in the special case where
> 	 * a thread calls lock()/unlock() multiple times while it already has ownership from a previous call to lock().
> 	 * This is a common case in Yage.
> 	 *
> 	 * For convenience, lock() and unlock() calls may be nested.  Subsequent lock() calls will still maintain the lock,
> 	 * but unlocking will only occur after unlock() has been called an equal number of times.
> 	 *
> 	 * On Windows, Tango's lock() is always faster than D's synchronized statement.  */
> 	void lock()
> 	{	auto self = Thread.getThis();
> 		if (self !is owner)
> 		{	mutex.lock();
> 			owner = self;
> 		}
> 		lockCount++;
> 	}
> 	void unlock() /// ditto
> 	{	assert(Thread.getThis() is owner);
> 		lockCount--;
> 		if (!lockCount)
> 		{	owner = null;
> 			mutex.unlock();
> 		}
> 	}
> }
>
> ---
>
> Now if I look at the doc , in particular Class core.sync.mutex.Mutex, I see:
> ---
>  lock () 	If this lock is not already held by the caller, the lock is acquired, then the internal counter is incremented by one.
> --
> Which looks exactly like the behavior of "fastLock". Is it so that the old Tango's mutex lock was not keeping count and would lock the same object several time? Do we agree that the FastLock class is obsolete considering current D core?
>
> cheers

That code isn't thread safe at all (assuming FastLock is used from several threads). lockCount isn't atomic which means the code will not work with several threads. Also the assignment of the variable owner isn't thread safe. As soon you start to include more that one supposedly atomic assignment in synchronization primitives, things quickly get out of hand.

Normal D Mutex uses pthread_mutex on Linux and the usual CriticalSection stuff on Windows. Neither is particularly fast. Futex on Linux isn't exactly super fast either. Synchronization primitives aren't exactly fast on any system because that's how it is. There are ways to makes things faster but adding stuff like timeouts and the complexity goes exponential. There so many pitfalls with synchronization primitives that it is hardly worth it making your own.
January 26, 2021
On 1/26/21 3:56 PM, IGotD- wrote:

> That code isn't thread safe at all (assuming FastLock is used from several threads). lockCount isn't atomic which means the code will not work with several threads.> Also the assignment of the variable owner isn't thread safe. As soon you start to include more that one supposedly atomic assignment in synchronization primitives, things quickly get out of hand.

The only item that is read without being locked is owner. If you change that to an atomic read and write, it should be fine (and is likely fine on x86* without atomics anyway).

All the other data is protected by the actual mutex, and so should be synchronous.

However, I think this is all moot, druntime is the same as Tango.

-Steve
January 26, 2021
> However, I think this is all moot, druntime is the same as Tango.

Moot you mean debatable? Or irrelevant :) Thanks to your explanations, I understand now that the dev tried to imitate a Tango feature with very old D1 code. This is 2005/2009 code

And as pointed out by IGotD-, better not to mess around with synch / reinvent the wheel. I will trash this class. Still learning a lot, thank you guys.
January 26, 2021
On Tuesday, 26 January 2021 at 21:09:34 UTC, Steven Schveighoffer wrote:
>
> The only item that is read without being locked is owner. If you change that to an atomic read and write, it should be fine (and is likely fine on x86* without atomics anyway).
>
> All the other data is protected by the actual mutex, and so should be synchronous.
>
> However, I think this is all moot, druntime is the same as Tango.
>
> -Steve

Yes, I didn't see the lock would block subsequent threads.

Both pthread_mutex_lock and EnterCriticalSection do exactly the same as FastLock and the only difference is the the check is "closer" to the running code. Performance increase should be low.

As I was wrong about the thread safety, I will not write here any further.
January 26, 2021
On 1/26/21 4:40 PM, ludo wrote:
>> However, I think this is all moot, druntime is the same as Tango.
> 
> Moot you mean debatable? Or irrelevant :)

I *think* it's irrelevant. The comment makes it sound like it's slightly different than Tango, but for sure reentrant locks are possible with D2 phobos. I am *100% sure* that druntime is from Tango, because I was there when it was a controversy and this was the solution.

> Thanks to your explanations, I understand now that the dev tried to imitate a Tango feature with very old D1 code. This is 2005/2009 code
> 
> And as pointed out by IGotD-, better not to mess around with synch / reinvent the wheel. I will trash this class. Still learning a lot, thank you guys.

Yes, I agree. I'm no expert on thread atomics, but I know enough to know I shouldn't mess with the tried-and-true primitives that are generally used.

Consider that very very smart people have tried to write "lock free" stuff and commonly fail (and some dumb people, including me). And when you fail here, it's not a failure you see immediately, because it may happen once in a blue moon.

Performance is irrelevant if it's not correct.

-Steve