November 15, 2012
On Monday, 12 November 2012 at 02:31:05 UTC, Walter Bright wrote:
>
> To make a shared type work in an algorithm, you have to:
>
> 1. ensure single threaded access by acquiring a mutex
> 2. cast away shared
> 3. operate on the data
> 4. cast back to shared
> 5. release the mutex

This is a fairly reasonable use of shared, but it is bypassing the type system. Once shared is cast away, the data is free to be mixed with thread-local variables. Pieces can be assigned to non-shared globals, impure functions can stash references, weakly pure functions can mix their arguments together, etc. If locking converts shared(T) to bikeshed(T), I bet some of SafeD's logic for preventing escaping references could be used to improve things.

It's also interesting to note that casting away shared after taking a lock implicitly means that everything was transitively owned by that lock. I wonder how well a library could promote/enforce such a thing?
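The five steps quoted above might look like this in D (a minimal sketch; `SharedData`, `dataLock`, and `update` are illustrative names, not anything defined in the thread):

```d
import core.sync.mutex : Mutex;

struct SharedData { int counter; }

shared SharedData data;
__gshared Mutex dataLock;

shared static this() { dataLock = new Mutex; }

void update()
{
    dataLock.lock();                  // 1. ensure single-threaded access
    scope (exit) dataLock.unlock();   // 5. release the mutex on scope exit
    auto p = cast(SharedData*) &data; // 2. cast away shared
    p.counter += 1;                   // 3. operate on the data as thread-local
}                                     // 4. nothing escapes, so the data is
                                      //    effectively "cast back" on return
```

Note that step 4 relies entirely on discipline: nothing in the type system stops `p` from escaping the locked region, which is exactly the hole described above.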

November 15, 2012
On 11/14/12 4:50 PM, Sean Kelly wrote:
> On Nov 14, 2012, at 2:25 PM, Andrei
> Alexandrescu<SeeWebsiteForEmail@erdani.org>  wrote:
>
>> On 11/14/12 1:09 PM, Walter Bright wrote:
>>> Yes. And also, I agree that having something typed as "shared"
>>> must prevent the compiler from reordering them. But that's
>>> separate from inserting memory barriers.
>>
>> It's the same issue at hand: ordering properly and inserting
>> barriers are two ways to ensure one single goal, sequential
>> consistency. Same thing.
>
> Sequential consistency is great and all, but it doesn't render
> concurrent code correct.  At worst, it provides a false sense of
> security that somehow it does accomplish this, and people end up
> actually using it as such.

Yah, but the baseline here is acquire-release which has subtle differences that are all the more maddening.

Andrei
November 15, 2012
On 11/11/12 6:30 PM, Walter Bright wrote:
> 1. ensure single threaded access by acquiring a mutex
> 2. cast away shared
> 3. operate on the data
> 4. cast back to shared
> 5. release the mutex

This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL).

I can't believe I need to restart this on a cold cache.


Andrei


November 15, 2012
On Wednesday, November 14, 2012 18:30:56 Andrei Alexandrescu wrote:
> On 11/11/12 6:30 PM, Walter Bright wrote:
> > 1. ensure single threaded access by acquiring a mutex
> > 2. cast away shared
> > 3. operate on the data
> > 4. cast back to shared
> > 5. release the mutex
> 
> This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL).
> 
> I can't believe I need to restart this on a cold cache.

Well, this is clearly how things work now, and if you want to use shared with much of anything, it's how things generally have to work, because almost nothing takes shared. Templated code will work at least some of the time (though it's often untested with shared and will probably get tripped up by Unqual in quite a few cases), but aside from templates or casting, there's no way to get shared variables to share the same functions as non-shared ones, leading to code duplication.
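The duplication problem can be seen with an ordinary function: nothing about it is unsafe while the lock is held, but its signature still forces a cast (a contrived sketch; `sum` and `lockedSum` are made-up names):

```d
import core.sync.mutex : Mutex;

int sum(const(int)[] arr)   // ordinary code: knows nothing about shared
{
    int total = 0;
    foreach (x; arr)
        total += x;
    return total;
}

int lockedSum(ref shared(int[]) nums, Mutex m)
{
    m.lock();
    scope (exit) m.unlock();
    // The only way into sum() is to cast shared away by hand:
    return sum(cast(const(int)[]) nums);
}
```

The alternative is to duplicate `sum` with a `shared` parameter, or templatize it, which is the choice described above.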

From what I recall of what TDPL says, this doesn't really contradict it. It's just that TDPL doesn't really say much about the fact that almost nothing will work with shared, which means that casting is necessary.

I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required. And I think that introducing the shared equivalent of const would be a huge mistake, because then most code would end up being written using that attribute, meaning that all code essentially has to be treated as shared from the standpoint of compiler optimizations. It would almost be the same as making everything shared by default again. So, as far as I can see, casting is what we're forced to do.

- Jonathan M Davis
November 15, 2012
On 11/15/12, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> From what I recall of what TDPL says

It says (on p.413) reading and writing shared values are guaranteed to be atomic, for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported.

It also talks about automatically inserting memory barriers on page 414.
November 15, 2012
On Thursday, November 15, 2012 04:12:47 Andrej Mitrovic wrote:
> On 11/15/12, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> > From what I recall of what TDPL says
> 
> It says (on p.413) reading and writing shared values are guaranteed to be atomic, for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported.
> 
> It also talks about automatically inserting memory barriers on page 414.

Good to know, but none of that really has anything to do with the casting, which is what I was responding to. And looking at that list, it sounds reasonable that all of that would be guaranteed to be atomic, but I think that the fundamental problem that's affecting usability is all of the casting that's typically required. And I don't see any way around that other than writing code that doesn't need to pass shared objects around or using templates very heavily.

- Jonathan M Davis
November 15, 2012
On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
> I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required. And I think that introducing the shared equivalent of const would be a huge mistake, because then most code would end up being written using that attribute, meaning that all code essentially has to be treated as shared from the standpoint of compiler optimizations. It would almost be the same as making everything shared by default again. So, as far as I can see, casting is what we're forced to do.

Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff.
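A sketch of what "designed to be shared" can mean: the mutex and the casts are encapsulated inside the type, so callers only ever see shared methods (illustrative code; the internal casts are the same pattern discussed earlier in the thread):

```d
import core.sync.mutex : Mutex;

final class Counter
{
    private Mutex m;
    private int count;

    this() shared
    {
        m = cast(shared) new Mutex;
    }

    void increment() shared
    {
        // Strip shared internally; the lock makes this sound in practice,
        // though the compiler cannot verify it.
        auto self = cast(Counter) this;
        self.m.lock();
        scope (exit) self.m.unlock();
        ++self.count;
    }
}
```

Callers just write `sharedCounter.increment();` and never see a cast; the casting only reappears once the data has to be handed to code outside the type.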

- Jonathan M Davis
November 15, 2012
On 11/14/12 7:24 PM, Jonathan M Davis wrote:
> On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
>> I have no idea what we want to do about this situation though. Regardless of
>> what we do with memory barriers and the like, it has no impact on whether
>> casts are required. And I think that introducing the shared equivalent of
>> const would be a huge mistake, because then most code would end up being
>> written using that attribute, meaning that all code essentially has to be
>> treated as shared from the standpoint of compiler optimizations. It would
>> almost be the same as making everything shared by default again. So, as far
>> as I can see, casting is what we're forced to do.
>
> Actually, I think that what it comes down to is that shared works nicely when
> you have a type which is designed to be shared, and it encapsulates everything
> that it needs. Where it starts requiring casting is when you need to pass it
> to other stuff.
>
> - Jonathan M Davis

TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted.
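A sketch of what TDPL 13.14 describes (treat this as the book's intent; compiler enforcement of synchronized classes has historically been incomplete):

```d
// Inside the methods of a synchronized class, the hidden per-object
// mutex is held, and top-level shared on the fields is lifted.
synchronized class Account
{
    private double balance;

    void deposit(double amount)
    {
        balance += amount;   // no explicit cast needed here
    }
}

void client(shared Account a)
{
    a.deposit(42.0);   // lock acquired on entry, released on exit
}
```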

Andrei
November 15, 2012
On 2012-11-15 02:51:13 +0000, "Jonathan M Davis" <jmdavisProg@gmx.com> said:

> I have no idea what we want to do about this situation though. Regardless of
> what we do with memory barriers and the like, it has no impact on whether
> casts are required.

One thing I'm confused about right now is how people are using shared. If you're using shared with atomic operations, then you need barriers when accessing or mutating the variable. If you're using shared with mutexes, spin-locks, etc., you don't care about the barriers. But you can't use it both ways at the same time. So which of these does shared stand for?

In both of these cases, there's an implicit policy for accessing or mutating the variable. I think the language needs some way to express that policy. I suggested some time ago a way to protect variables with mutexes so that the compiler can actually help you use those mutexes correctly[1]. The idea was to associate a mutex with the variable declaration. This could be extended to support an atomic access policy.

Let me restate and extend that idea to atomic operations. Declare a variable using the synchronized storage class and it automatically gets a mutex:

	synchronized int i; // declaration

	i++; // error, variable is shared

	synchronized (i)
		i++; // fine, variable is thread-local inside synchronized block

Synchronized here is some kind of storage class causing two things: a mutex is attached to the variable declaration, and the type of the variable is made shared. The variable being shared, you can't access it directly. But a synchronized statement will make the variable non-shared within its bounds.

Now, if you want a custom mutex class, write it like this:

	synchronized(SpinLock) int i;

	synchronized(i)
	{
		// implicit: i.mutexof.lock();
		// implicit: scope (exit) i.mutexof.unlock();
		i++;
	}

If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration:

	Mutex m;
	synchronized(m) int i;
	
	synchronized(i)
	{
		// implicit: m.lock();
		// implicit: scope (exit) m.unlock();
		i++;
	}

Also, if you have a read-write mutex and only need read access, you could declare that you only need read access using const:

	synchronized(RWMutex) int i;

	synchronized(const i)
	{
		// implicit: i.mutexof.constLock();
		// implicit: scope (exit) i.mutexof.constUnlock();
		i++; // error, i is const
	}

And finally, if you want to use atomic operations, declare it this way:

	synchronized(Atomic) int i;

You can't really synchronize on something protected by Atomic:

	synchronized(i) // error: cannot make a synchronized block, no lock/unlock method in Atomic
	{}

But you can call operators on it while synchronized, it works for anything implemented by Atomic:

	synchronized(i)++; // implicit: Atomic.opUnary!"++"(i);

Because the policy object is associated with the variable declaration, when locking the mutex you need direct access to the original variable, or an alias to it. Same for performing atomic operations. You can't pass a reference to some function and have that function perform the locking. If that's a problem it can be avoided by having a way to pass the mutex to the function, or by passing an alias to a template.

Okay, this syntax probably still has some problems; feel free to point them out. I don't really care about the syntax though. The important thing is that you need a way to define the policy for accessing shared data such that the compiler can actually enforce it and programmers can actually reuse it.

Because right now there is no policy. Having to cast things everywhere is equivalent to having to redefine the policy everywhere. The same goes for having to write encapsulation types that work with shared for everything you want to share: each type has to implement the policy. There's nothing worse than constantly rewriting sharing policies. Concurrency is error-prone because of all the subtleties; you don't want to encourage people to write policies of their own every time they invent a new type. You need to reuse existing ones, and the compiler can help with that.

[1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/


-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca/

November 15, 2012
On 14 November 2012 19:54, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 11/14/12 9:31 AM, David Nadlinger wrote:
>
>> On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:
>>
>>> Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
>>>
>>
>> Sorry, I didn't see this message of yours before replying (the perils of
>> threaded news readers…).
>>
>> You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available; otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD, inline asm is used, which prohibits compiler code motion).
>>
>
> Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION.
>
> THIS IS VERY IMPORTANT.


I won't outright disagree, but this seems VERY dangerous to me.

You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives and an architecture doesn't support them, or doesn't support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.

This is a very big deal. I would be scared to see the compiler generate intrinsic calls to atomic synchronisation primitives. It's almost like banning architectures from the language.

The Nintendo Wii, for instance, is not an unpopular machine; it sold only 130 million units! It does not have synchronisation instructions in the architecture (insane, I know, but there it is; I've had to spend time working around this in the past). I'm sure it's not unique in this way.

People getting fancy with lock-free/atomic operations will probably wrap it up in libraries. And they're not globally applicable; atomic memory operations don't magically solve problems, they require very specific structures and access patterns around them. I'm just not convinced they should be intrinsics issued by the language. They're just not as well standardised as 'int' or 'float'.

Side note: I still think a convenient and fairly practical solution is to make 'shared' things 'lockable', where you can lock()/unlock() them, and assignment to/from shared things is valid (no casting), but a runtime assert insists that the entity is locked whenever it is accessed. It's simplistic, but it's safe, and it works with the same primitives that already exist and are proven. Let the programmer mark the lock/unlock moments, worry about sequencing, etc., at least for the time being. Don't try to do it automatically (yet).
The broad use cases in D aren't yet known, but making 'shared' useful today would be valuable.
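Something close to this side note could be approximated in library code today: a wrapper whose accessor asserts at runtime that the lock is held (a hypothetical sketch; `Locked`, `initialize`, and `get` are invented names, and the check happens at run time only, not compile time):

```d
import core.sync.mutex : Mutex;

struct Locked(T)
{
    private Mutex m;
    private T value;
    private bool held;

    void initialize() { m = new Mutex; }

    void lock()   { m.lock(); held = true; }
    void unlock() { held = false; m.unlock(); }

    // Assignment to/from the wrapped value needs no casts,
    // but is only permitted while the lock is held.
    ref T get()
    {
        assert(held, "shared value accessed without holding its lock");
        return value;
    }
}
```

Usage would look like `c.lock(); c.get() = 5; c.unlock();` with any access outside the lock/unlock pair failing the assert in debug builds.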

>> Thus, »we«, meaning on a language level, don't need to change anything
>> about the current situation, with the possible exception of adding
>> finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the
>> duty of the compiler writers to provide the appropriate means to
>> implement druntime on their code generation infrastructure – and indeed,
>> the situation in DMD could be improved; using inline asm is hitting a
>> fly with a sledgehammer.
>>
>
> That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that set them apart from any other functions.
>
>
>  David
>>
>>
>> [1] I am not sure where the point of diminishing returns is here, although it might make sense to provide the same options as C++11. If I remember correctly, D1/Tango supported a lot more levels of synchronization.
>>
>
> We could start with sequential consistency and then explore riskier/looser policies.
>
>
> Andrei
>