November 14, 2012
On Nov 12, 2012, at 2:57 AM, Johannes Pfau <nospam@example.com> wrote:

> Am Sun, 11 Nov 2012 18:30:17 -0800
> schrieb Walter Bright <newshound2@digitalmars.com>:
> 
>> 
>> To make a shared type work in an algorithm, you have to:
>> 
>> 1. ensure single-threaded access by acquiring a mutex
>> 2. cast away shared
>> 3. operate on the data
>> 4. cast back to shared
>> 5. release the mutex
>> 
>> Also, all op= need to be disabled for shared types.
> 
> But there are also shared member functions and they're kind of annoying right now:
> 
> * You can't call shared methods from non-shared methods or vice versa.
>  This leads to code duplication, you basically have to implement
>  everything twice:
> 
> ----------
> struct ABC
> {
>     Mutex mutex;
>     void a()
>     {
>         aImpl();
>     }
>     shared void a()
>     {
>         synchronized (mutex)
>             aImpl();  // not allowed
>     }
>     private void aImpl()
>     {
>     }
> }
> ----------
> The only way to avoid this is casting away shared in the shared a method, but that really is annoying.

Yes.  You end up having two methods for each function: a synchronized wrapper that casts away shared, and another that does the actual work.
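
A minimal sketch of that wrapper pattern, combining it with Walter's five steps (all names here are hypothetical, and it assumes core.sync.mutex's Mutex):

----------
import core.sync.mutex : Mutex;

class Counter
{
    private Mutex mutex;
    private int count;

    this() { mutex = new Mutex; }

    // Non-shared overload: the caller already guarantees exclusivity.
    void increment() { incrementImpl(); }

    // Shared overload: cast away shared, acquire the mutex, forward to
    // the real implementation, and release the mutex on scope exit.
    void increment() shared
    {
        auto self = cast(Counter) this;    // 2. cast away shared
        self.mutex.lock();                 // 1. ensure single-threaded access
        scope (exit) self.mutex.unlock();  // 5. release the mutex
        self.incrementImpl();              // 3. operate on the data
    }

    private void incrementImpl() { ++count; }
}
----------

Even constructing the shared instance typically needs a cast (e.g. cast(shared) new Counter), which is part of what makes the pattern so tedious.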


> and then there's also the druntime issue: core.sync doesn't work with
> shared which leads to this schizophrenic situation:
> struct A
> {
>    Mutex m;
>    void a() //Doesn't compile with shared
>    {
>        m.lock();  //Compiles, but locks on a TLS mutex!
>        m.unlock();
>    }
> }

Most of the reason for this was that I didn't like the old implication of shared, namely that shared methods would at some point end up with memory barriers all over the place.  That's been dropped, but I'm still not a fan of the wrapper method for each function.  It makes for a crappy class design.
November 14, 2012
On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 11/14/12 1:20 AM, Walter Bright wrote:
>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>> If the compiler should/does not add memory barriers, then is there a
>>> reason for
>>> having it built into the language? Can a library solution be enough?
>> 
>> Memory barriers can certainly be added using library functions.
> 
> The compiler must understand the semantics of barriers, e.g. that it doesn't hoist code above an acquire barrier or below a release barrier.

That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.
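
For reference, this is roughly what the deprecated statement form looked like (D1-era syntax; it no longer compiles, and is shown only to illustrate what it promised):

----------
int flag;

void publish(int value)
{
    // Loads and stores inside a volatile block were not to be
    // reordered across the block's boundaries by the compiler.
    volatile
    {
        flag = value;
    }
}
----------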
November 14, 2012
On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
> This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).

No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.
November 14, 2012
On Nov 14, 2012, at 12:01 PM, Sean Kelly <sean@invisibleduck.org> wrote:

> On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> 
>> This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
> 
> No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.

asm blocks.  Darn auto-correct.
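
For what it's worth, a sketch of how the entry points under discussion are used, assuming the current druntime core.atomic API (atomicLoad/atomicStore parameterized by MemoryOrder):

----------
import core.atomic;

shared int data;
shared int flag;

void producer()
{
    atomicStore!(MemoryOrder.raw)(data, 42);  // plain atomic store
    atomicStore!(MemoryOrder.rel)(flag, 1);   // release: publishes data
}

void consumer()
{
    while (atomicLoad!(MemoryOrder.acq)(flag) == 0) { }  // acquire
    assert(atomicLoad!(MemoryOrder.raw)(data) == 42);    // guaranteed to see 42
}
----------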
November 14, 2012
On 14-11-2012 21:00, Sean Kelly wrote:
> On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>
>> On 11/14/12 1:20 AM, Walter Bright wrote:
>>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>>> If the compiler should/does not add memory barriers, then is there a
>>>> reason for
>>>> having it built into the language? Can a library solution be enough?
>>>
>>> Memory barriers can certainly be added using library functions.
>>
>> The compiler must understand the semantics of barriers, e.g. that it doesn't hoist code above an acquire barrier or below a release barrier.
>
> That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.
>

The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: loads and stores. Volatile statements could never be properly implemented in GDC and LDC, for example.

See also: http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP20

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
November 14, 2012
On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex@lycus.org> wrote:

> On 14-11-2012 21:00, Sean Kelly wrote:
>> On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> 
>>> On 11/14/12 1:20 AM, Walter Bright wrote:
>>>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>>>> If the compiler should/does not add memory barriers, then is there a
>>>>> reason for
>>>>> having it built into the language? Can a library solution be enough?
>>>> 
>>>> Memory barriers can certainly be added using library functions.
>>> 
>>> The compiler must understand the semantics of barriers, e.g. that it doesn't hoist code above an acquire barrier or below a release barrier.
>> 
>> That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.
>> 
> 
> The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: loads and stores. Volatile statements could never be properly implemented in GDC and LDC, for example.

Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block.  Or, for a first cut, just insert a full barrier at the beginning and end of the block.  Either way, it should be pretty simple for a compiler to handle if the compiler supports mutex use.

I do like the idea of built-in load and store intrinsics, if only because D only supports x86 assembler right now.  But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files.  Druntime actually had this for core.atomic on PPC until not too long ago.
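
A hedged sketch of that fan-out approach (the _d_* symbols are hypothetical names for functions that would live in hand-written, per-architecture assembly files):

----------
// D side: one template dispatching on operand size.
T atomicLoadImpl(T)(ref const shared T val)
    if (__traits(isIntegral, T))
{
    static if (T.sizeof == 4)
        return cast(T) _d_atomicLoad32(cast(const shared uint*) &val);
    else static if (T.sizeof == 8)
        return cast(T) _d_atomicLoad64(cast(const shared ulong*) &val);
    else
        static assert(0, "unsupported operand size");
}

// Implemented in separate ASM files, one set per architecture.
extern (C) uint  _d_atomicLoad32(const shared uint*  p);
extern (C) ulong _d_atomicLoad64(const shared ulong* p);
----------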
November 14, 2012
On Nov 13, 2012, at 1:14 AM, luka8088 <luka8088@owave.net> wrote:

> On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
>> On 12.11.2012 3:30, Walter Bright wrote:
>>> On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
>>>> It's starting to get outright embarrassing to talk to newcomers about D's
>>>> concurrency support because the most fundamental part of it -- the
>>>> shared type
>>>> qualifier -- does not have well-defined semantics at all.
>>> 
>>> I think a couple things are clear:
>>> 
>>> 1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency; they do nothing for race conditions that are sequentially consistent. Remember, single-core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and magically generate correct concurrent code is a pipe dream.
>>> 
>>> 2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.
>>> 
>>> 
>>> However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.
>>> 
>>> To make a shared type work in an algorithm, you have to:
>>> 
>>> 1. ensure single-threaded access by acquiring a mutex
>>> 2. cast away shared
>>> 3. operate on the data
>>> 4. cast back to shared
>>> 5. release the mutex
>>> 
>>> Also, all op= need to be disabled for shared types.
>> 
>> 
>> This clarifies a lot, but a lot of people still get confused by
>> http://dlang.org/faq.html#shared_memory_barriers
>> Is that a FAQ error?
>> 
>> Also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think the following code should not compile. Is it a missing implementation, a compiler bug, or a FAQ error?
> 
> //////////
> 
> import core.thread;
> 
> void main () {
>  int i;
>  (new Thread({ i++; })).start();
> }

It's intentional.  core.thread is for people who know what they're doing, and there are legitimate uses along these lines:

import core.thread;
import std.stdio;

void main() {
    int i;
    auto t = new Thread({ i++; });
    t.start();
    t.join();
    write(i);
}

This is perfectly safe and has a deterministic result.
November 14, 2012
On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
> First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because an implementation happens to work a specific way.

I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency).  It would be cool if we could sort out transactional memory as well, but that's not a short-term thing.
November 14, 2012
On 2012-11-14 14:30:19 +0000, Timon Gehr <timon.gehr@gmx.ch> said:

> On 11/14/2012 01:42 PM, Michel Fortin wrote:
>> On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr@gmx.ch> said:
>> 
>>> So do I. A thread-local static variable does not imply global state.
>>> (The execution stack is static.) E.g., in a few cases it is sensible to
>>> use static variables as implicit arguments to avoid having to pass
>>> them around by copying them all over the execution stack.
>>> 
>>> private int x = 0;
>>> 
>>> int foo(){
>>>      int xold = x;
>>>      scope(exit) x = xold;
>>>      x = new_value;
>>>      bar(); // reads x
>>>      return baz(); // reads x
>>> }
>> 
>> I'd consider that poor style.
> 
> I'd consider this a poor statement to make. Universally quantified assertions require more rigorous justification.

Indeed. There's not enough context to judge fairly. I can accept the idea that there are situations where it is really inconvenient or impossible to pass the state as an argument.

That said, I disagree that this is not using global state. It might not be globally accessible (because x is private), but the state still exists globally: variable x exists in all threads, whether or not they ever use foo.
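
A small illustration of that point: x below is thread-local, so every thread gets its own copy, initialized from the static initializer, whether or not it ever calls the function that uses it.

----------
import core.thread;
import std.stdio;

int x = 42;  // TLS: one instance per thread

void main()
{
    x = 1;                                 // mutates only main's copy
    auto t = new Thread({ writeln(x); });  // prints 42, not 1
    t.start();
    t.join();
}
----------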


> If done in such a way that it makes refactoring error prone, it is to be considered poor style.

I guess we agree.

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca/

November 14, 2012
On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
> A small code example that would break as soon as we allow destruction of
> shared value types would really be nice.

I hate to repeat myself, but:

Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
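
A hedged sketch of that scenario in code (names hypothetical; the racy access is left commented out precisely because it is the use-after-destroy being described, and it assumes object.destroy for explicit destruction):

----------
import core.thread;

class Resource
{
    int value;
}

void main()
{
    auto r = new shared(Resource);  // Thread 1, step 1: create shared object
    auto t = new Thread({
        // Thread 2: manipulate the object. This races with the destroy
        // below -- r may already have been destructed when this runs.
        // atomicOp!"+="(r.value, 1);
    });
    t.start();                      // step 2: pass the reference to Thread 2
    destroy(cast(Resource) r);      // step 3: destroy while Thread 2 may still use it
    t.join();
}
----------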