November 14, 2012
On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:
> On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
>> On 14-11-2012 15:14, Andrei Alexandrescu wrote:
>>> On 11/14/12 1:19 AM, Walter Bright wrote:
>>>> On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
>>>>> Being able to have double-checked locking work would be valuable, and
>>>>> having
>>>>> memory barriers would reduce race condition weirdness when locks
>>>>> aren't used
>>>>> properly, so I think that it would be desirable to have memory
>>>>> barriers.
>>>>
>>>> I'm not saying "memory barriers are bad". I'm saying that having the
>>>> compiler blindly insert them for shared reads/writes is far from the
>>>> right way to do it.
>>>
>>> Let's not hasten. That works for Java and C#, and is allowed in C++.
>>>
>>> Andrei
>>>
>>>
>>
>> I need some clarification here: By memory barrier, do you mean x86's
>> mfence, sfence, and lfence?
>
> Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing
> data with high-level semantics (a short list: acquire, release, acquire+release,
> and sequentially-consistent) and THEN (b) implement the needed code generation
> appropriately for each architecture. Indeed on x86 there is little need to
> insert fence instructions, BUT there is a definite need for the compiler to
> prevent certain reorderings. That's why implementing shared data operations
> (whether implicit or explicit) as sheer library code is NOT possible.
>
>> Because as Walter said, inserting those blindly when unnecessary can
>> lead to terrible performance because it practically murders
>> pipelining.
>
> I think at this point we need to develop a better understanding of what's going
> on before issuing assessments.

Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering accesses to it. But that's separate from inserting memory barriers.
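To make the acquire/release loads and stores mentioned above concrete, here is a minimal sketch. It assumes a druntime whose core.atomic exposes atomicLoad, atomicStore and MemoryOrder; it is not the intrinsic design proposed in this thread, and the names payload, ready, publish and consume are hypothetical.

```d
import core.atomic : atomicLoad, atomicStore, MemoryOrder;
import core.thread : Thread;

shared int payload;   // hypothetical data being published
shared bool ready;    // hypothetical flag guarding the payload

// Writer: fill in the payload, then set the flag with a release store,
// so the payload write may not be moved after the flag write.
void publish(int value)
{
    atomicStore!(MemoryOrder.raw)(payload, value);
    atomicStore!(MemoryOrder.rel)(ready, true);
}

// Reader: wait on the flag with an acquire load, so the payload read
// may not be moved before the flag read.
int consume()
{
    while (!atomicLoad!(MemoryOrder.acq)(ready))
        Thread.yield();
    return atomicLoad!(MemoryOrder.raw)(payload);
}

unittest
{
    auto writer = new Thread({ publish(42); });
    writer.start();
    assert(consume() == 42);
    writer.join();
}
```

Whether the acquire and release orderings cost an actual fence instruction depends on the target; the ordering constraint on the compiler applies either way.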

November 14, 2012
On 14-11-2012 21:15, Sean Kelly wrote:
> On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex@lycus.org> wrote:
>
>> On 14-11-2012 21:00, Sean Kelly wrote:
>>> On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>>
>>>> On 11/14/12 1:20 AM, Walter Bright wrote:
>>>>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>>>>> If the compiler should/does not add memory barriers, then is there a
>>>>>> reason for
>>>>>> having it built into the language? Can a library solution be enough?
>>>>>
>>>>> Memory barriers can certainly be added using library functions.
>>>>
>>>> The compiler must understand the semantics of barriers such that, e.g., it doesn't hoist code above an acquire barrier or below a release barrier.
>>>
>>> That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.
>>>
>>
>> The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example.
>
> Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block.  Or for a first cut just insert a full barrier at the beginning and end of the block.  Either way, it should be pretty simple for a compiler to handle if the compiler supports mutex use.
>
> I do like the idea of built-in load and store intrinsics only because D only supports x86 assembler right now.  But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files.  Druntime actually had this for core.atomic on PPC until not too long ago.
>

Well, there's not much point in that when all compilers have intrinsics anyway (e.g. GDC has __sync_* and __atomic_* and LDC has some intrinsics in ldc.intrinsics that map to certain LLVM instructions).

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
November 14, 2012
On 11/14/12 11:21 AM, Iain Buclaw wrote:
> On 14 November 2012 17:50, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org>  wrote:
>> On 11/14/12 9:15 AM, David Nadlinger wrote:
>>>
>>> On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
>>>>
>>>> On 11/14/12 1:20 AM, Walter Bright wrote:
>>>>>
>>>>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>>>>>
>>>>>> If the compiler should/does not add memory barriers, then is there a
>>>>>> reason for
>>>>>> having it built into the language? Can a library solution be enough?
>>>>>
>>>>>
>>>>> Memory barriers can certainly be added using library functions.
>>>>
>>>>
>>>> The compiler must understand the semantics of barriers such that, e.g., it
>>>> doesn't hoist code above an acquire barrier or below a release barrier.
>>>
>>>
>>> Again, this is true, but it would be a fallacy to conclude that
>>> compiler-inserted memory barriers for »shared« are required due to this
>>> (and it is »shared« we are discussing here!).
>>>
>>> Simply having compiler intrinsics for atomic loads/stores is enough,
>>> which is hardly »built into the language«.
>>
>>
>> Compiler intrinsics ====== built into the language.
>>
>> Andrei
>>
>
> Not necessarily. For example, printf is a compiler intrinsic for GDC,
> but it's not built into the language in the sense of the compiler
> *provides* the codegen for it.  Though it is aware of what it is and
> what it does, so can perform relevant optimisations around the use of
> it.

aware of what it is and what it does ====== built into the language.

Andrei

November 14, 2012
On 11/14/12 12:00 PM, Sean Kelly wrote:
> On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail@erdani.org>  wrote:
>
>> On 11/14/12 1:20 AM, Walter Bright wrote:
>>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>>> If the compiler should/does not add memory barriers, then is there a
>>>> reason for
>>>> having it built into the language? Can a library solution be enough?
>>>
>>> Memory barriers can certainly be added using library functions.
>>
>> The compiler must understand the semantics of barriers such that, e.g., it doesn't hoist code above an acquire barrier or below a release barrier.
>
> That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.

Because it's better to associate volatility with data than with code.

Andrei
November 14, 2012
On 11/14/12 12:04 PM, Sean Kelly wrote:
> On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu<SeeWebsiteForEmail@erdani.org>  wrote:
>>
>> First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.
>
> I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency).  It would be cool if we could sort out transactional memory as well, but that's not a short term thing.

I think we should focus on sequential consistency as that's where the industry is converging.

Andrei
November 14, 2012
On 11/14/12 1:06 PM, Walter Bright wrote:
> On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
>> A small code example which would break as soon as we allow destructing
>> of shared
>> value types would really be nice.
>
> I hate to repeat myself, but:
>
> Thread 1:
> 1. create shared object
> 2. pass reference to that object to Thread 2

That should be disallowed at least in safe code. If I had my way I'd explore disallowing in all code.

Andrei
November 14, 2012
On 11/14/12 1:09 PM, Walter Bright wrote:
> Yes. And also, I agree that having something typed as "shared" must
> prevent the compiler from reordering accesses to it. But that's separate
> from inserting memory barriers.

It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing.

Andrei


November 14, 2012
On 14.11.2012 20:54, Sean Kelly wrote:
> On Nov 13, 2012, at 1:14 AM, luka8088<luka8088@owave.net>  wrote:
>
>> On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
>>> On 12.11.2012 3:30, Walter Bright wrote:
>>>> On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
>>>>> It's starting to get outright embarrassing to talk to newcomers about D's
>>>>> concurrency support because the most fundamental part of it -- the
>>>>> shared type
>>>>> qualifier -- does not have well-defined semantics at all.
>>>>
>>>> I think a couple things are clear:
>>>>
>>>> 1. Slapping shared on a type is never going to make algorithms on that
>>>> type work in a concurrent context, regardless of what is done with
>>>> memory barriers. Memory barriers ensure sequential consistency, they do
>>>> nothing for race conditions that are sequentially consistent. Remember,
>>>> single core CPUs are all sequentially consistent, and still have major
>>>> concurrency problems. This also means that having templates accept
>>>> shared(T) as arguments and have them magically generate correct
>>>> concurrent code is a pipe dream.
>>>>
>>>> 2. The idea of shared adding memory barriers for access is not going to
>>>> ever work. Adding barriers has to be done by someone who knows what
>>>> they're doing for that particular use case, and the compiler inserting
>>>> them is not going to substitute.
>>>>
>>>>
>>>> However, and this is a big however, having shared as compiler-enforced
>>>> self-documentation is immensely useful. It flags where and when data is
>>>> being shared. So, your algorithm won't compile when you pass it a shared
>>>> type? That is because it is NEVER GOING TO WORK with a shared type. At
>>>> least you get a compile time indication of this, rather than random
>>>> runtime corruption.
>>>>
>>>> To make a shared type work in an algorithm, you have to:
>>>>
>>>> 1. ensure single-threaded access by acquiring a mutex
>>>> 2. cast away shared
>>>> 3. operate on the data
>>>> 4. cast back to shared
>>>> 5. release the mutex
>>>>
>>>> Also, all op= need to be disabled for shared types.
>>>
>>>
>>> This clarifies a lot, but a lot of people still get confused by:
>>> http://dlang.org/faq.html#shared_memory_barriers
>>> Is it a FAQ error?
>>>
>>> Also, given what http://dlang.org/faq.html#shared_guarantees says, I have come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a FAQ error?
>>
>> //////////
>>
>> import core.thread;
>>
>> void main () {
>>   int i;
>>   (new Thread({ i++; })).start();
>> }
>
> It's intentional.  core.thread is for people who know what they're doing, and there are legitimate uses along these lines:
>
> import core.thread;
> import std.stdio;
>
> void main() {
>      int i;
>      auto t = new Thread({i++;});
>      t.start();
>      // join() returns only after the thread has finished
>      t.join();
>      write(i);
> }
>
> This is perfectly safe and has a deterministic result.

Yes, that makes perfect sense... I just wanted to point out that the FAQ is misleading here, because (at least before this forum thread) not much had been written about shared, and you can get the wrong idea from it (at least I did).
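For reference, the five steps Walter lists in the quote above (acquire a mutex, cast away shared, operate, cast back, release) might look roughly like the following. This is only a sketch under assumptions: Stats, stats, statsLock, record, recordShared and snapshot are hypothetical names, and the link between the mutex and the data is a convention that the compiler does not check.

```d
import core.sync.mutex : Mutex;

struct Stats
{
    int hits;
    int misses;
}

// Ordinary single-threaded algorithm; it does not accept shared(Stats).
void record(ref Stats s, bool hit)
{
    if (hit)
        ++s.hits;
    else
        ++s.misses;
}

shared Stats stats;          // hypothetical shared data
__gshared Mutex statsLock;   // hypothetical lock; by convention guards `stats`

shared static this()
{
    statsLock = new Mutex;
}

void recordShared(bool hit)
{
    statsLock.lock();                    // 1. acquire the mutex
    scope (exit) statsLock.unlock();     // 5. release it on every exit path
    auto local = cast(Stats*) &stats;    // 2. cast away shared
    record(*local, hit);                 // 3. operate on the data
    // 4. no explicit cast back is needed here: `local` goes out of scope
    //    and `stats` itself never stopped being typed as shared
}

// Reads a consistent copy under the same lock.
Stats snapshot()
{
    statsLock.lock();
    scope (exit) statsLock.unlock();
    return *cast(Stats*) &stats;
}

unittest
{
    auto before = snapshot();
    recordShared(true);
    recordShared(false);
    auto after = snapshot();
    assert(after.hits == before.hits + 1);
    assert(after.misses == before.misses + 1);
}
```

The cast is only sound while the lock is held and no unshared reference to the data escapes the function.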
November 15, 2012
On Nov 14, 2012, at 2:21 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 11/14/12 12:00 PM, Sean Kelly wrote:
>> On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail@erdani.org>  wrote:
>> 
>>> On 11/14/12 1:20 AM, Walter Bright wrote:
>>>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>>>> If the compiler should/does not add memory barriers, then is there a
>>>>> reason for
>>>>> having it built into the language? Can a library solution be enough?
>>>> 
>>>> Memory barriers can certainly be added using library functions.
>>> 
>>> The compiler must understand the semantics of barriers such that, e.g., it doesn't hoist code above an acquire barrier or below a release barrier.
>> 
>> That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.
> 
> Because it's better to associate volatility with data than with code.

Fair enough.  Though this may mean building a bunch of different forms of volatility into the language.  I always saw "volatile" as a library tool anyway, so while making it code-related was a bit weird, it was a sufficient tool for the job.

November 15, 2012
On Nov 14, 2012, at 2:25 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 11/14/12 1:09 PM, Walter Bright wrote:
>> Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering accesses to it. But that's separate from inserting memory barriers.
> 
> It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing.

Sequential consistency is great and all, but it doesn't render concurrent code correct.  At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.
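A minimal sketch of that point, assuming druntime's core.atomic and core.thread (racy, safe, incrementRacy, incrementSafe and runWorkers are hypothetical names): every individual access below is sequentially consistent, yet the first counter can still lose updates because the load and the store are two separate steps.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;

shared int racy;
shared int safe;

// Both accesses use the default MemoryOrder.seq, but another thread can
// store between the load and the store, so increments can be lost.
void incrementRacy()
{
    int seen = atomicLoad(racy);
    atomicStore(racy, seen + 1);
}

// One atomic read-modify-write; no update can be lost.
void incrementSafe()
{
    atomicOp!"+="(safe, 1);
}

// Runs `step` perThread times on each of `threads` threads and waits.
void runWorkers(void function() step, int threads, int perThread)
{
    Thread[] pool;
    foreach (n; 0 .. threads)
        pool ~= new Thread({ foreach (i; 0 .. perThread) step(); });
    foreach (t; pool)
        t.start();
    foreach (t; pool)
        t.join();
}

unittest
{
    runWorkers(&incrementRacy, 4, 100_000);
    runWorkers(&incrementSafe, 4, 100_000);
    assert(atomicLoad(safe) == 400_000);   // holds on every run
    assert(atomicLoad(racy) <= 400_000);   // can come out smaller
}
```

How many updates the racy counter loses, if any, varies from run to run and machine to machine.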