November 14, 2012

On 11/14/2012 3:05 AM, Alex Rønne Petersen wrote:
> On 14-11-2012 03:02, Andrei Alexandrescu wrote:
>> On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:
>>> On 14-11-2012 02:52, Andrei Alexandrescu wrote:
>>>> On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
>>>>> Slices and delegates can't be loaded/stored atomically because very
>>>>> few
>>>>> architectures provide instructions to atomically load/store 16
>>>>> bytes of
>>>>> data (required on 64-bit; 32-bit would be fine since that's just 8
>>>>> bytes, but portability is king). This is also why ucent, cent, and
>>>>> real
>>>>> are not included in the list.
>>>>
>>>> When I wrote TDPL I looked at the contemporary architectures and it
>>>> seemed all were or were about to support double-word atomic ops. So the
>>>> intent is to allow shared delegates and slices.
>>>>
>>>> Are there any architectures today that don't support double-word load,
>>>> store, and CAS?
>>>>
>>>>
>>>> Andrei
>>>
>>> I do not know of a single architecture apart from x86 that supports >
>>> 8-byte load/store/CAS (and come to think of it, I'm not so sure x86
>>> actually can do 16-byte load/store, only CAS). So while a shared
>>> delegate is doable in 32-bit, it isn't really in 64-bit.
>>
>> Intel does 128-bit atomic load and store, see
>> http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html,
>>
>> "4.5 Memory Datum Alignment and Atomicity".
>>
>> Andrei
>>
>
> That's Itanium, though, not x86. Itanium is a fairly high-end,
> enterprise-class thing, so that's not very surprising.
>

On x86-64 you can use LOCK CMPXCHG16B to do the atomic read: http://stackoverflow.com/questions/9726566/atomic-16-byte-read-on-x64-cpus

This just excludes a small number of early AMD processors.
November 14, 2012
Am 13.11.2012 23:22, schrieb Walter Bright:
>
> But I do see enormous value in shared in that it logically (and rather
> forcefully) separates thread-local code from multi-thread code. For
> example, see the post here about adding a destructor to a shared struct,
> and having it fail to compile. The complaint was along the lines of
> shared being broken, whereas I viewed it along the lines of shared
> pointing out a logic problem in the code - what does destroying a struct
> accessible from multiple threads mean? I think it must be clear that
> destroying an object can only happen in one thread, i.e. the object must
> become thread local in order to be destroyed.
>

I still don't agree with you there. The struct would clearly have outlived any thread (as it was in global scope), so at the point where it is destroyed there should really be only one thread left. So it IS destroyed in a single-threaded context. The same is done for classes by the GC, except that the GC ignores shared altogether.

Kind Regards
Benjamin Thaut


November 14, 2012
On 11/14/2012 1:01 AM, Benjamin Thaut wrote:
> I still don't agree with you there. The struct would have clearly outlived any
> thread (as it was in the global scope) so at the point where it is destroyed
> there should be really only one thread left. So it IS destroyed in a single
> threaded context.

If you know this for a fact, then cast it to thread local. The compiler cannot figure this out for you, hence it issues the error.


> The same is done for classes by the GC just that the GC
> ignores shared altogether.

That's different, because the GC verifies that there are *no* references to it from any thread first.
November 14, 2012
On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
> Being able to have double-checked locking work would be valuable, and having
> memory barriers would reduce race condition weirdness when locks aren't used
> properly, so I think that it would be desirable to have memory barriers.

I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.

November 14, 2012
On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
> If the compiler should/does not add memory barriers, then is there a reason for
> having it built into the language? Can a library solution be enough?

Memory barriers can certainly be added using library functions.

November 14, 2012
Am 14.11.2012 10:18, schrieb Walter Bright:
> On 11/14/2012 1:01 AM, Benjamin Thaut wrote:
>> I still don't agree with you there. The struct would have clearly
>> outlived any
>> thread (as it was in the global scope) so at the point where it is
>> destroyed
>> there should be really only one thread left. So it IS destroyed in a
>> single
>> threaded context.
>
> If you know this for a fact, then cast it to thread local. The compiler
> cannot figure this out for you, hence it issues the error.
>
>
>> The same is done for classes by the GC just that the GC
>> ignores shared altogether.
>
> That's different, because the GC verifies that there are *no* references
> to it from any thread first.

Could you please give an example where it would break?

And what's the difference between:

struct Value
{
  ~this()
  {
    printf("destroy\n");
  }
}

shared Value v;


and:


shared static ~this()
{
  printf("destroy\n");
}

Kind Regards
Benjamin Thaut
November 14, 2012
On 2012-11-14 08:56, Jonathan M Davis wrote:

> Being able to have double-checked locking work would be valuable, and having
> memory barriers would reduce race condition weirdness when locks aren't used
> properly, so I think that it would be desirable to have memory barriers. If
> there's a major performance penalty though, that might be a reason not to do
> it. Certainly, I don't think that there's any question that adding memory
> barriers won't make it so that you don't need mutexes or synchronized blocks
> or whatnot. shared's primary benefit is in logically separating normal code
> from code that must shared data across threads and making it possible for the
> compiler to optimize based on the fact that it knows that a variable is
> thread-local.

If there is a problem with efficiency in some cases then the developer can use __gshared and handle things manually. But of course, we don't want the developer to have to do this in most cases.

-- 
/Jacob Carlborg
November 14, 2012
On 2012-11-14 10:20, Walter Bright wrote:

> Memory barriers can certainly be added using library functions.

Is there then any real advantage of having it directly in the language?

-- 
/Jacob Carlborg
November 14, 2012
On 11/14/2012 04:12 AM, Michel Fortin wrote:
> On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr@gmx.ch> said:
>
>> On 11/12/2012 02:48 AM, Michel Fortin wrote:
>>> I feel like the concurrency aspect of D2 was rushed in the haste of
>>> having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
>>> as well as destructors running in any thread (thanks GC!) plus a couple
>>> of other irritants makes the whole concurrency scheme completely flawed
>>> if you ask me. D2 needs a near complete overhaul on the concurrency
>>> front.
>>>
>>> I'm currently working on a big code base in C++. While I do miss D when
>>> it comes to working with templates as well as for its compilation speed
>>> and a few other things, I can't say I miss D much when it comes to
>>> anything touching concurrency.
>>>
>>> [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
>>
>> I am always irritated by shared-by-default static variables.
>
> I tend to have very little global state in my code,

So do I. A thread-local static variable does not imply global state. (The execution stack is static.) E.g. in a few cases it is sensible to use static variables as implicit arguments, to avoid having to pass them around by copying them all over the execution stack.

private int x = 0;

int foo(int newValue){
    int xold = x;
    scope(exit) x = xold; // restore on any exit path
    x = newValue;
    bar();        // reads x
    return baz(); // reads x
}

Unfortunately, this disqualifies the function from being 'pure', even though it does not actually violate purity.

> so shared-by-default
> is not something I have to fight with very often.  I do agree that
> thread-local is a better default.
>


November 14, 2012
On 11/14/2012 1:31 AM, Jacob Carlborg wrote:
> On 2012-11-14 10:20, Walter Bright wrote:
>
>> Memory barriers can certainly be added using library functions.
>
> Is there then any real advantage of having it directly in the language?
>

Not that I can think of.