November 12, 2018
On Monday, 12 November 2018 at 09:11:04 UTC, John Colvin wrote:
> On Sunday, 11 November 2018 at 20:47:16 UTC, Steven Schveighoffer wrote:
>> On 11/11/18 2:21 PM, John Colvin wrote:
>>> Take a look at this (I think?) supposed-to-be-thread-safe code according to the rules of D:
>>> 
>>> import std.stdio;
>>> import core.thread;
>>> import core.atomic;
>>> 
>>> int* foo() pure
>>> {
>>>      auto ret = new int;
>>>      *ret = 3;  // MODIFY
>>>      return ret;
>>> }
>>> 
>>> shared immutable(int)* g;
>>> 
>>> void bar()
>>> {
>>>      immutable(int)* d = null;
>>>      while (d is null)
>>>          d = g.atomicLoad;
>>>      assert(*d == 3); // READ
>>>      assert(*d == 3); // READ AGAIN
>>> }
>>> 
>>> void main()
>>> {
>>>      auto t = new Thread(&bar).start();
>>>      immutable(int)* a = foo();
>>>      g.atomicStore(a);
>>>      t.join;
>>> }
>>> 
>>> What stops the CPU executing this such that MODIFY happens between READ and READ AGAIN ?
>>> 
>>> To aid in the thought experiment, imagine if we replace `*ret = 3` with `*ret = longComputation()`? It might help your reasoning about re-ordering if you consider `foo` inlined.
>>
>> Hm... I wouldn't expect the compiler would allow reordering across a return boundary. Even if the inlining occurs. Implicit conversion of pure return values depends on that.
>>
>> I don't know all the rules of reordering, but this would definitely cause races if it was allowed to reorder the storage of data in *d, and the storage of d into g.
>
> The compiler can definitely re-order over return boundaries in general, but perhaps there is some special logic to prevent mistakes in these cases?
>
>> Wait, wouldn't the atomicStore create a barrier, such that all the code before it must execute?
>> 
>> -Steve
>
> Not a full memory barrier on most (all relevant?) platforms.

Hmm, ok, so it seems this can only occur with relaxed memory ordering: with acquire-release (or stronger), if bar sees the result of main's write to g (i.e. it gets out of the while loop), then it must also see all side-effects of everything main did before that write.

However, with relaxed memory ordering I'm now more convinced that this would break immutability, which in turn means that the oft-made statement "immutable data doesn't require synchronisation" isn't true.
November 12, 2018
On Monday, 12 November 2018 at 14:54:12 UTC, John Colvin wrote:
> However, with relaxed memory ordering I'm now more convinced that this would break immutability, which in turn means that the oft-made statement "immutable data doesn't require synchronisation" isn't true.

The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence inspiring haha!
November 12, 2018
On 12.11.18 16:00, John Colvin wrote:
> On Monday, 12 November 2018 at 14:54:12 UTC, John Colvin wrote:
>> However, with relaxed memory ordering I'm now more convinced that this would break immutability, which in turn means that the oft-made statement "immutable data doesn't require synchronisation" isn't true.
> 
> The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence inspiring haha!

Yes. More generally, I think implicit sharing of immutable data was a mistake. (All it does is to forbid thread-local immutable data -- a valid use case.)
November 13, 2018
On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
> The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence inspiring haha!

Well, without acquire-release you can't meaningfully share data - it simply doesn't reach another thread. Does it even count as sharing? With immutable data you need to deliver data only once, after that you can read it without synchronization.
November 13, 2018
On Tuesday, 13 November 2018 at 11:39:48 UTC, Kagamin wrote:
> On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
>> The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence inspiring haha!
>
> Well, without acquire-release you can't meaningfully share data - it simply doesn't reach another thread. Does it even count as sharing? With immutable data you need to deliver data only once, after that you can read it without synchronization.

What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"?

How would the incrementing of the reference count in shared_ptr work then?
November 14, 2018
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
> On Tuesday, 13 November 2018 at 11:39:48 UTC, Kagamin wrote:
>> On Monday, 12 November 2018 at 15:00:33 UTC, John Colvin wrote:
>>> The correct statement would be more like "reading immutable data doesn't require synchronisation if the method of obtaining the reference to the immutable data has at least acquire-release semantics". Which is considerably less snappy and confidence inspiring haha!
>>
>> Well, without acquire-release you can't meaningfully share data - it simply doesn't reach another thread. Does it even count as sharing? With immutable data you need to deliver data only once, after that you can read it without synchronization.
>
> What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"?

I *think* it isn't guaranteed to leave cache otherwise.


November 14, 2018
On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
> What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"?

The data simply remains in processor cache.

> How would the incrementing of the reference count in shared_ptr work then?

You mean relaxed order? LLVM has a better description of that: http://llvm.org/docs/Atomics.html#monotonic. It is still a synchronization, just a weak one - it synchronizes only the one variable, not the entire cache. And that's why you need full synchronization on the decrement: you may need to run the destructor, which needs the latest state of the object and everything it references.
November 14, 2018
On Wednesday, 14 November 2018 at 08:38:55 UTC, Kagamin wrote:
> On Tuesday, 13 November 2018 at 17:18:17 UTC, John Colvin wrote:
>> What - precisely - do you mean by "without acquire-release you can't meaningfully share data - it simply doesn't reach another thread"?
>
> The data simply remains in processor cache.
>
>> How would the incrementing of the reference count in shared_ptr work then?
>
> You mean relaxed order? LLVM has a better description of that: http://llvm.org/docs/Atomics.html#monotonic. It is still a synchronization, just a weak one - it synchronizes only the one variable, not the entire cache. And that's why you need full synchronization on the decrement: you may need to run the destructor, which needs the latest state of the object and everything it references.

What relaxed does is roughly this: "all relaxed operations on this variable happen atomically, in some total order" - and nothing more; it imposes no ordering on surrounding memory operations.

int* foo() pure {
    auto ret = new int; // A
    *ret = 3; // B
    return ret;
}

shared immutable(int)* g;

void bar() {
    immutable(int)* d = null;
    while (d is null)
    	d = g.atomicLoad;
    assert(*d == 3); // D
    assert(*d == 3); // E
}

void main() {
    auto t = new Thread(&bar).start();
    immutable(int)* a = foo();
    g.atomicStore!(MemoryOrder.relaxed)(a); // C
    t.join;
}

The relevant ordering restrictions are as follows:

A must happen before B, because `ret` must exist before it can be written to.
A must happen before C, because `a` must exist before it can be stored, and `a` cannot exist until `ret` does.
C must happen before D & E, because the while loop in bar only exits once the load observes the store from C.

So we have

A -> C -> D -> E

but B can sit anywhere after A, including between D and E!

To address the "it won't leave the cache" idea (which I'm guessing is actually meant to be "relaxed atomics make no promises about when changes become visible"): consider whether acq-rel actually gives you any more of a guarantee about that in this case (how many iterations of the while loop would you expect?), and note that a lack of a guarantee of timeliness is not the same as a guarantee of never.
November 15, 2018
Correcting the obvious typos:

import core.thread;
import core.atomic;

int* foo() pure {
    auto ret = new int; // A
    *ret = 3; // B
    return ret;
}

shared immutable(int)* g;

void bar() {
    immutable(int)* d = null;
    while (d is null)
        d = g.atomicLoad!(MemoryOrder.raw);
    assert(*d == 3); // D
    assert(*d == 3); // E
}

void main() {
    auto t = new Thread(&bar).start();
    immutable(int)* a = foo();
    g.atomicStore!(MemoryOrder.raw)(a); // C
    t.join;
}
