December 03, 2022

On Wednesday, 30 November 2022 at 09:25:57 UTC, Iain Buclaw wrote:
> On Wednesday, 30 November 2022 at 00:35:55 UTC, claptrap wrote:
>> On Wednesday, 30 November 2022 at 00:16:00 UTC, rikki cattermole wrote:
>
> C++'s std::atomic has followed this precedent too of using a mutex if all else fails.
>
> Really, you'd have to be pretty explicit in forcing GDC built-in atomics to be disabled at configure-time - there's an atomic library right there in GCC's tree if a given target does not have native support.

I used to make this joke about my brother: if you ever needed to know how to do something, you would ask him what to do... and then do the exact opposite.

I feel the same way about C++.

December 03, 2022
On Saturday, 3 December 2022 at 13:05:44 UTC, claptrap wrote:
> On Saturday, 3 December 2022 at 03:42:01 UTC, max haughton wrote:
>> On Wednesday, 30 November 2022 at 00:35:55 UTC, claptrap wrote:
>>> On Wednesday, 30 November 2022 at 00:16:00 UTC, rikki cattermole wrote:
>>>> On 30/11/2022 1:12 PM, H. S. Teoh wrote:
>>>>> Hmm, that's weird that the docs would say that.  I've always been under
>>>>> the impression that core.atomic ops use locks to achieve atomicity.
>>>>
>>>> No, it's correct.
>>>>
>>>> As long as the hardware supports atomic operations, it'll use those instructions. It does have a fallback to use a mutex if need be though, which might be where you got that idea from.
>>>
>>> It really shouldn't do that IMO. People expect atomic ops to be lock-free; it should be a compile error if it can't be so.
>>
>> I expect atomic ops to provide correct memory consistency, and preferably be atomic where the platform allows it.
>
> I am really hoping that "preferably atomic" was just a poor choice of words.
>
> I guess regarding the other issues, I still think of atomic ops in terms of what the CPU does, so a high-level wrapper around them that takes a lock is the complete antithesis of what is actually desired.
>
> I mean the whole point of atomic ops is to be able to avoid using locks.
>
> And if you look at the phobos docs...
>
> "Atomically adds mod to the value referenced by val and returns the value val held previously. This operation is both lock-free and atomic."
>
> https://dlang.org/library/core/atomic/atomic_fetch_add.html

If it provides the same memory ordering guarantees, does it matter (if we ignore performance for a second)? There are situations where you do (for reasons beyond performance) actually need an efficient (no overhead) atomic operation in lock-free coding, but these are really on the edge of what can be considered guaranteed by any specification.

Obviously they *should* be atomic, but not all platforms have atomics and not all platforms support the same atomic primitives - but they all need to support the same code/patterns. Our atomics prefer to error, whereas the C++ ones have a fallback for large types and so on.
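For reference, the shared code/pattern in question looks something like this in D - a minimal sketch using core.atomic (the ordering argument is optional and defaults to sequential consistency):

```d
import core.atomic : atomicFetchAdd, atomicLoad, MemoryOrder;

shared int counter;

void worker()
{
    // Returns the previous value; sequentially consistent by default.
    int before = atomicFetchAdd(counter, 1);

    // A weaker acquire load, for when full sequential consistency isn't needed.
    int now = atomicLoad!(MemoryOrder.acq)(counter);
}
```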
December 03, 2022
On Saturday, 3 December 2022 at 20:18:07 UTC, max haughton wrote:
> On Saturday, 3 December 2022 at 13:05:44 UTC, claptrap wrote:
>> On Saturday, 3 December 2022 at 03:42:01 UTC, max haughton wrote:
>>> On Wednesday, 30 November 2022 at 00:35:55 UTC, claptrap
>>
>> "Atomically adds mod to the value referenced by val and returns the value val held previously. This operation is both lock-free and atomic."
>>
>> https://dlang.org/library/core/atomic/atomic_fetch_add.html
>
> If it provides the same memory ordering guarantees, does it matter (if we ignore performance for a second)? There are situations where you do (for reasons beyond performance) actually need an efficient (no overhead) atomic operation in lock-free coding, but these are really on the edge of what can be considered guaranteed by any specification.

It matters because the whole point of atomic operations is lock-free coding. There's no "you might need atomics for lock-free coding"; you literally have to have them. If they fall back on a mutex it's not lock-free anymore.

Memory ordering is a somewhat orthogonal issue from atomic ops.
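To make that concrete, here's a minimal sketch (hypothetical names) of the kind of code where a hidden mutex would be unacceptable: something that must never block, like a signal handler or a real-time audio callback bumping a counter.

```d
import core.atomic : atomicFetchAdd, atomicLoad;

shared long eventsSeen;

// Must never block: if atomicFetchAdd quietly degraded to taking a mutex,
// this could stall a real-time thread or deadlock against the thread it interrupted.
void onEvent() nothrow @nogc
{
    atomicFetchAdd(eventsSeen, 1);
}

long snapshot()
{
    return atomicLoad(eventsSeen);
}
```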
December 04, 2022
On Saturday, 3 December 2022 at 21:05:51 UTC, claptrap wrote:
> On Saturday, 3 December 2022 at 20:18:07 UTC, max haughton wrote:
>> On Saturday, 3 December 2022 at 13:05:44 UTC, claptrap wrote:
>>> On Saturday, 3 December 2022 at 03:42:01 UTC, max haughton wrote:
>>>> On Wednesday, 30 November 2022 at 00:35:55 UTC, claptrap
>>>
>>> "Atomically adds mod to the value referenced by val and returns the value val held previously. This operation is both lock-free and atomic."
>>>
>>> https://dlang.org/library/core/atomic/atomic_fetch_add.html
>>
>> If it provides the same memory ordering guarantees, does it matter (if we ignore performance for a second)? There are situations where you do (for reasons beyond performance) actually need an efficient (no overhead) atomic operation in lock-free coding, but these are really on the edge of what can be considered guaranteed by any specification.
>
> It matters because the whole point of atomic operations is lock-free coding. There's no "you might need atomics for lock-free coding"; you literally have to have them. If they fall back on a mutex it's not lock-free anymore.
>
> Memory ordering is a somewhat orthogonal issue from atomic ops.

Memory ordering is literally why modern atomic operations exist. That's why there's a lock prefix on the instruction in X86 - it doesn't just say "do this in one go", it says "do this in one go *and* maintain this memory ordering for the other threads".

You'd never see the mutex or similar; it's just a detail of the atomics library. Again, I'm not saying it would be good, just that it wouldn't make any difference to the code.
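Something like this hypothetical fallback, say (just an illustration, not what druntime actually does): the caller gets the same previous-value result and the same ordering (the lock acts as a full barrier), it's only the lock-freedom guarantee that's gone.

```d
import core.sync.mutex : Mutex;

private __gshared Mutex gFallbackLock;

shared static this() { gFallbackLock = new Mutex; }

// Hypothetical mutex-based fetch-add with the same observable behaviour as
// atomicFetchAdd (returns the previous value), just not lock-free.
T fetchAddFallback(T)(ref shared T val, T mod)
{
    gFallbackLock.lock();
    scope (exit) gFallbackLock.unlock();
    auto p = cast(T*) &val;        // plain access is fine while the lock is held
    T old = *p;
    *p = cast(T)(old + mod);
    return old;
}
```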
December 04, 2022
On 04/12/2022 2:51 PM, max haughton wrote:
> Memory ordering is literally why modern atomic operations exist. That's why there's a lock prefix on the instruction in X86 - it doesn't just say "do this in one go", it says "do this in one go *and* maintain this memory ordering for the other threads".

Also worth remembering that at some point somebody has to do synchronization to make atomics work.

It doesn't matter if it's in user code or in microcode. It has to exist somewhere.

If it doesn't, you continue to have different cores with different values for a chunk of memory. This is something you do not want to encounter, and it makes you lose hair (from experience).
December 04, 2022
On Sunday, 4 December 2022 at 01:51:56 UTC, max haughton wrote:
> On Saturday, 3 December 2022 at 21:05:51 UTC, claptrap wrote:
>> On Saturday, 3 December 2022 at 20:18:07 UTC, max haughton wrote:
>>> On Saturday, 3 December 2022 at 13:05:44 UTC, claptrap wrote:
>>>> On Saturday, 3 December 2022 at 03:42:01 UTC, max haughton wrote:
>>>>> On Wednesday, 30 November 2022 at 00:35:55 UTC, claptrap
>>>>
>>>> "Atomically adds mod to the value referenced by val and returns the value val held previously. This operation is both lock-free and atomic."
>>>>
>>>> https://dlang.org/library/core/atomic/atomic_fetch_add.html
>>>
>>> If it provides the same memory ordering guarantees, does it matter (if we ignore performance for a second)? There are situations where you do (for reasons beyond performance) actually need an efficient (no overhead) atomic operation in lock-free coding, but these are really on the edge of what can be considered guaranteed by any specification.
>>
>> It matters because the whole point of atomic operations is lock-free coding. There's no "you might need atomics for lock-free coding"; you literally have to have them. If they fall back on a mutex it's not lock-free anymore.
>>
>> Memory ordering is a somewhat orthogonal issue from atomic ops.
>
> Memory ordering is literally why modern atomic operations exist. That's why there's a lock prefix on the instruction in X86 - it doesn't just say "do this in one go", it says "do this in one go *and* maintain this memory ordering for the other threads".

No it's not; literally go read the Intel SDM: the lock prefix is for atomic operations, that's its intent. That it also has extra memory ordering guarantees between cores is simply because it would be useless if it didn't also do that.

The mfence/sfence/lfence instructions are for memory ordering.

If you still don't believe me, consider this: x86 has always had strong memory ordering; there's no need to worry about reads to a given location being moved ahead of writes to the same location if you are on a single core. That only goes out of the window when you have a multicore x86. The lock prefix existed before x86 CPUs had multiple cores. I.e. locked instructions are for atomics; the memory ordering guarantees already existed for all reads/writes on single cores. The extra guarantees for memory ordering between cores were added when x86 went multicore, because lock-free algorithms would not work if the atomic reads/writes were not also ordered between cores.

To say atomics are about memory ordering is like saying cars are for parking. Yes, you have to park them somewhere, but that's not the problem they are meant to solve.
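Put in core.atomic terms, the two concerns look like this (a rough sketch, made-up names): the fetch-add is there for atomicity, the fence and the acquire load are there purely for ordering.

```d
import core.atomic : atomicFence, atomicFetchAdd, atomicLoad, atomicStore, MemoryOrder;

shared int ticket;
shared bool ready;
__gshared int payload;              // plain data, published via 'ready'

int takeTicket()
{
    // Atomicity is the point: no two threads can ever be handed the same ticket.
    return atomicFetchAdd(ticket, 1);
}

void producer(int value)
{
    payload = value;
    atomicFence();                  // ordering only; nothing here is modified atomically
    atomicStore!(MemoryOrder.raw)(ready, true);
}

bool consumer(ref int outValue)
{
    if (!atomicLoad!(MemoryOrder.acq)(ready))
        return false;
    outValue = payload;             // the acquire load pairs with the fence + store above
    return true;
}
```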
December 04, 2022
On Sunday, 4 December 2022 at 10:15:48 UTC, claptrap wrote:
> On Sunday, 4 December 2022 at 01:51:56 UTC, max haughton wrote:
>> On Saturday, 3 December 2022 at 21:05:51 UTC, claptrap wrote:
>>> On Saturday, 3 December 2022 at 20:18:07 UTC, max haughton wrote:
>>>> On Saturday, 3 December 2022 at 13:05:44 UTC, claptrap wrote:
>>>>> [...]
>>>>
>>>> If it provides the same memory ordering guarantees, does it matter (if we ignore performance for a second)? There are situations where you do (for reasons beyond performance) actually need an efficient (no overhead) atomic operation in lock-free coding, but these are really on the edge of what can be considered guaranteed by any specification.
>>>
>>> It matters because the whole point of atomic operations is lock-free coding. There's no "you might need atomics for lock-free coding"; you literally have to have them. If they fall back on a mutex it's not lock-free anymore.
>>>
>>> Memory ordering is a somewhat orthogonal issue from atomic ops.
>>
>> Memory ordering is literally why modern atomic operations exist. That's why there's a lock prefix on the instruction in X86 - it doesn't just say "do this in one go", it says "do this in one go *and* maintain this memory ordering for the other threads".
>
> No it's not; literally go read the Intel SDM: the lock prefix is for atomic operations, that's its intent. That it also has extra memory ordering guarantees between cores is simply because it would be useless if it didn't also do that.
>
> The mfence/sfence/lfence instructions are for memory ordering.
>
> If you still don't believe me, consider this: x86 has always had strong memory ordering; there's no need to worry about reads to a given location being moved ahead of writes to the same location if you are on a single core. That only goes out of the window when you have a multicore x86. The lock prefix existed before x86 CPUs had multiple cores. I.e. locked instructions are for atomics; the memory ordering guarantees already existed for all reads/writes on single cores. The extra guarantees for memory ordering between cores were added when x86 went multicore, because lock-free algorithms would not work if the atomic reads/writes were not also ordered between cores.
>
> To say atomics are about memory ordering is like saying cars are for parking. Yes, you have to park them somewhere, but that's not the problem they are meant to solve.

X86 isn't the only processor; by atomics I am referring to the ones we use in programming languages (note that I said modern). The ones in D are basically inherited from C++11 (and LLVM), and were drafted because working with memory ordering prior to them was the wild west.

X86 is also still ordered with or without a LOCK prefix. It's a weaker kind of order but it's still defined (it actually wasn't defined to an academically satisfactory standard until shockingly recently). For example, a store with release ordering on X86 will yield a regular mov instruction.
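A quick sketch of that last point (exact codegen depends on the compiler, but on x86 the release store is typically just a plain mov, while the seq_cst one typically becomes an xchg or a mov followed by mfence):

```d
import core.atomic : atomicStore, MemoryOrder;

shared int flag;

void storeRelease(int v) { atomicStore!(MemoryOrder.rel)(flag, v); } // usually a plain mov on x86
void storeSeqCst (int v) { atomicStore!(MemoryOrder.seq)(flag, v); } // usually xchg, or mov + mfence
```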
December 05, 2022
On Sunday, 4 December 2022 at 12:39:44 UTC, max haughton wrote:
> On Sunday, 4 December 2022 at 10:15:48 UTC, claptrap wrote:
>> On Sunday, 4 December 2022 at 01:51:56 UTC, max haughton wrote:
>>> On Saturday, 3 December 2022 at 21:05:51 UTC, claptrap wrote:
>>
>
> X86 isn't the only processor

I think you'll struggle to find any processor that defines atomics in terms of memory ordering and not in terms of a read/modify/write sequence that is executed atomically.


> by atomics I am referring to the ones we use in programming languages (note that I said modern). The ones in D are basically inherited from C++11 (and LLVM), and were drafted because working with memory ordering prior to them was the wild west.

I was working on lock-free algorithms around 2008; there was an Intel PDF at the time specifying memory ordering. Tbh it probably didn't need to be specified until multicore came about, since the default memory ordering on a single-core x86 meant you literally didn't need to worry about it.

C++... wild west? Sounds about right :)


> X86 is also still ordered with or without a LOCK prefix. It's a weaker kind of order but it's still defined

Yes, the only difference a lock prefix makes is that it adds extra ordering guarantees between cores. Basically, the ordering guarantees you get on a single core are what you get between cores with locked instructions.


December 05, 2022
On Monday, 5 December 2022 at 01:37:02 UTC, claptrap wrote:
> On Sunday, 4 December 2022 at 12:39:44 UTC, max haughton wrote:
>> [...]
>
> I think you'll struggle to find any processor that defines atomics in terms of memory ordering and not in terms of a read/modify/write sequence that is executed atomically.
>
>
>> [...]
>
> I was working on lock-free algorithms around 2008; there was an Intel PDF at the time specifying memory ordering. Tbh it probably didn't need to be specified until multicore came about, since the default memory ordering on a single-core x86 meant you literally didn't need to worry about it.
>
> C++... wild west? Sounds about right :)
>
>
>> [...]
>
> Yes, the only difference a lock prefix makes is that it adds extra ordering guarantees between cores. Basically, the ordering guarantees you get on a single core are what you get between cores with locked instructions.

https://www.cl.cam.ac.uk/~pes20/weakmemory/x86tso-paper.tphols.pdf

Intel's initial efforts to specify the behaviour of the processors were not up to snuff, at least in the eyes of academic researchers.
December 05, 2022
claptrap, Max,

Keep in mind that D is not "portable assembly", and neither is C or C++. The druntime library follows (afaict) that logic everywhere.
The documentation of atomicFetchAdd could be clearer and explain "lock free", because that does not have a clear meaning in the context of the D language.

If you want the guarantee that the machine code contains specific instructions, you _have_ to write the assembly instructions yourself (which you can do in D, as in C and C++).
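For example, something like this untested sketch (DMD-style inline asm, x86_64 only; GDC uses GCC-style extended asm instead) pins down a LOCK XADD regardless of what core.atomic would otherwise lower to:

```d
// Untested sketch: returns the value previously stored at *p.
int lockedFetchAdd(shared(int)* p, int mod)
{
    asm nothrow @nogc
    {
        mov RCX, p          ;
        mov EAX, mod        ;
        lock                ;   // the locked instruction is now guaranteed to be emitted
        xadd [RCX], EAX     ;   // EAX ends up holding the previous value of *p
    }
    // DMD convention: the value left in EAX is the function's return value.
}
```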

regards,
  Johan
