LDC optimized builds causing unexpected behaviour when sharing variable between threads

LDC optimized builds causing unexpected behaviour when sharing variable between threads
Jun 01, 2022 Keivan Shah
Jun 01, 2022 Johan
Jun 01, 2022 Keivan Shah
Jun 01, 2022 Kagamin
Jun 02, 2022 Keivan Shah
Jun 01, 2022 kinke
Jun 02, 2022 Keivan Shah
Jun 02, 2022 Johan
Jun 02, 2022 Keivan Shah
Jun 06, 2022 rm
Jun 15, 2022 Kagamin
Jun 04, 2022 David Nadlinger
Jun 05, 2022 Johan

June 01, 2022

Posted by Keivan Shah

Permalink

Keivan Shah

Permalink

Hello,
I am encountering a strange bug when making optimized build with LDC. Can someone please help me debug it.

void main()
{
    import std.stdio : writeln;
    import core.thread;

    // Runinng this code on LDC with `-O` leads to infinite loop

    writeln("Starting...");
    bool done = false; // Making this `shared` also doesn't work
    new Thread({ done = true; }).start();
    while (!done)
    {
    } // Wait for done
    writeln("Done!");
}

The program ends up in an infinite loop if -O is passed to LDC and works normally without it. Also works fine in both cases with dmd. I thought this was due to the variable done being shared between threads, so I tried marking the variable as shared but even that does not solve the problem. Also tried making the variable global and using __gshared doesn't seem to work as well. Finally I was able to solve the problem by using core.volatile, but that doesn't look like a very great solution. So I wanted to understand the main problem here and make sure and also verify that this is not a LDC bug.

June 01, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by Johan
in reply to Keivan Shah

Permalink

Johan

Posted in reply to Keivan Shah

Permalink

On Wednesday, 1 June 2022 at 07:02:08 UTC, Keivan Shah wrote:

Hello,
I am encountering a strange bug when making optimized build with LDC. Can someone please help me debug it.

void main()
{
    import std.stdio : writeln;
    import core.thread;

    // Runinng this code on LDC with `-O` leads to infinite loop

    writeln("Starting...");
    bool done = false; // Making this `shared` also doesn't work
    new Thread({ done = true; }).start();
    while (!done)
    {
    } // Wait for done
    writeln("Done!");
}

This is not a bug.
The solution with volatileLoad works, but I think it technically isn't quite correct; documentation says: "They should not be used for communication between threads.".

There are many articles online about this kind of conditional loop on a variable that is modified by another thread. For example: https://stackoverflow.com/questions/10091112/using-a-global-variable-to-control-a-while-loop-in-a-separate-thread and https://devblogs.microsoft.com/oldnewthing/20180924-00/?p=99805

-Johan

June 01, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by Keivan Shah
in reply to Johan

Permalink

Keivan Shah

Posted in reply to Johan

Permalink

On Wednesday, 1 June 2022 at 08:23:18 UTC, Johan wrote:

This is not a bug.
The solution with volatileLoad works, but I think it technically isn't quite correct; documentation says: "They should not be used for communication between threads.".

-Johan

Hey Johan,
Thanks for the quick response. As you mentioned even I don't feel the solution with volatileLoad is correct and so was looking for something else. The documentation stated that

Global data that is meant to be shared among multiple threads should be marked with the shared keyword

So to me it seemed like marking the variable as shared should have made it accessible as well as it takes care of the memory ordering. That plus the fact that the snippet works in dmd but not ldc made me think it maybe was a bug. Anyways, have gotten a better understanding of the issue now, probably will use core.atomic.atomicFence to ensure the memoryOrdering.

Thanks and Regards,
Keivan Shah.

June 01, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by Kagamin
in reply to Keivan Shah

Permalink

Kagamin

Posted in reply to Keivan Shah

Permalink

On Wednesday, 1 June 2022 at 13:35:45 UTC, Keivan Shah wrote:

So to me it seemed like marking the variable as shared should have made it accessible as well as it takes care of the memory ordering.

No, shared is a low level primitive. If you want one size fits all concurrency, use mutex, synchronized statement or std.concurrency.

June 01, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by kinke
in reply to Keivan Shah

Permalink

kinke

Posted in reply to Keivan Shah

Permalink

On Wednesday, 1 June 2022 at 13:35:45 UTC, Keivan Shah wrote:

probably will use core.atomic.atomicFence to ensure the memoryOrdering.

core.atomic.atomic{Load,Store} handle this:

void main()
{
    import std.stdio : writeln;
    import core.thread;
    import core.atomic;

    writeln("Starting...");
    shared bool done = false;
    new Thread({ atomicStore(done, true); }).start();
    while (!atomicLoad(done))
    {
    } // Wait for done
    writeln("Done!");
}
``

June 02, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by Keivan Shah
in reply to kinke

Permalink

Keivan Shah

Posted in reply to kinke

Permalink

On Wednesday, 1 June 2022 at 16:46:51 UTC, kinke wrote:

On Wednesday, 1 June 2022 at 13:35:45 UTC, Keivan Shah wrote:

probably will use core.atomic.atomicFence to ensure the memoryOrdering.

core.atomic.atomic{Load,Store} handle this:

void main()
{
    import std.stdio : writeln;
    import core.thread;
    import core.atomic;

    writeln("Starting...");
    shared bool done = false;
    new Thread({ atomicStore(done, true); }).start();
    while (!atomicLoad(done))
    {
    } // Wait for done
    writeln("Done!");
}
``

Hey kinke,
Thanks for the snippet, I have some existing code that follows the pattern similar to the example I had given. There are a lot of variables that are shared between the 2 threads and so instead of changing the reads and writes for every variable to be atomic{Store,Load}, I was thinking if atomicFence might achieve the same effect with far lesser code change requirements. Would that be recommended or should I use atomic{Store,Load} instead?

Regards and Thanks,
Keivan Shah

June 02, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by Keivan Shah
in reply to Kagamin

Permalink

Keivan Shah

Posted in reply to Kagamin

Permalink

On Wednesday, 1 June 2022 at 15:31:40 UTC, Kagamin wrote:

On Wednesday, 1 June 2022 at 13:35:45 UTC, Keivan Shah wrote:

So to me it seemed like marking the variable as shared should have made it accessible as well as it takes care of the memory ordering.

No, shared is a low level primitive. If you want one size fits all concurrency, use mutex, synchronized statement or std.concurrency.

Hey Kagamin,

Thanks, starting to understand that a little now. Seems like that I was relying on some incorrect assumptions and will need to make some overall logic changes. Will look into using synchronized, but seems like my use-case might be fine with core.atomic for now since need a fast lightweight way for data sharing for a few shared variables.

Regards and Thanks,
Keivan Shah.

June 02, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by Johan
in reply to Keivan Shah

Permalink

Johan

Posted in reply to Keivan Shah

Permalink

On Thursday, 2 June 2022 at 04:33:39 UTC, Keivan Shah wrote:

On Wednesday, 1 June 2022 at 16:46:51 UTC, kinke wrote:

On Wednesday, 1 June 2022 at 13:35:45 UTC, Keivan Shah wrote:

probably will use core.atomic.atomicFence to ensure the memoryOrdering.

core.atomic.atomic{Load,Store} handle this:

void main()
{
    import std.stdio : writeln;
    import core.thread;
    import core.atomic;

    writeln("Starting...");
    shared bool done = false;
    new Thread({ atomicStore(done, true); }).start();
    while (!atomicLoad(done))
    {
    } // Wait for done
    writeln("Done!");
}
``

atomicFence only works for ordering memory operations on a single thread, not ordering between threads.

You can either use a mutex (to coordinate between threads), or need all store/loads to be atomic with atomic{Store,Load}.
I am surprised that there is no core.atomic!int to simplify your life. Perhaps you should make a feature request :)

-Johan

June 02, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by Keivan Shah
in reply to Johan

Permalink

Keivan Shah

Posted in reply to Johan

Permalink

On Thursday, 2 June 2022 at 11:56:07 UTC, Johan wrote:

atomicFence only works for ordering memory operations on a single thread, not ordering between threads.

-Johan

Hey Johan,
Thanks for the reply, I finally ended up using atomic{Store,Load} with some memory ordering after reading about them a bit. Atomic variables (core.atomic!int) would have been a great feature, I have no further usecases for them and they probably are a bit more complex for my understanding to make a informed request right now, but will definitely make one when my understanding improves in the future.

Thanks and regards,
Keivan Shah.

June 04, 2022

Re: LDC optimized builds causing unexpected behaviour when sharing variable between threads

Posted by David Nadlinger
in reply to Johan

Permalink

David Nadlinger

Posted in reply to Johan

Attachments:

text/html part

Permalink

On 2 Jun 2022, at 7:56, Johan via digitalmars-d-ldc wrote:
> `atomicFence` only works for ordering memory operations on a single thread, not ordering between threads.

`atomicFence` should work across threads as well; it inserts actual processor fences as necessary, in addition to the requisite effect on compiler optimisations.

However, in practice, code using standalone fences often harder to make sense of than that using atomic loads/stores, unless the relevant accesses are explicitly documented in enough detail. Just sprinkling `atomicFence` over an existing code base to make it thread-safe is pretty much guaranteed to fail. Thus, preferring atomic loads/stores is very sound advice in practice.

(There are also fences that are effectively for compiler optimisations only and usually go by volatile fences or signal fences; perhaps that’s what Johan was thinking of.)

  — David

Top | Forum index | About this forum

Forums