January 12, 2021
On Monday, 11 January 2021 at 17:26:00 UTC, Arafel wrote:

> void f() {
>     assert(i == 0); // Expected
>     assert(j == 1); // Expected
>     assert(s.i == 0); // Expected
>     assert(s.j == 0); // Wait, what?
> }

At first sight this looks unexpected. But I think if you have a shared variable inside a struct it will not be TLS.
January 12, 2021
On 1/11/21 6:52 PM, Paul Backus wrote:
> On Monday, 11 January 2021 at 16:10:49 UTC, Steven Schveighoffer wrote:
>> There are some... odd rules.
>>
>> struct S
>> {
> [...]
>>    immutable int e = 5; // stored in data segment, not per instance!
> 
> Are you sure?
> 
> struct S
> {
>      immutable int n = 123;
>      this(int n) { this.n = n; }
> }
> 
> void main()
> {
>      S s1;
>      S s2 = 456;
>      assert(s1.n == 123);
>      assert(s2.n == 456);
> }

Yes, I was sure, but clearly wrong ;)

Hm... I guess it was changed! Deprecated in 2.065 [1] and finally changed in 2.067 [2]

Now you need static immutable to make it not part of the instance. Which makes sense (score 1 for consistency!)

-Steve

[1] https://dlang.org/changelog/2.065.0.html#staticfields2
[2] https://issues.dlang.org/show_bug.cgi?id=3449
January 12, 2021
On 1/11/21 12:26 PM, Arafel wrote:
> Thanks for the detailed explanation! I think this mixing of types and storage classes makes a very unfortunate combination:
> 
> ```
> import std;
> 
> int i = 0;
> shared int j = 0;
> 
> struct S {
>      int i = 0;
>      shared int j = 0;
> }
> 
> S s;
> 
> void main() {
>      i = 1;
>      j = 1;
>      s.i = 1;
>      s.j = 1;
>      spawn(&f);
> 
> }
> 
> void f() {
>      assert(i == 0); // Expected
>      assert(j == 1); // Expected
>      assert(s.i == 0); // Expected
>      assert(s.j == 0); // Wait, what?
> }
> ```
> 
> I agree that once you know the inner workings it makes sense, but a naïve approach might suggest that `s.j` would be... well, shared, just like `j`.

It's definitely confusing, if you don't know what shared means in all contexts.

shared as storage -> put in the shared globals
shared as type -> the thing can be shared between threads.

The second meaning does not mean it's automatically shared, just that it's shareable.

-Steve
January 12, 2021
On 1/11/21 8:49 PM, tsbockman wrote:
> On Monday, 11 January 2021 at 00:43:00 UTC, Tim wrote:
>> When MessageService calls the delegate for start, db is null. If I call start() in the Foo constructor it works just fine. Am I missing something here? Do delegates get called outside of their class context? I know I could just pass the db into start but I want to work out exactly why this is happening
> 
> The compiler and the physical CPU are both allowed to change the order in which instructions are executed to something different from what your code specifies, as long as the visible, "official" results and effects of the chosen order of execution are the same as those of your specified code, FROM THE PERSPECTIVE OF THE EXECUTING THREAD.
> 
> This is allowed so that the compiler can optimize to minimize negative "unofficial" effects such as the passage of time and memory consumption.
> 
> However, this re-ordering IS permitted to freely alter the behavior of your code from the perspective of OTHER threads. A likely cause of your bug is that the write to db by the constructor's thread is being committed to memory after the read of db by the MessageService thread.

I don't think this is valid.

1. the compiler MUST NOT reorder the storage of db to after you pass a delegate into an opaque function (array allocation).
2. The CPU is not going to reorder, because the memory allocation is going to take a global lock anyway (mutex locks should ensure memory consistency).

I think something weird is going on, but I don't know what.

> In order to RELIABLY fix this kind of problem, you must correctly use the only commands which the compiler and CPU are NOT allowed to reorder with respect to other threads, namely atomic operations, memory barriers and synchronization primitives.

I can't ever imagine creating a thread (which is likely what MessageService ctor is doing) to not have a consistent memory with the creating thread on construction. Why would the other thread not see the same memory when it didn't exist before? The CPU would have to go out of its way to make it inconsistent.

-Steve
January 12, 2021
On Tuesday, 12 January 2021 at 14:00:11 UTC, Steven Schveighoffer wrote:
> On 1/11/21 8:49 PM, tsbockman wrote:
>> However, this re-ordering IS permitted to freely alter the behavior of your code from the perspective of OTHER threads. A likely cause of your bug is that the write to db by the constructor's thread is being committed to memory after the read of db by the MessageService thread.
>
> I don't think this is valid.

You might be right, but your analysis below assumes the answers to a number of questions which aren't answered in the source code provided by the OP. Perhaps you are familiar with the implementations of the APIs in question, but I'm not and thought it unwise to assume too much, given that the whole reason we're having this discussion is that the code doesn't actually work...

Regardless, the simple way to find out if I'm on the right track or not is just to protect access to Foo's fields with a mutex and see if that fixes the problem. If it does, then either it's a memory ordering issue like I suggested (or a code gen bug), and the mutex can be replaced with something more efficient if necessary.

> 1. the compiler MUST NOT reorder the storage of db to after you pass a delegate into an opaque function (array allocation).

Is the function actually opaque, though? If the source code is available to the compiler for inlining (or maybe if it's marked `pure`?) then reordering is still allowed.

> 2. The CPU is not going to reorder, because the memory allocation is going to take a global lock anyway (mutex locks should ensure memory consistency).

This is not a safe assumption. It is quite easy to design a thread-safe allocator that does not take a global lock for every allocation, and indeed *necessary* if you want it to scale well to heavy loads on high core count systems.

Even if that's how it works today, I wouldn't write code that depends on this behavior, unless the language standard formally guaranteed it, because someone will change it sooner or later as core counts continue to climb.

> I can't ever imagine creating a thread (which is likely what MessageService ctor is doing) to not have a consistent memory with the creating thread on construction.

It seems reasonable to assume that thread creation includes a write barrier somewhere, but what if MessageService is using an existing thread pool?

> The CPU would have to go out of its way to make it inconsistent.

No, there are many levels of caching involved in the system, most of which are not shared by all cores. The CPU has to go out of its way to make memory appear consistent between cores, and this is expensive enough that it doesn't do so by default. That's why atomics and memory barriers exist, to tell the CPU to go out of its way to make things consistent.

You often don't have to deal with these issues directly when using higher-level multi-threading APIs, but that's because they try to include the appropriate atomics/barriers internally, not because the CPU has to "go out of its way" to make things inconsistent.
January 13, 2021
On Tuesday, 12 January 2021 at 01:49:11 UTC, tsbockman wrote:
> The compiler and the physical CPU are both allowed to change the order in which instructions are executed to something different from what your code specifies, as long as the visible, "official" results and effects of the chosen order of execution are the same as those of your specified code, FROM THE PERSPECTIVE OF THE EXECUTING THREAD.
>
> This is allowed so that the compiler can optimize to minimize negative "unofficial" effects such as the passage of time and memory consumption.
>
> However, this re-ordering IS permitted to freely alter the behavior of your code from the perspective of OTHER threads. A likely cause of your bug is that the write to db by the constructor's thread is being committed to memory after the read of db by the MessageService thread.
>
> In order to RELIABLY fix this kind of problem, you must correctly use the only commands which the compiler and CPU are NOT allowed to reorder with respect to other threads, namely atomic operations, memory barriers and synchronization primitives. A wide selection of these tools may be found in these D runtime library modules:
>
>     core.sync: http://dpldocs.info/experimental-docs/core.sync.html
>     core.atomic: http://dpldocs.info/experimental-docs/core.atomic.html
>     core.thread: http://dpldocs.info/experimental-docs/core.thread.html
>
> (I recommend using Adam D. Ruppe's unofficial but superior rendering of the D runtime documentation at dpldocs.info rather than the official dlang.org rendering, as I found some necessary pieces of the documentation are just mysteriously missing from the offical version.)
>
> Be warned that most models of multi-threaded programming are difficult to implement correctly, as opposed to ALMOST correctly with subtle heisen-bugs. You should either stick to one of the known simple models like immutable message passing with GC, or do some studying before writing too much code.
>
> Here are some resources which I have found very helpful in learning to understand this topic, and to avoid its pitfalls:
>
>     Short educational game: https://deadlockempire.github.io/
>     Tech talk by C++ expert Herb Sutter (D's core.atomic uses the C++ memory model):
>         https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2
>         https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2
>
> If you want to seriously dig into this, I suggest reviewing some or all of the content at the links above. If you're still confused about how to apply it in D, feel free to come back and ask for examples or code reviews. I'd rather not start with examples, though, because if you don't understand the rules and principles behind them, it's really easy to unknowingly introduce bugs into otherwise correct examples with seemingly innocent changes.

Fantastic response, thank you! I did some more digging and properly narrowed down where the issue is and created a test script that demonstrates the problem. Let me know what you think and if it could still be a similar problem to what you have stated above. I'll still read that info you sent to sharpen up on these concepts.

Basically, the program calls a function which modifies a document in the database. If it is called form it's own class' constructor, it works fine. If it is called by a thread, it never returns. I don't think that a member variable is going null or anything. But a strange problem that I can't seem to debug. The output is at the bottom.

----------------------------------------------------------------

import vibe.db.mongo.mongo;
import core.thread;
import std.stdio;

void main(){
    auto callable = new Callable();

    while(true){}
}

class Caller : Thread{
    void delegate() mFunc;

    this(void delegate() func){
        mFunc = func;
        super(&loop);
        start();
    }

    void loop(){
        while(true){
            mFunc();
        }
    }
}

class Callable{
    MongoClient db;
    Caller caller;

    this(){
        db = connectMongoDB("127.0.0.1");
        foo();
        caller = new Caller(&foo);
    }

    ~this(){
	db.cleanupConnections();
    }

    void foo(){
        writeln("Started");
        auto result = db.getCollection("test.collection").findAndModify([
	    "state": "running"],
	    ["$set": ["state": "stopped"]
	]);
        writeln(result);
        writeln("Finished");
    }
}

----------------------------------------------------------------

Output:
    Started
    {"_id":"5ff6705e21e91678c737533f","state":"running","knowledge":true}
    Finished
    Started
January 13, 2021
On Wednesday, 13 January 2021 at 02:15:49 UTC, Tim wrote:
> Basically, the program calls a function which modifies a document in the database. If it is called form it's own class' constructor, it works fine. If it is called by a thread, it never returns.
> ...
> class Caller : Thread{
>     void delegate() mFunc;
>
>     this(void delegate() func){
>         mFunc = func;
>         super(&loop);
>         start();
>     }
>
>     void loop(){
>         while(true){
>             mFunc();
>         }
>     }
> }
>
> class Callable{
>     MongoClient db;
>     Caller caller;
>
>     this(){
>         db = connectMongoDB("127.0.0.1");
>         foo();
>         caller = new Caller(&foo);
>     }
>
>     ~this(){
> 	db.cleanupConnections();
>     }
>
>     void foo(){
>         writeln("Started");
>         auto result = db.getCollection("test.collection").findAndModify([
> 	    "state": "running"],
> 	    ["$set": ["state": "stopped"]
> 	]);
>         writeln(result);
>         writeln("Finished");
>     }
> }

Note that if you are trying to debug a crash or hang of a program by printing messages to the console, you need to flush stdout multiple times in the vicinity of the problem, otherwise stdio's buffering may make it appear as though the program crashed or hung significantly earlier than it really did. (This is not a merely theoretical problem; I trip over it frequently myself.)

Anyway, I think your real problem is that MongoClient is not thread-safe. From the official vibe.d documentation (https://vibed.org/api/vibe.db.mongo.mongo/connectMongoDB):

> Note that the returned MongoClient uses a vibe.core.connectionpool.ConnectionPool internally to create and reuse connections as necessary. Thus, the MongoClient instance can - and should - be shared among all fibers in a thread by storing in in a thread local variable.

Note the "in a thread" part; you may only use a connection from the same thread that opened it.

(Why? I'm not familiar with vibe.d's API or code base, so I don't really know. But, I'd guess that the connection reuse mechanism mentioned in the docs requires some of the information that you might expect to be stored in the MongoClient instance itself to instead end up in thread-local storage (whether native or emulated). Or, there may simply be a manual "same thread" check built into the DB operations to prevent data races, and the error message isn't reaching you for some reason. `assert` messages don't print in release mode, and I've found the gdb debugger for D quite unreliable when trying to inspect multi-threaded code.)

Try moving the calls to `connectMongoDB` and `cleanupConnections` into the same thread as `foo`. (I can't think of a good reason to be doing these in the original thread, other than convenience.) If you want to loop in multiple threads simultaneously, just open a separate connection per thread, like the vibe.d docs suggest.
January 13, 2021
On 13/1/21 3:15, Tim wrote:
> 
> Fantastic response, thank you! I did some more digging and properly narrowed down where the issue is and created a test script that demonstrates the problem. Let me know what you think and if it could still be a similar problem to what you have stated above. I'll still read that info you sent to sharpen up on these concepts.
> 
> Basically, the program calls a function which modifies a document in the database. If it is called form it's own class' constructor, it works fine. If it is called by a thread, it never returns. I don't think that a member variable is going null or anything. But a strange problem that I can't seem to debug. The output is at the bottom.
> 
> ----------------------------------------------------------------
> 
> import vibe.db.mongo.mongo;
> import core.thread;
> import std.stdio;
> 
> void main(){
>      auto callable = new Callable();
> 
>      while(true){}
> }
> 
> class Caller : Thread{
>      void delegate() mFunc;
> 
>      this(void delegate() func){
>          mFunc = func;
>          super(&loop);
>          start();
>      }
> 
>      void loop(){
>          while(true){
>              mFunc();
>          }
>      }
> }
> 
> class Callable{
>      MongoClient db;
>      Caller caller;
> 
>      this(){
>          db = connectMongoDB("127.0.0.1");
>          foo();
>          caller = new Caller(&foo);
>      }
> 
>      ~this(){
>      db.cleanupConnections();
>      }
> 
>      void foo(){
>          writeln("Started");
>          auto result = db.getCollection("test.collection").findAndModify([
>          "state": "running"],
>          ["$set": ["state": "stopped"]
>      ]);
>          writeln(result);
>          writeln("Finished");
>      }
> }
> 
> ----------------------------------------------------------------
> 
> Output:
>      Started
> {"_id":"5ff6705e21e91678c737533f","state":"running","knowledge":true}
>      Finished
>      Started

Something that you could try for debugging is to add a try / catch block around your call to `db.getCollection` (and printing out the exception details). IIRC, if a thread throws, it will just end without printing anything until the thread is joined, when the exception will be rethrown [1].

The program hanging would be then the main thread waiting. This kind of problem has already bitten me more than once...

[1]: https://dlang.org/library/core/thread/osthread/thread.join.html
1 2
Next ›   Last »