handling shared objects
Nov 26, 2018
Alex
November 26, 2018
Hi all!
Can somebody explain to me why the example below does not work the way I expect it to?

My example is a little longer this time; however, half of it is taken from
https://dlang.org/library/std/concurrency/receive_only.html

```
import std.experimental.all;

struct D
{
	size_t d;
	static S s;
}
struct S
{
	D[] data;
}

struct Model
{
	auto ref s()
	{
		return D.s;
	}

	void run()
	{
		"I'm running".writeln;
		writeln(s.data.length);
	}
}

Model m;

void main()
{
	D.s.data.length = 4;
	m.run; //4

	auto childTid = spawn(&runner, thisTid);
	send(childTid, 0);
	receiveOnly!bool;
}

static void runner(Tid ownerTid)
{
	receive((size_t dummy){
		import core.thread : Thread;

		m.run;
		// Send a message back to the owner thread
		// indicating success.
		send(ownerTid, true);
	});
}
```

The idea is:
the model is something that I can declare deliberately in the application. I assumed that if it is (globally) shared, then so are all of its compartments, even if they are not explicitly part of the model.

Some problems arose:
1. Obviously, this is not the case, as the output differs depending on the thread in which I start the model function.
2. If I declare the model object inside main, the compiler aborts with the message "Aliases to mutable thread-local data not allowed."
3. If I mark the S instance as shared, it works. But I didn't intend to do this... Is this really how it is meant to be?

As I'm writing the model object as well as all of its compartments, I can do almost everything... but what I want to avoid is declaring the instances of the compartments inside the model:
They are stored locally to their modules, and their individual elements have to know about the compound objects, as with the D and S structs shown.
November 26, 2018
On 11/26/18 9:00 AM, Alex wrote:
> Some problems arose:
> 1. Obviously, this is not the case, as the output is different, depending on the thread I start the model function.

Yes, unless you declare the model to be shared, there is a copy made for each thread, independently managed.

> 2. If I declare the model object inside the main, the compiler aborts with the message "Aliases to mutable thread-local data not allowed."

Right, because you are not allowed to pass unshared data between threads.

> 3. If I mark the S instance as shared, it works. But I didn't intend to do this... Is this really how it is meant to be?

Let's go over how the data is actually laid out:

Model has NO data in it, so it doesn't really matter if it's shared or not.

S has a single array with element type D. There is no static data in S, so it has no static state (only instance state).

D has a single size_t, which is thread-local, but has a static instance of S in the TYPE. There is not a copy of an S for each D, just a single copy for each THREAD. If you make this shared, it's a shared copy for all threads. This means the array inside the shared S will be shared between all threads too.

What happens when you spawn a new thread is that the thread-local copy of m is created with Model.init (but it has no data, so it's not important). A thread-local copy of D.s is created with S.init (so an empty array). The reason your assignment of length in main has no effect in the new thread is that the init value is used, not the current value from the main thread.

So yes, you need to make it shared to have the sub-thread see the changes, if that's what you are after.
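
To make the difference concrete, here is a minimal sketch; it is not code from the original example. The `Counter` struct and the `worker` and `observeFromWorker` functions are hypothetical names chosen for illustration, and the shared member is accessed through `core.atomic`. It sets both a thread-local static and a `shared` static in the owner thread, then asks a spawned thread what it sees.

```d
import core.atomic : atomicLoad, atomicStore;
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : writeln;

// Hypothetical type for illustration only.
struct Counter
{
    static size_t perThread;            // one copy per thread, starts at size_t.init
    shared static size_t acrossThreads; // one copy for the whole process
}

// Runs in the spawned thread and reports what that thread sees.
void worker(Tid owner)
{
    send(owner, Counter.perThread, atomicLoad(Counter.acrossThreads));
}

// Writes both statics in the calling thread, then asks a new thread for its view.
auto observeFromWorker()
{
    Counter.perThread = 4;
    atomicStore(Counter.acrossThreads, size_t(4));
    spawn(&worker, thisTid);
    return receiveOnly!(size_t, size_t)();
}

void main()
{
    auto seen = observeFromWorker();
    writeln(seen[0]); // 0: the worker got its own copy, initialized with .init
    writeln(seen[1]); // 4: the shared static is the same object in both threads
}
```

The thread-local member behaves like `D.s` in the original example: the write in the owner thread is invisible to the spawned thread.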

> As I'm writing the model object as well as all of its compartments, I can do almost everything... but what I want to avoid is declaring the instances of the compartments inside the model:
> They are stored locally to their modules, and their individual elements have to know about the compound objects, as with the D and S structs shown.

A static member is stored per thread. If you want a global that's shared between all threads, you need to make it shared. But the result may not be what you are looking for; shared can cause difficulty if your code wasn't written to deal with it (and a lot of code isn't).

-Steve
November 26, 2018
On Monday, 26 November 2018 at 14:28:33 UTC, Steven Schveighoffer wrote:
>> Some problems arose:
>> 1. Obviously, this is not the case, as the output is different, depending on the thread I start the model function.
>
> Yes, unless you declare the model to be shared, there is a copy made for each thread, independently managed.
>
>> 2. If I declare the model object inside the main, the compiler aborts with the message "Aliases to mutable thread-local data not allowed."
>
> Right, because you are not allowed to pass unshared data between threads.
>
>> 3. If I mark the S instance as shared, it works. But I didn't intend to do this... Is this really how it is meant to be?
>
> Let's go over how the data is actually laid out:
>
> Model has NO data in it, so it doesn't really matter if it's shared or not.
>
> S has a single array of element type D's. There is no static data in S, so it has no static state (only instance state).
>
> D has a single size_t, which is thread-local,

More specifically, the size_t is local to an instance of D.

> but has a static instance of S in the TYPE.

Right, because to work properly, a D instance has to know about the other D's in the array of S.

> There is not a copy of an S for each D, just a single copy for each THREAD. If you make this shared, it's a shared copy for all threads. This means the array inside the shared S will be shared between all threads too.

This is the point where my headaches begin: I do not need this level of sharedness, but I don't really care.

>
> What happens when you spawn a new thread is that the thread-local copy of m is created with Model.init (but it has no data, so it's not important). A thread-local copy of D.s is created with S.init (so an empty array). The reason your assignment of length in main doesn't work is because the init value is used, not the current value from the main thread.
>
> So yes, you need to make it shared to have the sub-thread see the changes, if that's what you are after.
>
>> As I'm writing the model object as well as all of its compartments, I can do almost everything... but what I want to avoid is declaring the instances of the compartments inside the model:
>> They are stored locally to their modules, and their individual elements have to know about the compound objects, as with the D and S structs shown.
>
> A static member is stored per thread. If you want a global that's shared between all threads, you need to make it shared. But the result may not be what you are looking for, shared can cause difficulty if your code wasn't written to deal with it (and a lot of code isn't).

Well, the only reason I use multithreading is this:
https://forum.dlang.org/thread/cfrtilrtbahollmazzfv@forum.dlang.org

So, even if my code is not really designed for sharing, this doesn't matter, as I wait for "the other" thread to end (or interrupt it). So, marking the model as shared is already a workaround for being able to pass it to another thread, which I don't really need. However, if all components of the model now have to be marked shared as well, the workaround has to grow and expand over all components (?). This is the reason for this question...

>
> -Steve

November 26, 2018
On 11/26/18 10:16 AM, Alex wrote:
> On Monday, 26 November 2018 at 14:28:33 UTC, Steven Schveighoffer wrote:

>> A static member is stored per thread. If you want a global that's shared between all threads, you need to make it shared. But the result may not be what you are looking for, shared can cause difficulty if your code wasn't written to deal with it (and a lot of code isn't).
> 
> Well, the only reason I use multithreading is this:
> https://forum.dlang.org/thread/cfrtilrtbahollmazzfv@forum.dlang.org
> 
> So, even if my code is not really shared designed, this doesn't matter, as I wait for "the other" thread to end (or interrupt it). So, marking the model as shared is already a workaround, for being able to pass it to another thread, which I don't really need. However, now, if also all components of the model have to be marked shared, the workaround has to grow and expands over all components (?). This is the reason for this question...
> 

Well, if you want to run calculations in another thread, then send the result back to the original, you may be better off sending the state needed for the calculation to the worker thread, and receiving the result back via the messaging system. It's really hard to know the requirements with such toy examples, so maybe that's not workable for you.

What it seems like you need is a way to run the calculations in a separate thread. But with multiple threads comes all the dangers of concurrency and races. So you have to be very careful about how you design this.

At this point, std.concurrency does not have the ability to safely pass mutable data to another thread without it being shared.

Note that if you want to do it without safety in place, you can use the Thread class in core.thread which has no requirements for data to be immutable or shared. But you have to be even more careful about how you access the data.
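
As a rough sketch of that route (an illustration under assumptions, not a recommendation): the function name `squaresViaRawThread` is hypothetical. It hands an ordinary, unshared array to a `core.thread.Thread` through a delegate. Nothing in the type system checks this, so the `join` before reading the array again is the only thing preventing a data race.

```d
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical helper: fills an unshared array from a second thread.
size_t[] squaresViaRawThread(size_t n)
{
    auto data = new size_t[](n); // plain mutable data, neither shared nor immutable

    // The delegate captures `data` directly; no aliasing checks are applied here.
    auto t = new Thread({
        foreach (i, ref x; data)
            x = i * i;
    });
    t.start();
    t.join(); // wait for the thread before touching `data` again

    return data;
}

void main()
{
    writeln(squaresViaRawThread(4)); // [0, 1, 4, 9]
}
```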

-Steve
November 26, 2018
On Monday, 26 November 2018 at 15:26:43 UTC, Steven Schveighoffer wrote:
>
> Well, if you want to run calculations in another thread, then send the result back to the original, you may be better off sending the state needed for the calculation to the worker thread, and receiving the result back via the messaging system.

How would I do this if parts of the state are stored statically in a type?

> Note that if you want to do it without safety in place, you can use the Thread class in core.thread which has no requirements for data to be immutable or shared. But you have to be even more careful about how you access the data.
>

Ah... ok. In that case, I would prefer to mark the appropriate parts as shared, I think...
November 26, 2018
On 11/26/18 10:37 AM, Alex wrote:
> On Monday, 26 November 2018 at 15:26:43 UTC, Steven Schveighoffer wrote:
>>
>> Well, if you want to run calculations in another thread, then send the result back to the original, you may be better off sending the state needed for the calculation to the worker thread, and receiving the result back via the messaging system.
> 
> How to do this, if parts of the state are statically saved in a type?

For instance, with your toy example, instead of saving the D[] as a static instance to share with all threads, use idup to make a complete copy, and then send that array directly to the new thread via spawn. When the result is done, instead of sending a bool to say it's complete, send the answer.

Sending an immutable copy is the easiest way to ensure you have no races. It may be more expensive than you want to make a deep copy of something, but probably less expensive than the headache of creating a non-debuggable monster race condition.
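
A minimal sketch of this pattern follows, assuming a simplified `D` without the static member and a made-up calculation (summing the `d` fields); `worker` and `sumInWorker` are hypothetical names. The owner takes an immutable snapshot with `idup`, passes it as an argument to `spawn`, and receives the answer instead of a bool.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : writeln;

// Simplified version of the struct from the example (no static member).
struct D
{
    size_t d;
}

// Hypothetical calculation: sum the d fields of the snapshot.
void worker(Tid owner, immutable(D)[] snapshot)
{
    size_t total = 0;
    foreach (e; snapshot)
        total += e.d;
    send(owner, total); // send the answer back instead of a bool
}

size_t sumInWorker(D[] data)
{
    // idup makes an immutable copy, which may be passed to another thread.
    immutable(D)[] snapshot = data.idup;
    spawn(&worker, thisTid, snapshot);
    return receiveOnly!size_t();
}

void main()
{
    auto data = [D(1), D(2), D(3)];
    writeln(sumInWorker(data)); // 6
}
```

Because the worker only ever sees the immutable snapshot, later changes to `data` in the owner thread cannot race with the calculation.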

> 
>> Note that if you want to do it without safety in place, you can use the Thread class in core.thread which has no requirements for data to be immutable or shared. But you have to be even more careful about how you access the data.
>>
> 
> Ah... ok. But then, I will prefer to mark the appropriate parts as shared, I think...

Right :)

Threading is always very difficult to get right, and usually very difficult to find errors when you get it wrong. I remember working with pthreads about 20 years ago in a C++ project, and having a data race that caused a hang once every *2 weeks*. It took insane amounts of printouts and logging to figure out exactly why it happened (and the cycle was 2 weeks roughly), and the cause was (I think, not 100% sure) a place where a lock should have been but wasn't used.

-Steve
November 26, 2018
On Monday, 26 November 2018 at 16:27:23 UTC, Steven Schveighoffer wrote:
> On 11/26/18 10:37 AM, Alex wrote:
>> On Monday, 26 November 2018 at 15:26:43 UTC, Steven Schveighoffer wrote:
>>>
>>> Well, if you want to run calculations in another thread, then send the result back to the original, you may be better off sending the state needed for the calculation to the worker thread, and receiving the result back via the messaging system.
>> 
>> How to do this, if parts of the state are statically saved in a type?
>
> For instance, with your toy example, instead of saving the D[] as a static instance to share with all threads, use idup to make a complete copy, and then send that array directly to the new thread via spawn. When the result is done, instead of sending a bool to say it's complete, send the answer.
>
> Sending an immutable copy is the easiest way to ensure you have no races. It may be more expensive than you want to make a deep copy of something, but probably less expensive than the headache of creating a non-debuggable monster race condition.
>
Yeah... the problem is:
the D[] array is stored statically not because of threads, but because every element of it has to have access to it. So not storing it statically is exactly what I want to avoid.

But the idea is clear now, I think: I should delay the array expansion until the object has been transferred to the other thread. Then I expand the whole thing there (statically, as I would like to), do my calculations, and send back the results, as you proposed.
This way, the object to transfer needs only a simple copy, because everything that would need a deep copy is created in the proper thread, after the transfer.
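
A rough sketch of this plan, using the D and S structs from the first post: the calculation (summing indices) and the names `worker` and `runInWorker` are assumptions made for illustration. Only the size is sent; the worker expands its own thread-local `D.s` and sends back the result.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : writeln;

struct D
{
    size_t d;
    static S s; // thread-local: every thread has its own D.s
}

struct S
{
    D[] data;
}

// Hypothetical worker: builds the static state in its own thread.
void worker(Tid owner, size_t count)
{
    D.s.data.length = count; // expands the worker's copy of D.s only
    foreach (i, ref e; D.s.data)
        e.d = i;

    size_t total = 0;
    foreach (e; D.s.data)
        total += e.d;
    send(owner, total); // send the result back to the owner
}

size_t runInWorker(size_t count)
{
    spawn(&worker, thisTid, count); // only a plain size_t crosses the thread boundary
    return receiveOnly!size_t();
}

void main()
{
    writeln(runInWorker(4));  // 6, i.e. 0 + 1 + 2 + 3 computed in the worker
    writeln(D.s.data.length); // 0, the owner's own D.s was never expanded
}
```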

Thanks :)