Socket and spawn()
May 31

I'm coding a server which takes TCP connections. I end up in the main thread with .accept() which hands me a Socket. I'd like to hand this off to a spawn()'ed thread to do the actual work.

     Aliases to mutable thread-local data not allowed.

Is there some standard way to get something which isn't in TLS? Or do I have to drop back to file descriptors and do my own socket handling?

TIA,
Andy

May 31
On Friday, May 31, 2024 10:07:23 AM MDT Andy Valencia via Digitalmars-d-learn wrote:
> I'm coding a server which takes TCP connections.  I end up in the
> main thread with .accept() which hands me a Socket.  I'd like to
> hand this off to a spawn()'ed thread to do the actual work.
>
>      Aliases to mutable thread-local data not allowed.
>
> Is there some standard way to get something which _isn't_ in TLS?
>   Or do I have to drop back to file descriptors and do my own
> socket handling?
>
> TIA,
> Andy

Strictly speaking, unless you're dealing with a module or static-level variable, the object is not in TLS. It's treated as thread-local by the type system, and the type system will assume that no other thread has access to it, but you can freely cast it to shared or immutable and pass it across threads. It's just that it's up to you to make sure that you don't have a thread-local reference to shared data that isn't protected in a fashion that accessing the thread-local references is guaranteed to be thread-safe (e.g. the appropriate mutex has been locked).

So, if you're just passing it to another thread, and that other thread is all that uses the object, then you would temporarily cast it to shared or immutable, give it to the other thread, and then that thread would cast it back to thread-local to use it, and the original thread would have nothing to do with it any longer.
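A minimal sketch of that hand-off, not anyone's production code: `socketPair()` stands in for the `Socket` that `.accept()` would return, and `worker` and `handOff` are hypothetical names made up for the illustration.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;
import std.socket : Socket, socketPair;

// Runs in the spawned thread. The parameter is typed shared only so that
// spawn() accepts it; after the cast this thread is the sole user.
void worker(shared Socket handedOver)
{
    auto sock = cast(Socket) handedOver; // back to thread-local
    scope (exit) sock.close();

    ubyte[64] buf;
    auto n = sock.receive(buf[]);
    string msg = n > 0 ? cast(string) buf[0 .. n].idup : "";
    ownerTid.send(msg); // report what was read to the spawning thread
}

// Hypothetical helper: gives one end of a socket pair to a new thread and
// returns whatever that thread read from it.
string handOff(string payload)
{
    Socket[2] pair = socketPair(); // assumption: stand-in for listener.accept()
    scope (exit) pair[1].close();

    spawn(&worker, cast(shared) pair[0]);
    // From here on, this thread must not touch pair[0] again.
    pair[1].send(payload);
    return receiveOnly!string();
}

void main()
{
    assert(handOff("hello") == "hello");
}
```

The cast to shared exists purely to get the reference past spawn()'s check; the safety comes from the original thread dropping its reference.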

On the other hand, if you're actively sharing an object across threads, then you cast it to shared and give it to the other thread. But then you have to use an appropriate thread-synchronization mechanism (likely a mutex in the case of a socket) to ensure that accessing the object is thread-safe.

So, typically, you would lock a mutex to ensure that no other thread is accessing the object, and then you temporarily cast away shared while the object is protected by the mutex so that you can do whatever you need to do with the object, and once you're ready to release the mutex, you make sure that no thread-local references to the object remain before releasing the mutex. So, any code actually operating on the object would do so while it's typed as thread-local, and the compiler would complain if you accidentally accessed it through the shared reference to the data (though the compiler doesn't currently catch all such cases - the -preview=nosharedaccess switch turns on more of the checks, and that's supposed to become the default eventually, but it hasn't yet).
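A rough sketch of the lock-then-cast idiom, assuming a made-up `Stats` class and hypothetical helpers `recordConnection` and `countConnections`; it is written for the default compiler settings rather than -preview=nosharedaccess.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical piece of state that several threads update.
class Stats
{
    int connections;
}

// Safe only because every access to `stats` goes through `mtx`.
void recordConnection(shared Stats stats, shared Mutex mtx)
{
    mtx.lock();
    scope (exit) mtx.unlock();

    // While the mutex is held, view the object as thread-local.
    // `local` must not outlive this scope.
    Stats local = cast(Stats) stats;
    local.connections++;
}

// Starts nThreads threads that each record perThread connections.
int countConnections(int nThreads, int perThread)
{
    auto stats = cast(shared) new Stats; // built thread-local, then shared
    auto mtx = new shared Mutex();

    Thread[] workers;
    foreach (i; 0 .. nThreads)
    {
        auto t = new Thread({
            foreach (j; 0 .. perThread)
                recordConnection(stats, mtx);
        });
        t.start();
        workers ~= t;
    }
    foreach (t; workers)
        t.join();

    // All workers have finished, so nothing else can touch the object.
    return (cast(Stats) stats).connections;
}

void main()
{
    assert(countConnections(4, 1000) == 4000);
}
```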

In any case, you can freely cast between thread-local and shared. It's just that you need to be sure that when you do that, you're not violating the type system by having a thread-local reference to shared data access that shared data without first protecting it with a mutex. And that typically means not having any thread-local references to the shared data except when the mutex is locked. But if all you're doing is passing an object across threads, then it's pretty straightforward, since you just cast it to shared or immutable to be able to pass it across threads and then cast back to thread-local on the other side to use it as normal again (you just need to make sure that you don't leave a reference to the data on the original thread when you do that).

- Jonathan M Davis



May 31

On Friday, 31 May 2024 at 16:07:23 UTC, Andy Valencia wrote:

> I'm coding a server which takes TCP connections. I end up in the main thread with .accept() which hands me a Socket. I'd like to hand this off to a spawn()'ed thread to do the actual work.

Have you taken into consideration that each of the (pre-spawned) threads
can call accept()? Your program may also accept in multiple processes
on the same socket. [1]

[1] https://stackoverflow.com/questions/11488453/can-i-call-accept-for-one-socket-from-several-threads-simultaneously

May 31

On Friday, 31 May 2024 at 19:48:37 UTC, kdevel wrote:

> Have you taken into consideration that each of the (pre-spawned) threads can call accept()? Your program may also accept in multiple processes on the same socket. [1]

Yes, but I am planning on some global behavior--mostly concerning resource consumption--where that would make implementing the policy harder.

I've indeed done the cast-to-shared and then cast-to-unshared and it's working fine.

BTW, if the strategy forward is where the type system is going to assist with flagging code paths requiring allowance for multiple threads, it would be nice if the modifiers were available symmetrically. "shared" and "unshared", "mutable" and "immutable", and so forth? I'm using:

alias Unshared(T) = T;
alias Unshared(T: shared U, U) = U;

and that's fine, but for core semantics of the language, it might make sense to treat these as first class citizens.

Andy

June 01

On Friday, 31 May 2024 at 16:59:08 UTC, Jonathan M Davis wrote:

> Strictly speaking, unless you're dealing with a module or static-level variable, the object is not in TLS. It's treated as thread-local by the type system, and the type system will assume that no other thread has access to it, but you can freely cast it to shared or immutable and pass it across threads. It's just that it's up to you to make sure that you don't have a thread-local reference to shared data that isn't protected in a fashion that accessing the thread-local references is guaranteed to be thread-safe (e.g. the appropriate mutex has been locked).

Thank you; this is the most complete explanation I've found yet for how to look at data sharing in D.

> On the other hand, if you're actively sharing an object across threads, then you cast it to shared and give it to the other thread. But then you have to use an appropriate thread-synchronization mechanism (likely a mutex in the case of a socket) to ensure that accessing the object is thread-safe.

Speaking as an old kernel engineer for the Sequent multiprocessor product line, this is all very comfortable to me. I'm very glad that D has a suitable selection of spinlocks, process semaphores, and memory atomic operations. I can work with this!

> In any case, you can freely cast between thread-local and shared. It's just that you need to be sure that when you do that, you're not violating the type system by having a thread-local reference to shared data access that shared data without first protecting it with a mutex.

That was the trick for me; TLS implied to me that an implementation would be free to arrange that the address of a variable in one thread's TLS would not necessarily be accessible from another thread. Now I'm clearer on the usage of the term WRT the D runtime. All good.

Thanks again,
Andy

May 31
On Friday, May 31, 2024 6:28:27 PM MDT Andy Valencia via Digitalmars-d-learn wrote:
> On Friday, 31 May 2024 at 16:59:08 UTC, Jonathan M Davis wrote:
>
> Speaking as an old kernel engineer for the Sequent multiprocessor product line, this is all very comfortable to me.  I'm very glad that D has a suitable selection of spinlocks, process semaphores, and memory atomic operations.  I can work with this!

The way that D handles all of this is at its core the same as C/C++. The main difference is that the type system assumes that anything that isn't marked as shared or immutable is thread-local and can do optimizations based on that (though I'm not sure that that actually happens in practice right now). And in turn, you're supposed to get errors when you do stuff with a shared object which isn't guaranteed to be thread-safe (though not all of those checks are enabled by default at the moment).

The result of this is supposed to be that the portions of your code which are operating on shared data are clearly segregated, whereas in C/C++, the type system doesn't give you any way of knowing what's actively being shared across threads. So, in principle, you're writing essentially what you would have written in C/C++, but the type system is helping you catch when you screw it up.

The annoying part is that because the compiler can't know when it's actually thread-safe to access shared data (e.g. it has no clue when the programmer associates a mutex with a set of data), you have to use explicit casts to thread-local to then be able to operate on the data when it is thread-safe (unless the type is designed to be used as shared, in which case, it's doing any necessary casting internally), whereas in C/C++, since the type system doesn't know or care about thread safety, it'll just let you access data whether it's properly protected or not.
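For the "type designed to be used as shared" case, something along these lines might do. `ConnCounter` and `tally` are hypothetical names, and the sketch only shows the locking and casting being kept inside the type so that callers never cast.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical type meant to be used as shared: its methods are marked
// shared and do the lock-and-cast work internally.
final class ConnCounter
{
    private Mutex mtx;
    private int count;

    this() shared
    {
        mtx = new shared Mutex();
    }

    void add(int n) shared
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        // Mutex is held, so a thread-local view of `this` is safe here.
        (cast(ConnCounter) this).count += n;
    }

    int total() shared
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        return (cast(ConnCounter) this).count;
    }
}

// Callers use the shared object directly, with no casts of their own.
int tally(int nThreads, int perThread)
{
    auto counter = new shared ConnCounter();

    Thread[] workers;
    foreach (i; 0 .. nThreads)
    {
        auto t = new Thread({
            foreach (j; 0 .. perThread)
                counter.add(1);
        });
        t.start();
        workers ~= t;
    }
    foreach (t; workers)
        t.join();

    return counter.total();
}

void main()
{
    assert(tally(4, 500) == 2000);
}
```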

So, some folks get annoyed by the necessary casting, but in theory, it's the type system helping to minimize the risk of you shooting yourself in the foot. The idioms involved are basically the same as those used in C/C++ though, so anyone who understands those should be able to write thread-safe code in D fairly easily once they understand how shared works.

> > In any case, you can freely cast between thread-local and shared. It's just that you need to be sure that when you do that, you're not violating the type system by having a thread-local reference to shared data access that shared data without first protecting it with a mutex.
>
> That was the trick for me; TLS implied to me that an implementation would be free to arrange that the address of a variable in one thread's TLS would not necessarily be accessible from another thread.  Now I'm clearer on the usage of the term WRT the D runtime.  All good.

If you have a module-level or static variable that isn't shared or immutable, then each thread will get its own copy. So, those _might_ end up in TLS (I'm not sure if they are right now or not), but if you're just creating stuff on the fly, none of it is actually in TLS. And it's very common when dealing with shared or immutable data to create it first as thread-local and then cast it, since you know that it's not actually shared across threads at that point, and it's easier to construct that way. If we did want to start using TLS all over the place, then the casts would have to take that into account somehow, but I don't expect that that will happen.
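A small sketch of the "each thread will get its own copy" behaviour for unqualified module-level variables, next to a shared one. The variable names and the `observe` helper are invented for the example.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;

int perThreadHits;      // no qualifier: every thread gets its own copy
shared int globalHits;  // one instance, visible to all threads

// Returns [value the new thread saw, value in the calling thread afterwards,
// shared counter after the new thread ran].
int[3] observe()
{
    perThreadHits = 7;
    atomicStore(globalHits, 0);
    int seenByChild = -1;

    auto child = new Thread({
        seenByChild = perThreadHits; // the child's own copy, still int.init
        perThreadHits = 99;          // changes only the child's copy
        atomicOp!"+="(globalHits, 1);
    });
    child.start();
    child.join();

    return [seenByChild, perThreadHits, atomicLoad(globalHits)];
}

void main()
{
    auto r = observe();
    assert(r[0] == 0); // child started with a fresh copy
    assert(r[1] == 7); // caller's copy was not touched by the child
    assert(r[2] == 1); // the shared variable was
}
```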

- Jonathan M Davis



June 02

On Friday, 31 May 2024 at 16:07:23 UTC, Andy Valencia wrote:

> I'm coding a server which takes TCP connections. I end up in the main thread with .accept() which hands me a Socket. I'd like to hand this off to a spawn()'ed thread to do the actual work.
>
>      Aliases to mutable thread-local data not allowed.
>
> Is there some standard way to get something which isn't in TLS? Or do I have to drop back to file descriptors and do my own socket handling?
>
> TIA,
> Andy

I just want to point out that you should not spawn a thread for each accepted socket. That is very bad and expensive.

You should instead make it non-blocking and use something like select to handle it.
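As an illustration only, the select approach could look roughly like this with std.socket's `SocketSet` and `Socket.select`. The `pollOnce` and `demo` helpers, the 100 ms timeout, and the use of `socketPair()` in place of accepted connections are all assumptions made for the sketch.

```d
import core.time : msecs;
import std.socket : Socket, SocketSet, socketPair;

// Hypothetical helper: waits up to 100 ms, then reads once from every
// socket that select() reported as readable.
string[] pollOnce(Socket[] conns)
{
    auto readable = new SocketSet();
    foreach (c; conns)
        readable.add(c);

    string[] messages;
    if (Socket.select(readable, null, null, 100.msecs) <= 0)
        return messages; // timeout, interruption, or nothing ready

    foreach (c; conns)
    {
        if (!readable.isSet(c))
            continue;
        ubyte[256] buf;
        auto n = c.receive(buf[]);
        if (n > 0)
            messages ~= cast(string) buf[0 .. n].idup;
    }
    return messages;
}

// Two connections, only one of which has data waiting.
string[] demo()
{
    auto a = socketPair(); // assumption: stand-ins for accepted sockets
    auto b = socketPair();
    foreach (s; [a[0], b[0]])
        s.blocking = false;

    a[1].send("ping");
    return pollOnce([a[0], b[0]]);
}

void main()
{
    assert(demo() == ["ping"]);
}
```

One thread can watch many connections this way; a real server would loop over `pollOnce` and also put the listening socket in the set.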

If anything you should use a thread pool that each handles a set of sockets, instead of each thread being a single socket.
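One possible shape for a bounded pool, sketched with std.parallelism's `TaskPool`. Here integers stand in for connections, `serve` and `serveAll` are hypothetical, and each job is a single connection rather than a set of sockets per thread, so treat it as a simplification of the idea rather than the design described above.

```d
import std.parallelism : TaskPool, task;

// Hypothetical per-connection job; a real one would read and write a socket.
int serve(int connId)
{
    return connId * 2;
}

// Runs every job on a pool of nWorkers threads and collects the results
// in submission order.
int[] serveAll(int[] connIds, size_t nWorkers)
{
    auto pool = new TaskPool(nWorkers);
    scope (exit) pool.finish(true); // wait for the workers, then shut down

    typeof(task!serve(0))[] jobs;
    foreach (id; connIds)
    {
        auto job = task!serve(id);
        pool.put(job); // queued; picked up by whichever worker is free
        jobs ~= job;
    }

    int[] results;
    foreach (job; jobs)
        results ~= job.yieldForce; // waits for that job to complete
    return results;
}

void main()
{
    assert(serveAll([1, 2, 3], 2) == [2, 4, 6]);
}
```

The thread count stays fixed at `nWorkers` no matter how many connections arrive, which addresses the cost of one thread per socket.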

June 02

On Sunday, 2 June 2024 at 17:46:09 UTC, bauss wrote:

> If anything you should use a thread pool that each handles a set of sockets, instead of each thread being a single socket.

Yup, thread pool it is. I'm still fleshing out the data structure which manages the incoming work presented to the pool, but here's what I have so far:

https://sources.vsta.org:7100/dlang/file?name=tiny/rotor.d&ci=tip

Andy