On Friday, 28 January 2022 at 10:18:32 UTC, IGotD- wrote:
>On Wednesday, 26 January 2022 at 06:20:06 UTC, Elronnd wrote:
>>Thread-local gc is a thing. Good for false sharing too (w/real threads); can move contended objects away from owned ones. But I see no reason why fibre-local heaps should need to be much different from thread-local heaps.
>I would like to challenge the idea that thread-aware GC would do much for performance. Pegging memory to one thread is unusual and often doesn't correspond to reality.
>For example, a computer game with a large amount of vertex data where you decide to split the workload across several threads. You don't make a thread-local copy of that data; you keep the original vertex data global, and even the destination buffer would be global.
Which is why you would want ARC for shared objects and a local GC for tasks/actors.
What you then need, for more flexibility and optimization, is static analysis that determines whether local objects can be turned into shared objects. Where that is possible, you could place them in a separate region of the GC heap with space for an RC field at a negative offset.
>What I can think of is a server with one thread per client, with data that no other thread works on.
It shouldn't be per thread, but per actor/task/fiber.
>My experience is that this thread model isn't good programming and servers should instead be completely async meaning any thread might handle the next partial work.
You have experience with this model? From where?
Actually, it could be massively beneficial if you have short-lived actors and most objects have trivial destructors. Then you can simply release the entire local heap with no scanning.
You basically get to configure the system to use arena allocators with GC fallback for out-of-memory situations. Useful for actors where most of the memory they hold is released towards the end of the actor's lifetime.
>As I see it thread aware GC doesn't do much for performance but complicates it for the programmer.
You cannot discuss performance without selecting a particular, realistic application, which is why system-level programming requires multiple choices and configurations if you want automatic memory management. There is simply no single model that works well in all scenarios.
What is needed for D is to find a combination that works both for current high-level D users and also makes automatic memory management more useful in system-level programming scenarios.
Perfect should be considered out of scope.