On Tuesday, 20 April 2021 at 09:52:07 UTC, Ola Fosheim Grøstad wrote:
> As computer memory grows, naive scan and sweep garbage collection becomes more and more a burden.
> Also, languages have not really come up with a satisfactory way to simplify multi-threaded programming, except to split the workload into many single-threaded tasks that are run in parallel.
> It seems to me that the obvious way to retain the ease of use that garbage collection provides without impeding performance is to limit the memory to scan, and preferably do the scanning when nobody is using the memory.
> The actor model seems to be a good fit. Or call it a task, if you wish. If each actor/task has its own GC pool then there is less memory to scan, and you can do the scanning when the actor/task is waiting on I/O or scheduling. So you would get less intrusive scanning pauses. It would also fit well with async-await/futures.
> Another benefit is that if an actor is deleted before it is scanned, then no scanning is necessary at all. It can simply be released (assuming destructor-free classes are allocated in a separate area). This is of great benefit to web services, which can simply implement a request handler as an actor/task.
> The downside is that you need a non-GC mechanism, such as reference counting, for dealing with inter-actor/task communication. That should be quite OK, however, as you would expect the time-consuming work and the complex allocation patterns to happen within an actor/task.
> Is this a direction D is able to move in or is a new language needed?
A few years ago, when `std.experimental.allocator` was still hot out of the oven, I thought this would be one of the primary innovations that it would enable.
The basic idea is that since allocators are composable first-class objects, you can pass them to any function and that way you can override and customize its memory allocation policy, without resorting to global variables.
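To make that concrete, here is a minimal sketch of a function that receives its allocator as an argument. The function name `squares` is made up for illustration; `makeArray`, `dispose`, `Mallocator.instance` and `GCAllocator.instance` come from `std.experimental.allocator`.

```d
import std.experimental.allocator : makeArray, dispose;
import std.experimental.allocator.gc_allocator : GCAllocator;
import std.experimental.allocator.mallocator : Mallocator;

/// Builds an array of the first `n` squares using whatever allocator
/// the caller hands in; the function itself touches no global state.
/// (`squares` is a hypothetical name used only for this example.)
int[] squares(Allocator)(auto ref Allocator alloc, size_t n)
{
    int[] result = alloc.makeArray!int(n);
    foreach (i, ref x; result)
        x = cast(int)(i * i);
    return result;
}

void main()
{
    // The caller picks malloc/free for this call...
    int[] a = squares(Mallocator.instance, 4);
    assert(a == [0, 1, 4, 9]);
    Mallocator.instance.dispose(a); // must be freed with the same allocator

    // ...and the GC heap for this one, without any change to `squares`.
    int[] b = squares(GCAllocator.instance, 3);
    assert(b == [0, 1, 4]);
}
```

The allocation policy is decided entirely at the call site, which is what would let a framework hand a different heap to each actor.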
(The package does provide convenience thread-local and global variables, but IMO that's an anti-pattern: if you prefer simplicity, you can either use the GC (as always) or `Mallocator` directly. IMO, if you're reaching for `std.experimental.allocator`, you do so in order to gain more control over memory management. Also, knowing whether `theAllocator` points to `GCAllocator` or to an actually separate thread-local allocator can be critical for ensuring that code is lock-free. You either know what you're doing, or the code is not performance-critical, so it doesn't matter, and you should be using the GC anyway.)
By passing the allocator as an object, you allow it to be used safely from `pure` functions. (If `pure` functions were to somehow be allowed to use those global allocator variables, you could have some ugly consequences. For example, a pure function can be preempted in the middle of its execution, only to have the global allocator replaced under its feet, thereby leaving all the memory allocated from the previous allocator dangling.)
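A small sketch of the `pure` point, under the assumption of a toy allocator: `BumpAllocator` and `countUp` below are hypothetical names invented for this example and are not part of Phobos. Because the allocator arrives as a `ref` argument, the function is still (weakly) `pure`: it mutates only what it was handed.

```d
/// Hypothetical bump allocator over caller-provided storage.
/// For illustration only; not part of std.experimental.allocator.
struct BumpAllocator
{
    private ubyte[] buffer; // backing storage, owned by the caller
    private size_t used;    // number of bytes handed out so far

    /// Weakly pure: mutates only `this`, never global state.
    void[] allocate(size_t bytes) pure nothrow @nogc
    {
        // Round up so that consecutive blocks stay 16-byte aligned.
        immutable rounded = (bytes + 15) & ~cast(size_t) 15;
        if (rounded > buffer.length - used)
            return null; // out of space
        void[] block = buffer[used .. used + bytes];
        used += rounded;
        return block;
    }
}

/// A `pure` function can allocate because the allocator is an argument.
int[] countUp(Alloc)(ref Alloc alloc, size_t n) pure nothrow @nogc
{
    void[] block = alloc.allocate(n * int.sizeof);
    if (block is null)
        return null;
    int[] result = (cast(int*) block.ptr)[0 .. n];
    foreach (i, ref x; result)
        x = cast(int) i;
    return result;
}

void main()
{
    // GC memory is used as backing storage only to keep the demo short.
    auto arena = BumpAllocator(new ubyte[](64));
    int[] xs = countUp(arena, 4);
    assert(xs == [0, 1, 2, 3]);
    // The arena is too small for this request, so it reports failure.
    assert(countUp(arena, 100) is null);
}
```

Nothing here can be swapped out from under `countUp` while it runs, since the only allocator it can reach is the one in its parameter list.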
Pure code (even in the relaxed D sense) is great for parallelism, as a scheduler can essentially assume that it's both lock-free and wait-free - it doesn't need to interact with any other thread/fiber/task to make progress.
Having multiple per-thread/fiber/actor/task GC heaps fits naturally in the model you propose. There could be a new `LocalGCAllocator`, which the runtime / framework can simply pass to the actor on its creation. There are two main challenges:
- Ensuring code doesn't break the assumptions of the actor model by e.g. sharing memory between threads in an uncontrolled manner. This can be addressed in a variety of ways:
  - The framework's build system can prevent you from importing code that doesn't fit its model
  - The framework can run a non-optional linter as part of the build process, which would ensure that you don't have:
    - `@system` or `@trusted` code
    - `extern` function declarations (otherwise you could define `@safe pure int printf(scope const char* format, scope const ...);`)
  - reference capabilities like Pony's
  - other type-system or language built-in static analysis
- Making it as ergonomic and easy to use as the GC. Essentially, having all language and library features that currently require the GC use `LocalGCAllocator` automagically.
I think this can be done in several steps:
- Finish transitioning druntime's compiler interface from unchecked "magic" extern(C) functions to regular D (template) functions
- Add `context` as the last parameter to each druntime function that may need to allocate memory, and set its default value to the global GC context. This is a pure refactoring, with no change in behavior.
- Add Scala-style `implicit` parameters ⁴ ⁵ ⁶ ⁷ ⁸ to the language and mark the `context` parameters as `implicit`
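
As a rough sketch of what the `context` step could look like in user-level D (none of these names exist in druntime; `AllocContext`, `GCContext`, `CountingContext`, `globalContext` and `newArray` are all hypothetical), a trailing parameter with a default keeps existing call sites unchanged while allowing an explicit override. D has no `implicit` parameters today, so the argument is passed by hand here.

```d
/// Hypothetical allocation context that a runtime hook would call
/// instead of talking to the global GC directly.
interface AllocContext
{
    void[] allocate(size_t bytes);
}

/// Stand-in for "the global GC context": it simply uses the GC heap.
final class GCContext : AllocContext
{
    void[] allocate(size_t bytes) { return new ubyte[](bytes); }
}

/// Stand-in for a per-actor context: it forwards to the GC heap but
/// records how much this actor allocated (assumption, for the demo).
final class CountingContext : AllocContext
{
    size_t bytesServed;

    void[] allocate(size_t bytes)
    {
        bytesServed += bytes;
        return new ubyte[](bytes);
    }
}

AllocContext globalContext; // thread-local, like most D module variables

static this() { globalContext = new GCContext; }

/// `context` is the last parameter and defaults to the global context,
/// so adding it is a pure refactoring for existing callers.
T[] newArray(T)(size_t n, AllocContext context = globalContext)
{
    void[] raw = context.allocate(n * T.sizeof);
    T[] result = (cast(T*) raw.ptr)[0 .. n];
    result[] = T.init;
    return result;
}

void main()
{
    // Existing call sites keep compiling and keep using the global context.
    int[] a = newArray!int(3);
    assert(a == [0, 0, 0]);

    // A framework could pass an actor-specific context explicitly.
    auto actorContext = new CountingContext;
    long[] b = newArray!long(4, actorContext);
    assert(b.length == 4);
    assert(actorContext.bytesServed == 4 * long.sizeof);
}
```

With `implicit` parameters, the second call would not need to spell out `actorContext`; the compiler would pick it up from the enclosing scope.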