Memory management and local GC?
October 31, 2020
I wonder, what is keeping D from making full use of shared?

It seems to me that if you required everything referentially reachable from non-thread-local globals to be typed as shared, then the compiler could assume that everything allocated as non-shared should go on the thread-local GC heap.

If we then add reference counting for shared types and make non-thread-local class instances ref counted, then we no longer need to stop threads during GC collection. You would then require all shared resource handlers to be reference-counted class objects.

You can still hand out local GC objects to other threads if you manually pin them first; this can be done with a "pin counter", or perhaps better named a "borrow count".

Since a thread-local GC will have fewer pointers to scan through, collection would also be faster.
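A rough sketch of the rule I have in mind (hypothetical semantics, none of this is current D; `Cache`, `Node` and `globalCache` are made-up names):

```d
// Hypothetical semantics, not current D.
shared Cache globalCache;      // non-TLS global: must be typed shared,
                               // managed by reference counting, not tracing

void worker() @safe {
    auto node = new Node;      // not shared, so the compiler may place it
                               // on this thread's local GC heap
    globalCache.add(node);     // error under this scheme: a local-GC
                               // reference escapes to shared without a pin
}
```
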

Are there some practical considerations I haven't thought about?

Caveat: you have to deal with more shared protected objects.

Solution: you add an isolated pointer type that automatically allows member access as long as the object has not yet actually been shared. So when you allocate a new shared object, you receive it as an isolated object in the global shared memory pool.
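As a sketch of how such an isolated pointer could look (hypothetical `isolated(T)`, `allocateShared` and `move`; nothing here exists in D today):

```d
// Hypothetical: the object lives in the shared pool, but no other
// thread can see it until it is published.
isolated(Cache) c = allocateShared!Cache();
c.capacity = 128;             // direct member access is fine while isolated
shared(Cache) s = move(c);    // publishing consumes the isolated reference;
                              // from now on only the shared protocol applies
```
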

October 31, 2020
Say, allocate a dictionary, network connection or XML document and put it in shared context. How do you deal with them?
October 31, 2020
On Saturday, 31 October 2020 at 19:05:43 UTC, Kagamin wrote:
> Say, allocate a dictionary, network connection or XML document and put it in shared context. How do you deal with them?

Ok, so we assume @safe. When you allocate, you have access to it as an "isolated shared", which taints all references obtained through it, so you cannot store references to the internals. BUT, you can configure it.

After you are done with the isolated configuration/usage you do a move operation to transfer it to a shared pointer (effectively encapsulating it). Then you need to obtain access to it through the implementation of the shared protocol for the object (not implemented as @safe).

So for a dictionary where you want to store references to local objects, you would need a dictionary written especially for shared access, since you would need a borrow pointer to keep each entry pinned in the "foreign" GC pool. It is then up to the dictionary implementation to prevent multiple threads from accessing the same object at the same time. That makes it possible to set up a shared cache and use it in @safe code, although object access has to be somewhat limited.
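Roughly, such a purpose-built shared dictionary could present an interface like this (a hypothetical sketch; `SharedDict`, `withValue` and the pinning machinery are invented for illustration):

```d
// Hypothetical facade; the implementation is @trusted, not @safe.
shared class SharedDict(V) {
    // insert pins the value in its owner's GC pool (bumps the borrow count)
    void insert(string key, V value) shared @trusted { /* lock, pin, store */ }

    // access is only granted through a scope-limited borrow, so no
    // reference to the internals can escape @safe caller code
    void withValue(string key, scope void delegate(scope V) @safe sink)
        shared @trusted { /* lock, look up, invoke sink, unlock */ }
}
```
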

We need to think of shared globals as facades in this model.

A network connection would probably best be implemented as a struct that you embed as a private field in a shared class object.

Or maybe you are thinking about something else? I expect there to be pitfalls, so more feedback is good. :-)



October 31, 2020
On Saturday, 31 October 2020 at 19:31:23 UTC, Ola Fosheim Grøstad wrote:
> for shared access, since you would need a borrow pointer to keep each entry pinned in the "foreign" GC pool. It is then up to the

The pinning has to be a @trusted operation, though, to prevent multiple threads from accessing the GC-local object.

If you want to do everything in @safe code you could use the same "isolated" mechanisms to allow pinning in @safe code.

Borrowed pointers remain scope-limited when other threads access them; since the pointer is tagged as borrowed, the type system can ensure that it isn't retained.

Another option would be for the dictionary to allow access through a callback that scope-restricts the object.
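In @safe caller code that callback variant could look something like this (continuing the hypothetical dictionary example; `SharedDict`, `withValue` and `process` are invented names):

```d
shared SharedDict!Request cache;   // hypothetical shared facade

void consumer() @safe {
    cache.withValue("job-42", (scope Request r) @safe {
        // r is pinned (borrow count > 0) only for the duration of the
        // callback; `scope` ensures it cannot be stored and outlive the pin
        process(r);
    });
}
```
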
November 01, 2020
The size and complexity of implementing first-class shared support in all types is what keeps it from happening.
November 01, 2020
On Sunday, 1 November 2020 at 09:52:30 UTC, Kagamin wrote:
> The size and complexity of implementing first-class shared support in all types is what keeps it from happening.

Shared should just be a simple type wrapper that prevents access to internals unless you use specific methods marked as thread-safe for @safe code. Requiring atomic access to members does not uphold the invariants of the object, so that is the wrong solution.

If one adds some kind of "accessible-shared" taint after the caller has obtained read or write access, then you can use the type system to prevent disasters. One probably should have "readable-unshared" and "writeable-unshared", to distinguish between read-lock and write-lock protection. Then you map those to, say, scope parameters, so that unshared references cannot be retained.
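A sketch of how those taints could map onto scope parameters (all of `readable!T`, `writeable!T`, `withRead` and `withWrite` are hypothetical names for illustration):

```d
// Hypothetical: read- and write-locked views as distinct, scope-only types.
void report(scope readable!Stats s) @safe  { /* read lock held, cannot retain s */ }
void update(scope writeable!Stats s) @safe { /* write lock held, cannot retain s */ }

shared Stats stats;
stats.withRead!report;    // wrapper takes the read lock, hands out tainted view
stats.withWrite!update;   // wrapper takes the write lock, exclusive access
```
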

November 02, 2020
On 10/31/20 2:53 PM, Ola Fosheim Grøstad wrote:
> I wonder, what is keeping D from making full use of shared?
> 
> It seems to me that if you required everything referentially reachable from non-thread-local globals to be typed as shared, then the compiler could assume that everything allocated as non-shared should go on the thread-local GC heap.
> 
> If we then add reference counting for shared types and make non-thread-local class instances ref counted, then we no longer need to stop threads during GC collection. You would then require all shared resource handlers to be reference-counted class objects.

What about cycles in shared data?

Typically, when people talk about a "thread-local GC", I point out that this doesn't help, because thread-local GC memory can point at the shared GC heap, which means that you still have to stop the world to scan the shared heap.

But your idea is interesting in that there is no GC for shared data. If it could be made to work, it might be a nice upgrade. With a good reference counting system, you can also designate different memory management systems for different threads.

> 
> You can still hand out local GC objects to other threads if you manually pin them first; this can be done with a "pin counter", or perhaps better named a "borrow count".

Hm... this means that they now become shared. How does that work? Handing unshared references to other threads is going to be a problem.

What is the problem with allocating them shared in the first place? Ideally, you will NEVER transfer thread-local data to a shared context.

An exception might be immutable data. It also might make sense to move the data to the shared heap before sharing.

> Caveat: you have to deal with more shared protected objects.
> 
> Solution: you add an isolated pointer type that automatically allows member access as long as the object has not yet actually been shared. So when you allocate a new shared object, you receive it as an isolated object in the global shared memory pool.
> 

This is part of the -preview=nosharedaccess switch -- you need to provide mechanisms on how to actually use shared data.

Such a wrapper can be written with this in mind.

-Steve
November 02, 2020
On Monday, 2 November 2020 at 12:28:34 UTC, Steven Schveighoffer wrote:
>
> What is the problem with allocating them shared in the first place? Ideally, you will NEVER transfer thread-local data to a shared context.
>

How would you distinguish between global and local GC-allocated data in the language? Many times you need GC-allocated data that can be used globally, so we would need new D keywords like "localnew" or "globalnew".

Then, since we are saying that GC operations on GC-allocated memory should be disallowed between threads, what mechanisms does D have in order to prevent plain access to that data? Only disallowing GC operations is just going halfway, as I see it.

In general, these multithreaded allocators usually use "stages", meaning that they have several memory regions to play with, hoping that this will reduce mutex locking. Usually there is an upper limit to how many stages you want before there isn't any performance advantage, and tracking stages requires resources itself. If D were to open up such a stage for every thread, there would be a new stage for every thread with no upper limit. This would require extra metadata as well as a new region per thread. I guess that implementation-wise this will be complicated and also require even more memory than it does today.

November 02, 2020
On Monday, 2 November 2020 at 12:28:34 UTC, Steven Schveighoffer wrote:
> What about cycles in shared data?

I am making the assumption that global shared facades (caches etc.) are written by library authors who know what they are doing, so they would use weak pointers where necessary.

However, your idea of using the current GC infrastructure for sanitization could be helpful! So in development builds you could detect such flaws at runtime during testing.

> But your idea is interesting in that there is no GC for shared data. If it could be made to work, it might be a nice upgrade. With a good reference counting system, you can also designate different memory management systems for different threads.

I am making the assumption that most @safe programmers should be discouraged from making non-TLS globals and should typically not design their own shared facades (caches, databases etc.). Those global "hubs" would ideally be part of a framework/library with an underlying strategic model for parallelism.

>> You can still hand out local GC objects to other threads if you manually pin them first; this can be done with a "pin counter", or perhaps better named a "borrow count".
>
> Hm... this means that they now become shared. How does that work? Handing unshared references to other threads is going to be a problem.

Not if they are isolated. For instance, you might have a framework for scraping websites. A thread hands a thread-local request object to the framework, and when the framework has fetched the data, the origin thread receives the request object back with the data. You could even let the framework allocate local GC data for the thread, if the GC is written with a special pool for that purpose.
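As a sketch, the hand-off could look like this (hypothetical `isolated(T)` and scraper API, just to illustrate the ownership transfer):

```d
// Hypothetical: the request object is moved, never simultaneously shared.
isolated(Request) req = new Request("https://example.com");
scraper.submit(move(req));            // ownership moves to the framework
// ... the framework fetches the data, then moves the object back:
isolated(Request) done = scraper.receive();
// done is again exclusively owned by this thread, no locking needed
```
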

Think of it like this: A @safe thread is like an actor. But there are advanced global services at your disposal that can provide for you using the "building materials" most useful to you (like thread local GC memory).

> What is the problem with allocating them shared in the first place? Ideally, you will NEVER transfer thread-local data to a shared context.

I don't know if that is true. Think for instance of data-science computations. The scientist only wants to write code in a scripty way in his @safe thread, and to receive useful stuff from the massively parallel framework without having to deal with shared himself.

That would be more welcoming to newbies, I think. If you could just tell them:
1. Never put things in globals unless they are immutable (e.g. lookup tables).
2. Don't deal with shared until you become more experienced.
3. Use these ready-made worker frameworks for parallel computing.

> This is part of the -preview=nosharedaccess switch -- you need to provide mechanisms on how to actually use shared data.
>
> Such a wrapper can be written with this in mind.

Interesting!

November 02, 2020
On 11/2/20 8:23 AM, IGotD- wrote:
> On Monday, 2 November 2020 at 12:28:34 UTC, Steven Schveighoffer wrote:
>>
>> What is the problem with allocating them shared in the first place? Ideally, you will NEVER transfer thread-local data to a shared context.
>>
> 
> How would you distinguish between global and local GC-allocated data in the language? Many times you need GC-allocated data that can be used globally, so we would need new D keywords like "localnew" or "globalnew".

It's possible to allocate without using `new`.

Something like:

GlobalPool.allocate!T(ctor_args) // returns shared(T)*

> Then, since we are saying that GC operations on GC-allocated memory should be disallowed between threads, what mechanisms does D have in order to prevent plain access to that data? Only disallowing GC operations is just going halfway, as I see it.

D has the shared qualifier, which indicates whether any other thread has access to the data.

This is kind of the linchpin of all these schemes. Without that enforced properly, you can't build anything.

But with it enforced properly, you have options.

There are still problems. For example, immutable is implicitly shared. Or you can have a type that contains both shared and unshared members; where is that allocated?

> 
> In general, these multithreaded allocators usually use "stages", meaning that they have several memory regions to play with, hoping that this will reduce mutex locking. Usually there is an upper limit to how many stages you want before there isn't any performance advantage, and tracking stages requires resources itself. If D were to open up such a stage for every thread, there would be a new stage for every thread with no upper limit. This would require extra metadata as well as a new region per thread. I guess that implementation-wise this will be complicated and also require even more memory than it does today.
> 

You don't necessarily need to assign regions to threads. You could potentially use one region and assign different pools to different threads. If the GC doesn't need to scan shared data at all, then it's a matter of skipping the pools that aren't interesting to your local GC.

As long as a pool never gets moved to another thread, you can avoid most locking.
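In other words, a local collection could become little more than a filtered scan, something like (hypothetical GC internals, for illustration only):

```d
// Hypothetical collector loop: only pools owned by this thread are scanned.
foreach (pool; gc.pools) {
    if (pool.owner != thisThreadId)
        continue;            // shared and other-thread pools are skipped
    scanAndMark(pool);       // safe without stopping the world, because no
}                            // other thread mutates or roots into this pool
```
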

-Steve