Shared (page 2) - D Programming Language Discussion Forum

May 14, 2019

Posted by Jonathan M Davis
in reply to Dominikus Dittes Scherkl

On Tuesday, May 14, 2019 6:22:20 AM MDT Dominikus Dittes Scherkl via Digitalmars-d wrote:
> On Monday, 13 May 2019 at 16:20:30 UTC, Jonathan M Davis wrote:
> > On Monday, May 13, 2019 9:52:02 AM MDT Dominikus Dittes Scherkl
> >
> > via Digitalmars-d wrote:
> >> On Saturday, 11 May 2019 at 15:24:19 UTC, Jonathan M Davis
> >>
> >> wrote:
> >> > All that really would do is add a bit of extra syntax around locking a mutex and casting away shared.
> >>
> >> No, it makes it possible to use mutex in safe functions without needing to cast and which the compiler CAN guarantee to really be safe.
> >
> > Actually, it doesn't. All you've done is lock a mutex and cast away shared. There is no guarantee that that mutex is always used when that object is accessed. There could easily be another piece of code somewhere that casts away shared without locking anything at the same time.
>
> But that piece of code is system, which explicitly allows you to shoot into your foot. If you want to stay safe, don't do that.

The point is that if that code can legally exist, then the compiler simply cannot guarantee that removing shared from the object is thread-safe even with the locking mechanism you're proposing.

> > And even if this were the only mechanism for removing shared, you could easily use it with the same object and a completely different mutex in another piece of code
>
> Yes, the lock block need a list of vars that it allows to be modified
>
> lock(var1, var2, ...)
> {
> }
>
> two mutexes can only be executed at parallel if their parameter set is disjunct.

Sure, but another thread could be using a completely different mutex with one or more of those variables. So, even if just your locking mechanism is used, there's no guarantee that the same mutex is always used with the same set of variables, meaning that the compiler can't guarantee that the data is actually protected.
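To make that concrete, here is a minimal sketch, assuming two code paths that each take a lock before touching the same shared variable but do not agree on which lock. The names `sharedValue`, `mutexA`, `mutexB`, and `valueAfterOtherMutexWrite` are hypothetical and not part of any proposal in this thread.

```d
import core.atomic : atomicLoad, atomicStore;
import core.sync.mutex : Mutex;
import core.thread : Thread;

shared int sharedValue;   // hypothetical shared variable

/// Holds mutexA while a second thread locks mutexB and writes the same
/// variable. Returns the value the first thread sees afterwards.
int valueAfterOtherMutexWrite()
{
    auto mutexA = new Mutex;   // lock used by this code path
    auto mutexB = new Mutex;   // different lock used by another code path

    atomicStore(sharedValue, 1);

    mutexA.lock();
    scope (exit) mutexA.unlock();

    auto writer = new Thread({
        mutexB.lock();         // never contends with mutexA
        scope (exit) mutexB.unlock();
        atomicStore(sharedValue, 2);
    });
    writer.start();
    writer.join();             // finishes even though mutexA is still held

    return atomicLoad(sharedValue);
}
```

Both threads "locked a mutex", yet the write went through while the first lock was held. Nothing in the types ties `sharedValue` to either mutex.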

> > And even if the compiler could check that all uses of a particular variable were locked with a mutex, that still wouldn't be enough, because other references to the same data could exist.
>
> ok, so we need in addition that a reference to a shared var need not be lived beyond the end of the locked block or be immutable. Bad, but seems necessary.

A reference could already exist before your proposed locking mechanism was reached in the code. If the type is a class or pointer, then there could be other class references or pointers to the same data in @safe code. And in @system/@trusted code, the address of the object could have been taken to create a pointer to the object (and that could have been done in code far removed from the code that's using the lock, with all of the code around the lock being @safe). Heck, there could even be references to data within the object rather than to the object itself which are available elsewhere, meaning that part of the object is protected by the lock and part isn't. If any reference to any part of the data exists anywhere in the program, then it's possible for another thread to access the data at the same time that it's locked by the mechanism that you've proposed.
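A rough sketch of the interior-reference case, under the assumption that a pointer into the object was taken before the lock was reached. `Account`, `account`, and `balanceSeenUnderLock` are made-up names for illustration only.

```d
import core.atomic : atomicStore;
import core.sync.mutex : Mutex;
import core.thread : Thread;

struct Account { int balance; }

shared Account account;   // hypothetical shared data

/// Returns the balance observed inside the locked region after another
/// thread has written through a pre-existing pointer without locking.
int balanceSeenUnderLock()
{
    auto mutex = new Mutex;
    atomicStore(account.balance, 100);

    // A pointer to data inside the object, created before the lock is taken.
    shared(int)* inner = &account.balance;

    mutex.lock();
    scope (exit) mutex.unlock();

    // Inside the lock, the data is treated as if it were thread-local.
    Account* local = cast(Account*) &account;

    // Another thread that never touches the mutex writes through `inner`.
    auto other = new Thread({ atomicStore(*inner, -1); });
    other.start();
    other.join();

    return local.balance;   // the "protected" view changed under the lock
}
```

The lock is held for the whole time that `local` is in use, but the other thread never needed it, because it reached the data through a different reference.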

In order for the compiler to be able to actually guarantee that it's safe to remove shared from an object, it has to be able to guarantee that there are no other references to any part of that object which exist in the program. That's why TDPL synchronized classes are so locked down. Without that, other references to the data could exist. And even with all of the restrictions that they have, the compiler would still only be able to remove the outer layer of shared - the layer directly in the class - not any more than that. With something as free-form as you're proposing, the compiler can't guarantee anything, let alone that it's thread-safe to completely remove shared from an object within that block.

If D had ownership semantics baked into its type system, then we could probably do more, but as it is, the compiler is _very_ limited in its ability to know that no other references to any portion of an object exist. All it's dealing with are the types, not what owns them or what else could refer to them. Even scope is only able to do its job by restricting the operations that are allowed on a scope object, not by actually tracking an object's lifetime. So, it's incredibly difficult for the compiler to have enough information to know that an object is actually fully protected by a mutex when it's locked. And if the compiler can't guarantee that accessing the shared object is thread-safe within a particular piece of code, then it can't safely remove shared.

For any proposal you might have with regards to how we might be able to have the compiler safely remove shared from an object for us, you're going to have to be able to prove that no other references to any piece of that object could possibly exist or that there's no way that any reference to any portion of that object could be accessed without the same mutex being locked at every point that it's accessed. And it wouldn't surprise me if someone else were able to point out why even that wasn't enough because of some detail I'm not thinking of at the moment. Having the compiler be able to prove that a piece of code is thread-safe such that shared can be safely removed automatically from anything is incredibly difficult.

- Jonathan M Davis

May 15, 2019

Posted by Dominikus Dittes Scherkl
in reply to Jonathan M Davis

On Tuesday, 14 May 2019 at 21:31:57 UTC, Jonathan M Davis wrote:
> On Tuesday, May 14, 2019 6:22:20 AM MDT Dominikus Dittes Scherkl via Digitalmars-d wrote:
>> On Monday, 13 May 2019 at 16:20:30 UTC, Jonathan M Davis wrote:
>> > On Monday, May 13, 2019 9:52:02 AM MDT Dominikus Dittes Scherkl
>> >
>> > via Digitalmars-d wrote:
>> >> On Saturday, 11 May 2019 at 15:24:19 UTC, Jonathan M Davis
>> >>
>> >> wrote:
>> >> > All that really would do is add a bit of extra syntax around locking a mutex and casting away shared.
>> >>
>> >> No, it makes it possible to use mutex in safe functions without needing to cast and which the compiler CAN guarantee to really be safe.
>> >
>> > Actually, it doesn't. All you've done is lock a mutex and cast away shared. There is no guarantee that that mutex is always used when that object is accessed. There could easily be another piece of code somewhere that casts away shared without locking anything at the same time.
>>
>> But that piece of code is system, which explicitly allows you to shoot into your foot. If you want to stay safe, don't do that.
>
> The point is that if that code can legally exist, then the compiler simply cannot guarantee that removing shared from the object is thread-safe even with the locking mechanism you're proposing.

With system code you can always destroy the safety assumptions of any other code. This is why system code should be avoided wherever possible, and the unavoidable remainder needs to be reviewed very carefully so that it does not spoil the guarantees that otherwise hold.

>
>> > And even if this were the only mechanism for removing shared, you could easily use it with the same object and a completely different mutex in another piece of code
>>
>> Yes, the lock block need a list of vars that it allows to be modified
>>
>> lock(var1, var2, ...)
>> {
>> }
>>
>> two mutexes can only be executed at parallel if their parameter set is disjunct.
>
> Sure, but another thread could be using a completely different mutex with one or more of those variables.

No, it can't. Disjoint means: a lock block cannot be entered unless all of the given variables are free (not locked by any other mutex).

>> ok, so we need in addition that a reference to a shared var need not be lived beyond the end of the locked block or be immutable. Bad, but seems necessary.
>
> A reference could already exist before your proposed locking mechanism was reached in the code. If the type is a class or pointer, then there could be other class references or pointers to the same data in @safe code. And in @system/@trusted code, the address of the object could have been taken to create a pointer to the object (and that could have been done in code for removed from the code that's using the lock with all of the code around the lock being @safe). Heck, there could even be references to data within the object rather than to the object itself which are available elsewhere, meaning that part of the object is protected by the lock and part isn't. If any reference to any part of the data exists anywhere in the program, then it's possible for another thread to access the data at the same time that it's locked by the mechanism that you've proposed.
OK, that whole reference business is always a problem. Why not simply forbid it? You can't take references to shared variables (outside locked blocks); you can only copy them. We can relax that rule later if safe ways to allow it are found. I can't see why that should stop us from making the more practical use cases safe for now.

>
> In order for the compiler to be able to actually guarantee that it's safe to remove shared from an object, it has to be able to guarantee that there are no other references to any part of that object which exist in the program. That's why TDPL synchronized classes are so locked down. Without that, other references to the data could exist. And even with all of the restrictions that they have, the compiler would still only be able to remove the outer layer of shared - the layer directly in the class - not any more than that. With something as free-form as you're proposing, the compiler can't guarantee anything, let alone that it's thread-safe to completely remove shared from an object within that block.
>
> If D had ownership semantics baked into its type system, then we could probably do more, but as it is, the compiler is _very_ limited in its ability to know that no other references to any portion of an object exist.
If we forbid them (for now), the compiler is well able to.

> Even scope is only able to do its job by restricting the operations that are allowed on a scope object, not by actually tracking an object's lifetime.
Yes, and that's fine. It isn't necessary that everything is possible with an object, but it should at least be useful for SOME task.

> For any proposal you might have with regards to how we might be able to have the compiler safely remove shared from an object for us, you're going to have to be able to prove that no other references to any piece of that object could possibly exist or that there's no way that any reference to any portion of that object could be accessed without the same mutex being locked at every point that it's accessed.
Understood.

> And it wouldn't surprise me if someone else were able to point out why even that wasn't enough because of some detail I'm not thinking of at the moment. Having the compiler be able to prove that a piece of code is thread-safe such that shared can be safely removed automatically from anything is incredibly difficult.
shared shouldn't be removed from an object, but the object can only be modified while it is locked. Removing shared (with a cast) is system stuff and should be out of scope for any safety-related proposal (including mine), because with system stuff you can destroy any kind of safety.

If you don't remove shared, you can easily apply rules such as forbidding taking its address. If you remove it, that becomes much harder (and isn't useful anyway).

I still think my proposal could work (provide provable thread-safety for shared objects) in a limited but useful way (only mutex, no references), and should be relatively easy to implement.
If you want more complex stuff, that's still possible in the same way it currently is: cast shared away together with all guarantees and verify manually that it works, just like in C++.

May 15, 2019

Posted by Radu
in reply to Jonathan M Davis

On Tuesday, 14 May 2019 at 21:02:10 UTC, Jonathan M Davis wrote:
> On Tuesday, May 14, 2019 8:32:45 AM MDT Radu via Digitalmars-d wrote:
>> On Monday, 13 May 2019 at 16:20:30 UTC, Jonathan M Davis wrote:
>> > On Monday, May 13, 2019 9:52:02 AM MDT Dominikus Dittes Scherkl
>> >
>> > via Digitalmars-d wrote:
>> >> [...]
>> >
>> > Actually, it doesn't. All you've done is lock a mutex and cast away shared. There is no guarantee that that mutex is always used when that object is accessed. There could easily be another piece of code somewhere that casts away shared without locking anything at the same time. And even if this were the only mechanism for removing shared, you could easily use it with the same object and a completely different mutex in another piece of code, thereby making the mutex useless. The type system has no concept of ownership and no concept of a mutex being associated with any particular object aside from synchronized classes (and those don't even currently require that the mutex always be used). And even if the compiler could check that all uses of a particular variable were locked with a mutex, that still wouldn't be enough, because other references to the same data could exist. So, with the construct you've proposed, there's no way for the compiler to guarantee that no other thread is accessing the data at the same time. All it guarantees is that a mutex has been locked, not that it's actually protecting the data.
>> >
>> > [...]
>>
>> I had this idea for some time, not sure I can best articulated now, but I think a workable solution for shared is closely linked with the dip1000 - scoped pointers.
>>
>> What I mean is that something like:
>>
>> void doSharedStuff(scope shared(Foo) foo)
>> {
>> }
>>
>> Will be able to safely lock/unlock foo and cast away shared'ness
>> in the function's scope.
>> The compiler can provide guarantees here that foo will not escape.
>
> Sure, it can guarantee that no reference will escape that function, but all that's required is that another reference to the same data exist elsewhere, and another thread could muck with the object while the mutex was locked. There's no question that helpers could be created which would help users avoid mistakes when casting when casting away shared, but the compiler can't actually make the guarantee that casting away shared is thread-safe.
>
> - Jonathan M Davis

My view is that the compiler could automatically insert locking logic (à la synchronized) when a shared parameter gets referenced inside the function, and also automatically cast away shared, so that inside the function it would be like working with local non-shared data.

Given that the compiler infers lifetimes, it could safely elide locking if the parameter is passed to other functions that have the same signature (scope is explicit or inferred).

The idea is to provide tools that simplify concurrent programming; the compiler would need to insert all the checks automatically using the lifetime tracking.

May 15, 2019

Posted by Jonathan M Davis
in reply to Radu

On Wednesday, May 15, 2019 1:56:12 AM MDT Radu via Digitalmars-d wrote:
> On Tuesday, 14 May 2019 at 21:02:10 UTC, Jonathan M Davis wrote:
> > On Tuesday, May 14, 2019 8:32:45 AM MDT Radu via Digitalmars-d
> >
> > wrote:
> >> On Monday, 13 May 2019 at 16:20:30 UTC, Jonathan M Davis wrote:
> >> > On Monday, May 13, 2019 9:52:02 AM MDT Dominikus Dittes Scherkl
> >> >
> >> > via Digitalmars-d wrote:
> >> >> [...]
> >> >
> >> > Actually, it doesn't. All you've done is lock a mutex and cast away shared. There is no guarantee that that mutex is always used when that object is accessed. There could easily be another piece of code somewhere that casts away shared without locking anything at the same time. And even if this were the only mechanism for removing shared, you could easily use it with the same object and a completely different mutex in another piece of code, thereby making the mutex useless. The type system has no concept of ownership and no concept of a mutex being associated with any particular object aside from synchronized classes (and those don't even currently require that the mutex always be used). And even if the compiler could check that all uses of a particular variable were locked with a mutex, that still wouldn't be enough, because other references to the same data could exist. So, with the construct you've proposed, there's no way for the compiler to guarantee that no other thread is accessing the data at the same time. All it guarantees is that a mutex has been locked, not that it's actually protecting the data.
> >> >
> >> > [...]
> >>
> >> I had this idea for some time, not sure I can best articulated now, but I think a workable solution for shared is closely linked with the dip1000 - scoped pointers.
> >>
> >> What I mean is that something like:
> >>
> >> void doSharedStuff(scope shared(Foo) foo)
> >> {
> >> }
> >>
> >> Will be able to safely lock/unlock foo and cast away
> >> shared'ness
> >> in the function's scope.
> >> The compiler can provide guarantees here that foo will not
> >> escape.
> >
> > Sure, it can guarantee that no reference will escape that function, but all that's required is that another reference to the same data exist elsewhere, and another thread could muck with the object while the mutex was locked. There's no question that helpers could be created which would help users avoid mistakes when casting when casting away shared, but the compiler can't actually make the guarantee that casting away shared is thread-safe.
> >
> > - Jonathan M Davis
>
> My view is that the compiler could automatically insert locking logic (ala synchronized) when the shared parameters gets references inside the function, and also automatically cast away shared so for the function internals it would be like working with local non-shared data.
>
> Given that compiler inferes lifetime, it could safely elide locking if the parameter is passed to other functions that have the same signature (scope is explicit or inferred).
>
> The idea is to provide the tools that simplify concurrent programming, the compiler will need to insert all the checks automatically using the lifetime tracking.

The compiler does not have enough information to know which mutexes to use when, even if we wanted it to insert locks. TDPL synchronized classes are a special case in that they would provide a way in the language to associate a specific lock with a specific set of variables in a way that the compiler could then guarantee that it's safe to remove the outer layer of shared within a synchronized function. Without a similar mechanism to associate a mutex with one or more variables and guarantee that that mutex is always locked when they're accessed, the compiler won't be able to even know what to lock, let alone that it's safe to remove shared within a particular section of code.

And the issue with what you're proposing with scope is that unless the compiler can actually guarantee that no other references to the shared object exist which could be used to access the object at the same time, then the compiler cannot safely remove shared even temporarily. scope is just enough to guarantee that that particular function can't escape any references, not enough to guarantee that they don't exist. So, while features like scope can be used to make it easier to reason about the code and cast away shared in a way that your code is thread-safe, I fully expect that it's going to have to be up to the programmer to know when it's safe to remove shared, thus requiring a cast or some other @trusted mechanism to temporarily remove shared. TDPL synchronized classes are the only proposal I've seen that would be able to guarantee thread-safety, and it can only do it for what's directly in the class, making TDPL synchronized classes arguably pretty useless (on top of the fact that it means requiring classes when most D code wouldn't normally use classes for something like this).
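As a sketch of what "a cast or some other @trusted mechanism" can look like in practice, assuming the programmer has manually checked that every access to the data goes through one mutex. `hits`, `hitsMutex`, and `recordHit` are hypothetical names, and the pairing of variable and mutex exists only by convention.

```d
import core.sync.mutex : Mutex;

shared int hits;             // hypothetical shared counter
__gshared Mutex hitsMutex;   // guards `hits` by convention only

shared static this()
{
    hitsMutex = new Mutex;   // created once, before main runs
}

/// @trusted here is the programmer's promise, not a compiler proof:
/// it is only valid if every other access to `hits` also locks hitsMutex.
int recordHit() @trusted
{
    hitsMutex.lock();
    scope (exit) hitsMutex.unlock();

    // Temporarily remove shared while the lock is held.
    int* p = cast(int*) &hits;
    return ++*p;             // new count after the increment
}
```

Calling `recordHit()` twice from one thread yields 1 and then 2. Whether it is thread-safe depends entirely on the rest of the program honoring the convention, which is exactly what the compiler cannot check.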

- Jonathan M Davis

May 15, 2019

Posted by Radu
in reply to Jonathan M Davis

On Wednesday, 15 May 2019 at 08:52:23 UTC, Jonathan M Davis wrote:
> On Wednesday, May 15, 2019 1:56:12 AM MDT Radu via Digitalmars-d wrote:
>> On Tuesday, 14 May 2019 at 21:02:10 UTC, Jonathan M Davis wrote:
>> > On Tuesday, May 14, 2019 8:32:45 AM MDT Radu via Digitalmars-d
>> >
>> > wrote:
>> >> [...]
>> >
>> > Sure, it can guarantee that no reference will escape that function, but all that's required is that another reference to the same data exist elsewhere, and another thread could muck with the object while the mutex was locked. There's no question that helpers could be created which would help users avoid mistakes when casting when casting away shared, but the compiler can't actually make the guarantee that casting away shared is thread-safe.
>> >
>> > - Jonathan M Davis
>>
>> My view is that the compiler could automatically insert locking logic (ala synchronized) when the shared parameters gets references inside the function, and also automatically cast away shared so for the function internals it would be like working with local non-shared data.
>>
>> Given that compiler inferes lifetime, it could safely elide locking if the parameter is passed to other functions that have the same signature (scope is explicit or inferred).
>>
>> The idea is to provide the tools that simplify concurrent programming, the compiler will need to insert all the checks automatically using the lifetime tracking.
>
> The compiler does not have enough information to know which mutexes to use when even if we wanted it to insert locks. TDPL synchronized classes are a special case in that they would provide a way in the language to associate a specific lock with a specific set of variables in a way that the compiler could then guarantee that it's safe to remove the outer layer of shared within a synchronized function. Without a similar mechanism to associate a mutex with one or more variables and guarantee that that mutex is always locked when they're accessed, the compiler won't be able to even know what to lock, let alone that it's safe to remove shared within a particular section of code.
>
> And the issue with what you're proposing with scope is that unless the compiler can actually guarantee that no other references to the shared object exist which could be used to access the object at the same time, then the compiler cannot safely remove shared even temporarily. scope is just enough to guarantee that that particular function can't escape any references, not enough to guarantee that they don't exist. So, while features like scope can be used to make it easier to reason about the code and cast away shared in a way that your code is thread-safe, I fully expect that it's going to have to be up to the programmer to know when it's safe to remove shared, thus requiring a cast or some other @trusted mechanism to temporarily remove shared. TDPL synchronized classes are the only proposal I've seen that would be able to guarantee thread-safety, and it can only do it for what's directly in the class, making TDPL synchronized classes arguably pretty useless (on top of the fact that it means requiring classes when most D code wouldn't normally use classes for something like this).
>
> - Jonathan M Davis

Locks for classes can be done using the class monitor field (like synchronized works today). For other types the compiler could probably use the typeinfo class for locking, though that might be too much for some cases. But there may be optimization opportunities, for example using atomics when the target supports them and the type is a primitive, or for assignment of a reference/pointer, and likewise for binary ops when dereferencing a pointer to a primitive.

The role of scope is to ensure lock/unlock semantics and to help elide superfluous locking when chaining calls. I'm not proposing an ownership system here.
The cast away from shared should be performed only inside the locked scope; the difference is that the compiler will do that for you automatically.

I think this should be enabled only for @safe functions, and not for @system.

May 15, 2019

Posted by Jonathan M Davis
in reply to Radu

On Wednesday, May 15, 2019 3:09:36 AM MDT Radu via Digitalmars-d wrote:
> On Wednesday, 15 May 2019 at 08:52:23 UTC, Jonathan M Davis wrote:
> > On Wednesday, May 15, 2019 1:56:12 AM MDT Radu via
> >
> > Digitalmars-d wrote:
> >> On Tuesday, 14 May 2019 at 21:02:10 UTC, Jonathan M Davis
> >>
> >> wrote:
> >> > On Tuesday, May 14, 2019 8:32:45 AM MDT Radu via Digitalmars-d
> >> >
> >> > wrote:
> >> >> [...]
> >> >
> >> > Sure, it can guarantee that no reference will escape that function, but all that's required is that another reference to the same data exist elsewhere, and another thread could muck with the object while the mutex was locked. There's no question that helpers could be created which would help users avoid mistakes when casting when casting away shared, but the compiler can't actually make the guarantee that casting away shared is thread-safe.
> >> >
> >> > - Jonathan M Davis
> >>
> >> My view is that the compiler could automatically insert locking logic (ala synchronized) when the shared parameters gets references inside the function, and also automatically cast away shared so for the function internals it would be like working with local non-shared data.
> >>
> >> Given that compiler inferes lifetime, it could safely elide locking if the parameter is passed to other functions that have the same signature (scope is explicit or inferred).
> >>
> >> The idea is to provide the tools that simplify concurrent programming, the compiler will need to insert all the checks automatically using the lifetime tracking.
> >
> > The compiler does not have enough information to know which mutexes to use when even if we wanted it to insert locks. TDPL synchronized classes are a special case in that they would provide a way in the language to associate a specific lock with a specific set of variables in a way that the compiler could then guarantee that it's safe to remove the outer layer of shared within a synchronized function. Without a similar mechanism to associate a mutex with one or more variables and guarantee that that mutex is always locked when they're accessed, the compiler won't be able to even know what to lock, let alone that it's safe to remove shared within a particular section of code.
> >
> > And the issue with what you're proposing with scope is that unless the compiler can actually guarantee that no other references to the shared object exist which could be used to access the object at the same time, then the compiler cannot safely remove shared even temporarily. scope is just enough to guarantee that that particular function can't escape any references, not enough to guarantee that they don't exist. So, while features like scope can be used to make it easier to reason about the code and cast away shared in a way that your code is thread-safe, I fully expect that it's going to have to be up to the programmer to know when it's safe to remove shared, thus requiring a cast or some other @trusted mechanism to temporarily remove shared. TDPL synchronized classes are the only proposal I've seen that would be able to guarantee thread-safety, and it can only do it for what's directly in the class, making TDPL synchronized classes arguably pretty useless (on top of the fact that it means requiring classes when most D code wouldn't normally use classes for something like this).
> >
> > - Jonathan M Davis
>
> Locks for classes can be done using the class monitor field (like synchronized works today). For other types the compiler could probably use the typeinfo class for locking, that might be to much for some cases. But there may be optimization opportunities, like for example use atomics when the target supports it and the type is a primitive, or assignment for the reference/pointer, same for binary ops when dereferencing a pointer to a primitive.
>
> scope role is to ensure lock/unlock semantics and to help ellide
> superfluous locking when chaining the calls. I'm not proposing a
> ownership system here.
> The shared castaway should be performed only inside the locked
> scope, the difference is that the compiler will automatically do
> that for you.
>
> I think this should be enabled only for @safe functions, and not for @system.

The problem is that unless the compiler can absolutely guarantee that no other references to the data exist, having the compiler automatically remove shared at all - including in @safe code - can't be safely done. Without an ownership model, the compiler basically doesn't have the information that it needs to know that such references don't exist. Just because code is @safe doesn't mean that it's thread-safe, and scope does not provide enough information for the compiler to know that it's thread-safe. Ultimately, you need a programmer to examine the code to verify the thread-safety - which arguably means that anything that removes shared needs to be @system so that the programmer knows that they need to verify the code.

- Jonathan M Davis

May 15, 2019

Posted by Jonathan M Davis
in reply to Dominikus Dittes Scherkl

On Wednesday, May 15, 2019 12:59:00 AM MDT Dominikus Dittes Scherkl via Digitalmars-d wrote:
> On Tuesday, 14 May 2019 at 21:31:57 UTC, Jonathan M Davis wrote:
> > The point is that if that code can legally exist, then the compiler simply cannot guarantee that removing shared from the object is thread-safe even with the locking mechanism you're proposing.
>
> with system code you can always destroy safety assumptions of any other written code. This is why system code should be avoided where ever possible and the unavoidable remains need to be reviewed very carefully to not spoil the guarantees that are valid otherwise.

@safety and thread-safety are very different. @system mechanisms such as casting are involved with thread-safety, meaning that @system and @trusted get involved, but what they mean and how they're verified are very different. For @safety, if you want to mark something as @trusted, you just need to look at that piece of code to verify that what it's doing is memory safe. You don't have to look at other @safe parts of the program to verify that what it's doing is correct.

On the other hand, with thread-safety, you have to look at _everywhere_ that a particular piece of shared data is accessed if you want to be able to guarantee that it's accessed in a thread-safe manner. You can't just assume that other code is doing the right thing and just verify that one piece of code that's using @system mechanisms to interact with shared data. So, you can't rely on @safe to tell you whether anything is thread-safe.

> >> > And even if this were the only mechanism for removing shared, you could easily use it with the same object and a completely different mutex in another piece of code
> >>
> >> Yes, the lock block need a list of vars that it allows to be modified
> >>
> >> lock(var1, var2, ...)
> >> {
> >> }
> >>
> >> two mutexes can only be executed at parallel if their parameter set is disjunct.
> >
> > Sure, but another thread could be using a completely different mutex with one or more of those variables.
>
> No, it can't. Disjunct means: It cannot be called unless all of the given variables are free (not locked by any other mutex).

So, you're proposing that something in the runtime keeps track of which variables are currently associated with a locked mutex in order to guarantee that no other lock block is able to access any of those variables at the same time? That would probably require adding a global lock used by the runtime when any code enters or exits a lock block. I'd be _very_ surprised if anything like that were deemed acceptable for D. And it still doesn't solve the problem of other references referring to any part of the objects referred to by those variables existing and potentially being used elsewhere. If you had

lock(mutex, var1)
{
}

and elsewhere

lock(mutex, var2)
{
}

when var1 and var2 were references to the same object or when var2 referred to a piece of data inside of var1, then the lock wouldn't be providing thread-safety. If you only had one mutex for the entire program, and casting away shared were illegal, then something like this could probably work, but having one mutex for the entire program would clearly be unworkable - especially for a systems language - and since mutexes (let alone this particular use pattern for mutexes) aren't the only way to protect shared data when accessing it, requiring that this particular construct be used for accessing shared data wouldn't work anyway.

> >> ok, so we need in addition that a reference to a shared var need not be lived beyond the end of the locked block or be immutable. Bad, but seems necessary.
> >
> > A reference could already exist before your proposed locking mechanism was reached in the code. If the type is a class or pointer, then there could be other class references or pointers to the same data in @safe code. And in @system/@trusted code, the address of the object could have been taken to create a pointer to the object (and that could have been done in code far removed from the code that's using the lock with all of the code around the lock being @safe). Heck, there could even be references to data within the object rather than to the object itself which are available elsewhere, meaning that part of the object is protected by the lock and part isn't. If any reference to any part of the data exists anywhere in the program, then it's possible for another thread to access the data at the same time that it's locked by the mechanism that you've proposed.
>
> Ok, that whole reference stuff is always a problem. Why not simply forbid it? You can't reference shared variables (outside locked blocks), you can only copy them. We can later relax that rule if some safe ways to allow that are found. I can't see why that should hinder us to make the more practical usecases safe for now.

You can't forbid references to the same data. All that would be required would be something like

shared foo = new shared(Foo)(42);
auto bar = foo;

and you have two references to the same object without doing anything unsafe. You could also have stuff like

shared baz = foo.getBaz();

resulting in a reference to some piece of data that the foo object contains (possibly simply a shared class object that it has a reference to as a member). In general, to be able to verify that no other references to data existed, we'd probably have to add some sort of ownership semantics to the language so that the compiler could know that nothing else can possibly have access to the data (as I understand it, Rust manages something along those lines, but they have a much more restrictive type system).
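To make the two snippets above concrete, here is a self-contained sketch (the definitions of `Foo`, `Baz`, and `aliasesExist` are assumptions made up for illustration; the thread never shows them). Everything is @safe, there are no casts, and yet three references into the same object graph exist at the end:

```d
class Baz
{
    int n;
}

class Foo
{
    int value;
    shared(Baz) baz;   // a shared class object that Foo holds a reference to

    this(int v) shared @safe
    {
        value = v;
        baz = new shared Baz;
    }

    // Hands out a reference to data that the Foo object contains.
    shared(Baz) getBaz() shared @safe
    {
        return baz;
    }
}

// Returns true if the aliases really do refer to the same objects.
bool aliasesExist() @safe
{
    shared foo = new shared(Foo)(42);
    auto bar = foo;              // second reference to the same Foo
    shared baz = foo.getBaz();   // reference to data inside the Foo
    return bar is foo && baz is foo.getBaz();
}

unittest
{
    assert(aliasesExist());
}
```

A lock block naming only `foo` would say nothing about `bar` or `baz`, which is why some form of ownership tracking would be needed to rule them out.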

> > And it wouldn't surprise me if someone else were able to point out why even that wasn't enough because of some detail I'm not thinking of at the moment. Having the compiler be able to prove that a piece of code is thread-safe such that shared can be safely removed automatically from anything is incredibly difficult.
>
> shared shouldn't be removed from an object, but it can only be modified if it is locked. Removing shared (with a cast) is system stuff and should be out of scope for any safety related proposal (including mine), because with system stuff you can destroy any kind of safety.
>
> If you don't remove shared, you can easily apply rules like forbid to take its address or such. If you remove it, that makes it much harder (and isn't useful anyway).
>
> I still think my proposal could work (provide provable
> thread-safety for shared objects) in a limited but useful way
> (only mutex, no references), and should be relatively easy to
> implement.
> If you want more complex stuff, that's still possible in the same
> way it currently is: cast shared away together with all
> guarantees and verify manually that it works, just like in C++.

As soon as casting away shared is legal (and I think that it has to be for many common thread-synchronization idioms to be used), any mechanism like the one you're suggesting isn't enough to guarantee that it's safe to remove shared even temporarily. Either way, the ability to get multiple references to the same data defeats what you're proposing, and I don't see how it would be possible to make it so that there can't be multiple references to the data given D's type system. TDPL synchronized classes are only able to do it with the data that lives directly in the class, because they're a very restrictive construct. But even then, what the member variables refer to can't have shared removed, because references to the same data could have escaped the class or been passed into the class from elsewhere. And if something as restrictive as TDPL synchronized classes isn't able to restrict references sufficiently to be able to just outright remove shared from the class' member variables, there's no way that something as free-form as locking a mutex on a set of variables that aren't encapsulated in anything is going to be able to guarantee that other references to the same data don't exist.

- Jonathan M Davis

May 15, 2019

Posted by Manu

On Wed, May 15, 2019 at 2:34 AM Jonathan M Davis via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> [... everything that jonathan said ...]
> ...there's no way that something as free-form
> as locking a mutex on a set of variables that aren't encapsulated in
> anything is going to be able to guarantee that other references to the same
> data don't exist.

This is hard-facts.
We talked about this topic a lot at dconf. Here's where we got to.

Truth #1: shared must not have read/write access. I don't believe
there's any future design where this isn't true. We should do this
now.
Truth #2: Casting shared is not @safe, ever. The language can never
have enough context to know.
Truth #3: If you want something that's useful and @safe, you need to
start building some serious constructions above. Any such construction
necessarily requires tight encapsulation to keep references contained
and only issue leases appropriately.
Truth #4: DIP1000 has a very important part of this; any time you cast
shared away (take a lease), the resulting pointer must be `scope`,
otherwise it could be sequestered away, and all bets are off!
Truth #5: I don't think there's anything that can reasonably be
offered at the _language_ level to use shared @safely (other than #1).
Constructions are too varied, and too complex. But I think we know how
to start to write some useful libraries that can safely share data
assuming #1.

I spent some time with Amaury talking through designs for `shared`
shared pointers. We concluded that with DIP1000, there is enough
language available to write a @safe shared object as a library (and no
mutex-es!).
It is clear that any such design requires lease-tracking. It's not
hard to imagine a shared shared-pointer with the same rules as Rust;
"write access may only have 1 lease", "read access may have N
leases", "read/write access are mutually exclusive". The container
would have functions to capture a lease into some raii object, and the
returned object must be DIP1000 `scope`.
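A much-simplified sketch of the encapsulation idea (an illustration only, not the design discussed at dconf: it uses a mutex and a callback rather than a lock-free RAII lease object, and `Guarded` and `withLease` are hypothetical names). The payload is private, the only way in is through the container, and the unshared view is handed out as a `scope` parameter so that, when compiled with -preview=dip1000, it is not supposed to outlive the call:

```d
import core.sync.mutex : Mutex;

struct Guarded(T)
{
    private T payload;
    private shared Mutex mtx;

    this(T initial) shared @trusted
    {
        payload = initial;
        mtx = new shared Mutex();
    }

    // The cast away from shared happens exactly once, in here, and the
    // resulting reference is only visible to dg for the duration of the call.
    void withLease(scope void delegate(scope ref T) @safe dg) shared @trusted
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        dg(*cast(T*) &payload);
    }
}

unittest
{
    auto g = shared(Guarded!int)(1);
    g.withLease((scope ref int x) { x += 41; });

    int seen;
    g.withLease((scope ref int x) { seen = x; });
    assert(seen == 42);
}
```

This only gives exclusive write leases; the N-readers rule would need lease counting on top of it.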

May 16, 2019

Posted by Dominikus Dittes Scherkl
in reply to Jonathan M Davis

On Wednesday, 15 May 2019 at 09:33:19 UTC, Jonathan M Davis wrote:
> You can't forbid references to the same data. All that would be required would be something like
>
> shared foo = new shared(Foo)(42);
> auto bar = foo;
Meep. This is either a compile error outside locked block (cannot read shared) or bar would be scope (its lifetime will end after lock block is left).

> and you have two references to the same object without doing anything unsafe. You could also have stuff like
>
> shared baz = foo.getBaz();
This will be forced to a copy of the returned value, not a reference.

May 16, 2019

Posted by Dominikus Dittes Scherkl
in reply to Jonathan M Davis

On Wednesday, 15 May 2019 at 09:33:19 UTC, Jonathan M Davis wrote:
> On Wednesday, May 15, 2019 12:59:00 AM MDT Dominikus Dittes Scherkl via Digitalmars-d wrote:
>> No, it can't. Disjunct means: It cannot be called unless all of the given variables are free (not locked by any other mutex).
>
> So, you're proposing that something in the runtime keeps track of which variables are currently associated with a locked mutex in order to guarantee that no other lock block is able to access any of those variables at the same time? That would probably require adding a global lock used by the runtime when any code enters or exits a lock block. I'd be _very_ surprised if anything like that were deemed acceptable for D.
Ok, that can be a problem. But yes, this is what I suggest.
But why the resistance? Nobody is forced to use mutexes, and if you don't that part of the runtime could (should!) be eliminated from the program.

> And it still doesn't solve the problem of other references referring to any part of the objects referred to by those variables existing and potentially being used elsewhere. If you had
>
> lock(mutex, var1)
> {
> }
>
> and elsewhere
>
> lock(mutex, var2)
> {
> }
>
> when var1 and var2 were references to the same object
Forbidden. See last post.

> or when var2 referred to a piece of data inside of var1,
This would also be impossible. shared vars are no references.

> If you only had one mutex for the entire program, and casting away shared were illegal, then something like this could probably work,
Thank you, but why this restriction? There can be as many mutexes as you like. The runtime has only to ensure that any running locked block doesn't modify any of the vars in the other running locked blocks.


Copyright © 1999-2021 by the D Language Foundation