Re: Thread-safe attribution
October 06, 2018
On Sat, Oct 6, 2018 at 6:59 PM Manu <turkeyman@gmail.com> wrote:
>
> So I'm working on a SMT infrastructure, and expression of
> thread-safety is a core design mechanic... but I'm really struggling
> to express it in terms of the type system.
> I figure I'll throw some design challenges out here and see if anyone
> can offer some good ideas.
>
> The thing I'm trying to model is an attribute along the lines of
> `shared`, but actually useful ;)
> I'll use the attribute `threadsafe` in place of `shared`, and see
> where that goes.
>
> Consider:
> struct Bob
> {
>   int x;
>   threadsafe Atomic!int y;
>
>   void m1();
>   void m2() threadsafe;
>
>   void overloaded();
>   void overloaded() threadsafe;
> }
>
> void func(Bob x, threadsafe Bob y)
> {
>   x.x = 10; // fine
>   x.y = 10; // fine
>   x.m1(); // fine
>   x.m2(); // fine
>   x.overloaded(); // fine, use the un-threadsafe overload
>
>   y.x = 10; // ERROR, a threadsafe reference can NOT modify an un-threadsafe member
>   y.y = 10; // fine
>   x.m1(); // ERROR, method not threadsafe
>   x.m2(); // fine
>   x.overloaded(); // fine, use the threadsafe overload
>
>   threadsafe Bob* p = &x; // can take threadsafe reference to thread-local object
> }
>
> This is loosely what `shared` models, but there's a few differences:
> 1. thread-local can NOT promote to shared
> 2. shared `this` applies to members
>
> For `shared` to be useful, it should be that a shared reference to something inhibits access to its thread-local stuff. And in that world, I believe that thread-local promotion to shared would work like const does.
>
> I guess I'm wondering; should `shared` be transitive? Perhaps that's what's wrong with it...?

*** the function arguments should be `ref`!
October 06, 2018
On Sat, Oct 6, 2018 at 7:01 PM Manu <turkeyman@gmail.com> wrote:
> [snip]

Thinking on this more... perhaps it's the case that shared is transitive, but the effect shared has is to inhibit read/write access to data members.

In my example above, there is an atomic data member:
struct Bob
{
  int x;
  Atomic!int y; // not marked `shared` in this example
}

void fun(ref Bob a, ref shared Bob b)
{
  a.x = 10; // fine
  a.y = 10; // fine
  b.x = 10; // error! b is shared, and member 'x' is NOT shared, no access!
  b.y = 10; // this gets interesting...
}

So, `b` can't access `y` because it's not shared... but consider this possibility:

struct Atomic(T)
{
  T data;
  void opAssign(T val) shared
  {
    // implement assignment using atomic operations
  }
}

`b.y.data` is not accessible, but the assignment operator has been attributed shared, which means `b.y = 10` becomes a legal function call.

At this point, shared is now useful.
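
A minimal sketch of that idea in today's D, assuming `core.atomic` for the actual store. The `load` method is an addition for illustration and is not part of the original example. Note that current D still accepts `b.x = 10` on a shared `Bob`; rejecting it is the proposed rule, not existing behaviour, so that line is left commented out.

```d
import core.atomic : atomicLoad, atomicStore;

struct Atomic(T)
{
    private T data;

    // Callable through a shared reference; the store itself is atomic.
    void opAssign(T val) shared
    {
        atomicStore(data, val);
    }

    // Hypothetical helper (not in the original post): atomic read.
    T load() shared
    {
        return atomicLoad(data);
    }
}

struct Bob
{
    int x;
    Atomic!int y; // not marked `shared`
}

void fun(ref shared Bob b)
{
    b.y = 10;    // resolves to Atomic!int.opAssign(int) shared
    // b.x = 10; // the access the proposed rule would reject
}

void main()
{
    shared Bob b;
    fun(b);
    assert(b.y.load() == 10);
}
```

The only way `fun` touches `y` is through methods attributed `shared`, which is the access pattern the proposal wants the compiler to enforce.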

So, continue the thought experiment that goes:
1. `shared` instances can NOT access non-shared members (opposite of
current behaviour)
2. non-shared can promote to `shared` (like const)

I think this restriction in access corrects the issues associated with allowing non-shared -> shared promotion.
October 07, 2018
On Sunday, 7 October 2018 at 02:01:17 UTC, Manu wrote:

>> The thing I'm trying to model is an attribute along the lines of
>> `shared`, but actually useful ;)
>> I'll use the attribute `threadsafe` in place of `shared`, and see
>> where that goes.
>>
>> Consider:
>> struct Bob
>> {
>>   int x;
>>   threadsafe Atomic!int y;

Storing shared and local data together? Ew, cache poison. Be that as it may, given that definition:

1. Can you copy Bob or assign to it? If so, how?
2. Can Bob have a destructor? Who calls it if it can?

[snip]

>>   x.m1(); // ERROR, method not threadsafe
>>   x.m2(); // fine
>>   x.overloaded(); // fine, use the threadsafe overload

I guess these three should be y., not x.?

>>   threadsafe Bob* p = &x; // can take threadsafe reference to
>> thread-local object

I'm not sure what that's supposed to accomplish. It looks like a silent cast, which is already a red flag. What is the purpose of this? Give that pointer to another thread while the original continues to treat 'x' as thread-local? I.e. if the arguments were indeed `ref`, the caller would be blissfully unaware of such a "transaction" taking place.

>> This is loosely what `shared` models, but there's a few differences:
>> 1. thread-local can NOT promote to shared
>> 2. shared `this` applies to members
>>
>> For `shared` to be useful, it should be that a shared reference to something inhibits access to its thread-local stuff. And in that world, I believe that thread-local promotion to shared would work like const does.
>>
>> I guess I'm wondering; should `shared` be transitive? Perhaps that's what's wrong with it...?

IMHO, `shared` should be transitive, but... most uses of `shared` should just boil down to primitive types (i.e. atomics) and pointers to shared primitives and structs. Dereferencing the latter requires synchronization, casting away `shared` in the process, and I don't see how the language as it is can help with that, as accessing different kinds of data requires different code. We can't just stuff all the nuances of thread-safety and multi-core communication into one keyword, whatever that keyword may be.
What could be the static (compile-time) guarantees of `threadsafe` methods?
October 06, 2018
On Sat, Oct 6, 2018 at 8:10 PM Stanislav Blinov via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Sunday, 7 October 2018 at 02:01:17 UTC, Manu wrote:
>
> >> The thing I'm trying to model is an attribute along the lines
> >> of
> >> `shared`, but actually useful ;)
> >> I'll use the attribute `threadsafe` in place of `shared`, and
> >> see
> >> where that goes.
> >>
> >> Consider:
> >> struct Bob
> >> {
> >>   int x;
> >>   threadsafe Atomic!int y;
>
> Storing shared and local data together? Ew, cache poison. Be that as it may, given that definition:

I don't think mixing would happen in practice, but the rules need to be defined.

> 1. Can you copy Bob or assign to it? If so, how?
> 2. Can Bob have a destructor? Who calls it if it can?

If they're mutable, sure.
If Bob was threadsafe, then it would have to have a matching
attributed constructor/destructor/copy-ctor, etc.

> >>   x.m1(); // ERROR, method not threadsafe
> >>   x.m2(); // fine
> >>   x.overloaded(); // fine, use the threadsafe overload
>
> I guess these three should be y., not x.?

Yes, sorry.


> >>   threadsafe Bob* p = &x; // can take threadsafe reference to
> >> thread-local object
>
> I'm not sure what that's supposed to accomplish. It looks like a silent cast, which is already a red flag. What is the purpose of this?

This is the primitive that allows calling a threadsafe method. You can call a const method because of this same mechanic.
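
For reference, the const mechanic being compared to looks like this in current D. `Counter`, `bump` and `peek` are made-up names for illustration; nothing here uses the proposed `threadsafe` attribute.

```d
struct Counter
{
    int n;

    void bump() { ++n; }           // needs a mutable `this`
    int peek() const { return n; } // promises not to mutate
}

void main()
{
    Counter c;
    c.bump();

    // Mutable implicitly converts to const, so this call is legal...
    assert(c.peek() == 1);

    // ...and so is taking a const pointer to a mutable object.
    const(Counter)* p = &c;
    assert(p.peek() == 1);
    // p.bump(); // would not compile: bump() is not const
}
```

The proposal is that `threadsafe Bob* p = &x` would narrow the usable interface the same way `const(Counter)* p = &c` does.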

> Give that pointer to another thread while the original
> continues to treat 'x' as thread-local? I.e. if the arguments
> were indeed `ref`, the caller would be blissfully unaware of such
> a "transaction" taking place.

I'm going to move to the position from the most recent post in the
thread (you are responding to the first post).
If shared instances were unable to access their members, then it's
possible that you *could* pass this to another thread (I don't think
you should, but I don't think it's dangerous anymore).
Any shared reference to the object is not able to mutate the object...
they would need to cast shared away, which is unsafe... they would
need to do that deliberately by acquiring a lock, or by whatever other
mechanism gives them temporary thread-local access to the object.
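
A rough sketch of that acquire-lock-and-cast pattern as it can be written in D today, assuming a `core.sync.mutex.Mutex` guards the object. The names `bobLock`, `sharedBob` and `update` are hypothetical, and nothing in the language ties the lock to the data; that association is purely a convention here.

```d
import core.sync.mutex : Mutex;

struct Bob
{
    int x;
    int y;

    void readAndMutate() { if (x) y += 1; }
}

__gshared Mutex bobLock; // hypothetical lock that guards `sharedBob`
shared Bob sharedBob;

void update()
{
    bobLock.lock();
    scope (exit) bobLock.unlock();

    // While the lock is held we claim exclusive access, so the cast is
    // deliberate and easy to grep for.
    Bob* local = cast(Bob*) &sharedBob;
    local.x = 1;
    local.readAndMutate();
}

void main()
{
    bobLock = new Mutex;
    update();
    assert((cast(Bob*) &sharedBob).y == 1);
}
```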

> >> This is loosely what `shared` models, but there's a few
> >> differences:
> >> 1. thread-local can NOT promote to shared
> >> 2. shared `this` applies to members
> >>
> >> For `shared` to be useful, it should be that a shared reference to something inhibits access to its thread-local stuff. And in that world, I believe that thread-local promotion to shared would work like const does.
> >>
> >> I guess I'm wondering; should `shared` be transitive? Perhaps that's what's wrong with it...?
>
> IMHO, `shared` should be transitive, but... most uses of `shared`
> should just boil down to primitive types (i.e. atomics) and
> pointers to shared primitives and structs. With dereferencing
> latter requiring to be synchronized, casting away `shared` in the
> process, and I don't see how the language as it is can help with
> that, as accessing different kinds of data requires different
> code. We can't just stuff all the nuances of thread-safety and
> multi-core communication into one keyword, whatever that keyword
> may be.
> What could be the static (compile-time) guarantees of
> `threadsafe` methods?

We're not trying to 'stuff the nuances' into a keyword... what I'm
trying to achieve is a mechanism for attributing that a function has
implemented thread-safety *in some way*, and how that works is a
detail for the function.
What the attribute needs to do, is control the access rights to the
object appropriately, so that if you're in custody of a shared object,
you should exclusively be limited to performing guaranteed thread-safe
operations on the object, OR, you must perform synchronisation and
cast shared away (as is current use of shared).

You need to read the rest of the thread... I've moved on from this initial post.
I agree, shared must be transitive, but I think the effect that it
can't access un-shared members may be the innovation we're looking for
which will make shared useful.
October 07, 2018
On Sunday, 7 October 2018 at 04:16:43 UTC, Manu wrote:

> We're not trying to 'stuff the nuances' into a keyword... what I'm trying to achieve is a mechanism for attributing that a function has
> implemented thread-safety *in some way*, and how that works is a
> detail for the function.
> What the attribute needs to do, is control the access rights to the
> object appropriately, so that if you're in custody of a shared object,
> you should exclusively be limited to performing guaranteed thread-safe
> operations on the object, OR, you must perform synchronisation and
> cast shared away (as is current use of shared).

Then I maintain that `T*` should *not* implicitly cast to `threadsafe T*`. It should at least be an explicit cast that you could grep for. Consider:

struct Bob {
    int x;
    int y;

    void readAndMutate() /* thread-local */ {
        if (x) y += 1;
    }

    void readAndMutate() threadsafe {
        auto lock = getSomeLockThatPresumablyLocksThisInstance();
        auto unshared = cast(Bob*) &this;
        unshared.readAndMutate();
    }
}

void sendToAnotherThread(threadsafe Bob* bob) {
    // pass the pointer to some thread...
}

Bob bob; // not marked `threadsafe`

void main() {

    bob.x = 1;

    sendToAnotherThread(&bob);

    bob.x = 0; // <-- that's a bug
    auto copyOfBob = bob; // <-- that's another bug

    // do the rest of main...
}

Basically, *any* non-`threadsafe` access to `bob` should ideally be a compiler error after the cast, but I don't think the language could enforce that.

These contrived examples aren't that convincing, I'm sure, but imagine this implicit cast hiding somewhere in a 10Kloc module.
In your own words: "I don't think that you should...": then we would need at least some way to succinctly find such problems.

There's actually an even more subtle bug in `main` above. Since the `bob` instance is not marked in any way, the compiler (optimizer) would have no idea that e.g. moving reads and writes to bob's fields across that implicit cast must be illegal. I guess a cast should then act as a compiler barrier.

> You need to read the rest of the thread... I've moved on from this initial post.

Yeah, sorry about that. The thread also got split :\

October 07, 2018
On Sunday, 7 October 2018 at 02:01:17 UTC, Manu wrote:
> ... but I'm really struggling
> to express it in terms of the type system...

I'm pretty sure no simple attribute system is any more useful than the current const/shared idiom. I have yet to see a language with semantics that actually help with concurrency issues (serializability, lock domains, lock ordering, deadlock prevention/detection, consistency, write skew and hundreds of other problems already known in the domain) on mutable state. SQL didn't solve it: I still always have to grab pen and paper and brute-force simulate concurrent access to my data in order to have any degree of prior knowledge about safety, and unfortunately this problem does not look like an easy one to crack for language designers.

This is why I mainly ignore shared as a feature completely unless I'm forced to. A simple indication (type system attribute) of the fact that data is shared is useless for my recent projects; it just adds unneeded casts and lines of code without solving any problems.

October 07, 2018
On Sun, Oct 7, 2018 at 10:00 AM Boris-Barboris via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> On Sunday, 7 October 2018 at 02:01:17 UTC, Manu wrote:
> > ... but I'm really struggling
> > to express it in terms of the type system...
>
> I'm pretty sure no simple attribute system is any more useful than the current const/shared idiom. I have yet to see a language with semantics that actually help with concurrency issues (serializability, lock domains, lock ordering, deadlock prevention/detection, consistency, write skew and hundreds of other problems already known in the domain) on mutable state. SQL didn't solve it: I still always have to grab pen and paper and brute-force simulate concurrent access to my data in order to have any degree of prior knowledge about safety, and unfortunately this problem does not look like an easy one to crack for language designers.
>
> This is why I mainly ignore shared as a feature completely unless I'm forced to. A simple indication (type system attribute) of the fact that data is shared is useless for my recent projects; it just adds unneeded casts and lines of code without solving any problems.

In our ecosystem, within the structure of our scheduling, we would
absolutely get loads of use out of shared if it was unable to access
non-shared members.
That's a major step forward in usefulness compared to now, where you
can just read/write to members arbitrarily, which is just plain wrong,
and not useful in any way.
If shared inhibited access to non-shared members, thereby requiring
you to interface with the instance only via shared-attributed methods (or the
traditional acquire-lock-and-cast-shared-away method), then it would
be infinitely more useful than it is now.

If anyone wanted to make an experimental patch with that rule, I could
test-drive it and see how it goes.
Point is, shared as it stands is totally useless. Making it useful, even if it's
not 100% solving threading issues, is a better place to be, and opens
up doors for some new designs and more experience.