October 19, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright
| On Fri., 19 Oct. 2018, 3:10 am Walter Bright via Digitalmars-d, <digitalmars-d@puremagic.com> wrote:
> On 10/17/2018 12:20 AM, Manu wrote:
> > What does it mean 'aliased' precisely?
>
> Aliasing means there are two paths to the same piece of data. That could be two pointers pointing to the same data, or one pointer to a variable that is accessible by name.
The reason I ask is because, by my definition, if you have:
int* a;
shared(int)* b = a;
While you have 2 numbers that address the same data, it is not actually aliased because only `a` can access it. It is not aliased in any practical sense.
> > It doesn't really give us anything in practice that we don't have in C++.
>
> It provides a standard, enforced way to distinguish shared data from unshared data, and no way to bypass it in @safe code. There's no way to do that in C++.
Right, but we can do so much better. I want shared to model "what is thread-@safe to do", because that models what you are able to do, and what APIs should encourage when operating on `shared` things. Exclusively distinguishing shared and unshared data is not an interesting distinction if shared data has no access. I've been trying to say over and over; ignore what you think you know about that definition, accept my rules strictly as given (they're very simple and concise, there's only 2 rules), such that shared will mean "is threadsafe to call with this data" when applied to function args... Build the thought experiment outward from there. That's an interesting and useful definition for shared, and it leads to a framework where shared is useful in a fully @safe SMP program, and even models @safe transitions across unshared -> shared boundaries (parallel for, map/reduce, etc, fork and join style workloads), which are typical lock-free patterns.
Lock-and-cast semantics are preserved unchanged for those that interact in the way shared is prescribed today, but strictly modeling that workflow is uninteresting, because it's unsafe by definition. I'm not losing that, but I'm trying to introduce a safe workflow that exists in complement, and my model works. I don't even know if we have a mutex defined in our codebase, we don't use them... but we can max out a 64-core Threadripper. |
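For readers unfamiliar with the "lock-and-cast" workflow mentioned above, here is a minimal sketch of how it tends to look in today's D. The `Account` class, its members and `makeAccount` are hypothetical names invented for illustration; only `core.sync.mutex.Mutex` is a real druntime type. The point is that each shared method has to cast `shared` away by hand, which is why the post calls the workflow unsafe by definition.

```d
import core.sync.mutex : Mutex;

// Hypothetical example type, not taken from the thread.
class Account
{
    private int balance;   // plain data, guarded by mtx by convention only
    private Mutex mtx;

    this() { mtx = new Mutex; }

    // Shared entry point: cast shared away, take the lock, touch the data.
    void deposit(int amount) shared
    {
        auto self = cast(Account) this;   // the type system stops checking here
        self.mtx.lock();
        scope (exit) self.mtx.unlock();
        self.balance += amount;
    }

    int total() shared
    {
        auto self = cast(Account) this;
        self.mtx.lock();
        scope (exit) self.mtx.unlock();
        return self.balance;
    }
}

// Creating the shared view also needs an explicit cast.
shared(Account) makeAccount()
{
    return cast(shared) new Account();
}
```

Nothing in this sketch is checked by the compiler beyond "a cast was written"; whether the lock actually protects `balance` is entirely up to the author.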
October 19, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On Fri, Oct 19, 2018 at 9:45 AM Steven Schveighoffer via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 10/18/18 9:09 PM, Manu wrote:
> > On Thu, Oct 18, 2018 at 5:30 PM Timon Gehr via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> >> On 18.10.18 23:34, Erik van Velzen wrote:
> >>> If you have an object which can be used in both a thread-safe and a thread-unsafe way that's a bug or code smell.
> >>
> >> Then why do you not just make all members shared? Because with Manu's proposal, as soon as you have a shared method, all members effectively become shared.
> >
> > No they don't, only facets that overlap with the shared method. I tried to present an example before:
> >
> > struct Threadsafe
> > {
> >     int x;
> >     Atomic!int y;
> >     void foo() shared { ++y; } // <- shared interaction only affects 'y'
> >     void bar() { ++x; ++y; } // <- not threadsafe, but does not violate foo's commitment; only interaction with 'y' has any commitment associated with it
> >     void unrelated() { ++x; } // <- no responsibilities are transposed here, you can continue to do whatever you like throughout the class where 'y' is not concerned
> > }
> >
> > In practice, and in my direct experience, classes tend to have exactly one 'y', and either zero (pure utility), or many such 'x' members. Threadsafe API interacts with 'y', and the rest is just normal thread-local methods which interact with all members thread-locally, and may also interact with 'y' while not violating any threadsafety commitments.
>
> I promised I wouldn't respond, I'm going to break that (obviously).
>
> But that's because after reading this description I ACTUALLY understand what you are looking for.
>
> I'm going to write a fuller post later, but I can't right now. But the critical thing here is, you want a system where you can divvy up a type into pieces you share and pieces you don't. But then you *don't* want to have to share only the shared pieces. You want to share the whole thing and be sure that it can't access your unshared pieces.
>
> This critical requirement makes things a bit more interesting. For the record, the most difficult thing to reaching this understanding was that whenever I proposed anything, your answer was something like 'I just can't work with that', and when I asked why, you said 'because it's useless', etc. Fully explaining this point is very key to understanding your thinking.
>
> To be continued...
I'm glad that there's movement here... but I'm still not 100% convinced you understood me; perhaps getting close though. I only say that because your description above has a whole lot more words and complexity than is required to express my proposal. If you perceive that complexity in structural terms, then I am still not clearly understood.
> "divvy up a type into pieces"
This is an odd mental model of what I'm saying, and I can't sympathise with those words, but if they work for you, and we agree on the semantics, then sure... If you write an object with some const methods and some non-const methods, then take a const instance of the object... you can only call the const methods. Have you 'divvied up the type' into a const portion and a non-const portion? If the answer is yes, then I can accept your description.
I would talk in terms of restriction:
An object has 4 functions, 2 are mutable, 2 are const... you apply const to the type and you are *restricted* to only calling the 2 const functions.
An object has 4 functions, 2 are unshared, 2 are shared... you apply shared to the type and you are *restricted* to only calling the 2 shared (threadsafe) functions.
I haven't 'broken the type up', I'm just restricting what you can do to it from within a particular context. In the const context, you can't mutate it. In the shared context, you can't do un-threadsafe things to it, and the guarantee of that is embedded in the rules:
1. shared data can not be read or written
2. shared methods must be threadsafe
  a. this may require that they be @trusted at the low-level, like the methods of `Atomic(T)`
  b. no other method may violate the shared method's promise, otherwise it does not actually deliver its promise
    i. I have extensive experience with this and it's just not a problem in practice, but compiler technology to assist would be welcome!
    ii. This does NOT mean the not-shared methods are somehow shared; they just need to be careful when interacting with some (usually small) subset of members
This design implies strong encapsulation, but that's a naturally occurring tendency when implementing anything that's threadsafe. As a helpful best practice to assure that non-shared methods don't undermine a shared method's commitment; prefer to interact with volatile members via accessors or properties that are themselves shared (can be private if you like). You will find that it's not actually hard to deliver on the object's commitment. If you write tooling that is at the level one-up from the bottom of the stack or higher, it should be unusual that you ever need to write a @trusted method, in which case delivering on *your* shared method's promise is implicit. |
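The const/shared "restriction" analogy and the two rules can be sketched in today's D roughly as follows. This is an illustration under stated assumptions, not the proposal's implementation: the `Counter` type and its member names are hypothetical, and `core.atomic` stands in for the `Atomic` wrapper the post refers to. The `static assert` lines record which calls the compiler accepts for a const view and for a shared view.

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical example type, not taken from the thread.
struct Counter
{
    private int scratch;       // thread-local state (the 'x' of the quoted example)
    private shared int hits;   // the one member the shared API touches (the 'y')

    // Shared methods: threadsafe because every access to hits is atomic.
    void bump() shared { atomicOp!"+="(hits, 1); }
    int total() shared { return atomicLoad(hits); }

    // Unshared methods: free to use scratch, and they reach hits
    // only through atomics, so bump's promise is not undermined.
    void note() { ++scratch; atomicOp!"+="(hits, 1); }
    int notes() const { return scratch; }
}

// const restricts a caller to the const methods...
static assert( __traits(compiles, { const Counter c; auto n = c.notes(); }));
static assert(!__traits(compiles, { const Counter c; c.note(); }));

// ...and shared restricts a caller to the shared methods in the same way.
static assert( __traits(compiles, { shared Counter c; c.bump(); }));
static assert(!__traits(compiles, { shared Counter c; c.note(); }));
```

Rule 2.b is the part the compiler does not check here: nothing would stop `note` from writing `hits` non-atomically, which is the author's responsibility in this model.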
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 10/19/2018 11:18 PM, Manu wrote:
> The reason I ask is because, by my definition, if you have:
> int* a;
> shared(int)* b = a;
>
> While you have 2 numbers that address the same data, it is not actually aliased because only `a` can access it.
They are aliased, by code that believes it is unshared, and code that believes it is shared. This is not going to work.
> Exclusively distinguishing shared and unshared data is not an interesting distinction if shared data has no access.
Somehow, you still have to find a way to give the shared path access, through a gate or a cast or a lock or whatever. And then it breaks, because two different threads are accessing the same data each thinking that data is not shared. |
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 20 October 2018 at 09:04:17 UTC, Walter Bright wrote:
> Somehow, you still have to find a way to give the shared path access, through a gate or a cast or a lock or whatever. And then it breaks, because two different threads are accessing the same data each thinking that data is not shared.
When you say that, then under Manu's proposal and the code below:
class C {
    void f();
    void g() shared;
}
void t1(shared C c) {
    c.g; // ok
    c.f; // error
}
void t2(shared C c) {
    c.g; // ok
    c.f; // error
}
auto c = new C();
spawn(&t1, c);
spawn(&t2, c);
c.f; // ok
c.g; // ok
Do you mean the implementation of C.g? Since that is shared wouldn't that just be a normal understanding that you'd need to synchronize the data access in shared since it's a shared method? And if you mean C.f, then if that accessed data (that was accessed by C.g) unsafely, then that's just a bad implementation no? Or?
Cheers,
- Ali
|
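The snippet in the post above assumes the proposed implicit conversion to `shared`. A version that builds with today's rules might look like the sketch below. The method bodies, the `hits` counter, the single `worker` function (standing in for `t1` and `t2`) and the `run` helper are assumptions added for illustration; the explicit `cast(shared)` is what current compilers require in place of the implicit conversion.

```d
import core.atomic : atomicLoad, atomicOp;
import std.concurrency : ownerTid, receiveOnly, send, spawn;

class C
{
    private int local;        // only ever touched by the owning thread via f
    private shared int hits;  // touched by g from any thread, always atomically

    void f() { ++local; }
    void g() shared { atomicOp!"+="(hits, 1); }
    int total() shared { return atomicLoad(hits); }
}

// Stands in for both t1 and t2 of the post.
void worker(shared C c)
{
    c.g();                  // ok: shared method on a shared reference
    // c.f();               // error: f is not shared
    send(ownerTid, true);   // tell the owner this worker is done
}

int run()
{
    auto c = new C();
    auto sc = cast(shared) c;   // explicit and not @safe today
    spawn(&worker, sc);
    spawn(&worker, sc);
    c.f();                      // ok: the owner keeps its unshared view
    sc.g();                     // today the owner needs the shared view for g
    receiveOnly!bool();         // wait for both workers
    receiveOnly!bool();
    return sc.total();          // two worker calls plus one owner call
}
```

Whether `C.f` may safely run while the workers call `C.g` depends entirely on `f` leaving `hits` alone, which is the point the question above is circling.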
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to aliak | On Saturday, 20 October 2018 at 16:18:53 UTC, aliak wrote:
> class C {
>     void f();
>     void g() shared;
> }
>
> void t1(shared C c) {
>     c.g; // ok
>     c.f; // error
> }
>
> void t2(shared C c) {
>     c.g; // ok
>     c.f; // error
> }
>
> auto c = new C();
> spawn(&t1, c);
> spawn(&t2, c);
> c.f; // ok
> c.g; // ok
Those are not "ok". They're only "ok" under Manu's proposal so long as the author of C promises (via documentation) that that's indeed "ok". There can be no statically-enforced guarantees that those calls are "ok", or that issuing them in that order is "ok". Yet Manu keeps insisting that somehow there is.
|
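The objection above, that the guarantee rests on the author's promise and not on the compiler, can be illustrated with a small sketch. The `Cell` type is hypothetical. Its shared methods are atomic, yet the unshared `bump` performs a plain read-modify-write on the same field and the compiler accepts it without complaint.

```d
import core.atomic : atomicLoad, atomicStore;

// Hypothetical example type, not taken from the thread.
struct Cell
{
    private int value;

    // Shared methods: atomic, so they look threadsafe in isolation.
    void set(int v) shared { atomicStore(value, v); }
    int get() shared { return atomicLoad(value); }

    // Unshared method: a plain, non-atomic read-modify-write on the same
    // field. It compiles, but would race with set() if another thread
    // held a shared reference to this object.
    void bump() { value = value + 1; }
}

// The type system does stop bump() from being called through a shared view...
static assert(!__traits(compiles, { shared Cell s; s.bump(); }));
// ...but nothing flags bump() itself as breaking set()'s promise.
```

Run on a single thread the code behaves as expected; the hazard only appears once the same object is reachable from a second thread, and no static check points at `bump`.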
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Saturday, 20 October 2018 at 09:04:17 UTC, Walter Bright wrote:
> On 10/19/2018 11:18 PM, Manu wrote:
>> The reason I ask is because, by my definition, if you have:
>> int* a;
>> shared(int)* b = a;
>>
>> While you have 2 numbers that address the same data, it is not actually aliased because only `a` can access it.
>
> They are aliased,
Quoting Wikipedia:
> two pointers A and B which have the same value, then the name A[0] aliases the name B[0]. In this case we say the pointers A and B alias each other. Note that the concept of pointer aliasing is not very well-defined – two pointers A and B may or may not alias each other, depending on what operations are performed in the function using A and B.
In this case given the above: `a[0]` does not alias `b[0]` because `b[0]` is ill defined under Manu's proposal, because the memory referenced by `a` is not reachable through `b` because you can't read or write through `b`.
> by code that believes it is unshared
You cannot `@safe`ly modify the memory through `b`; `a`'s view of the memory is unchanged in @safe code.
> and, code that believes it is shared.
You cannot have non-atomic access through `b`; `b` has no @safe view of the memory, unless it is atomic (which by definition is synchronised).
> This is not going to work.
Au contraire. |
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Nicholas Wilson | On Saturday, 20 October 2018 at 16:48:05 UTC, Nicholas Wilson wrote:
> On Saturday, 20 October 2018 at 09:04:17 UTC, Walter Bright wrote:
>> On 10/19/2018 11:18 PM, Manu wrote:
>>> The reason I ask is because, by my definition, if you have:
>>> int* a;
>>> shared(int)* b = a;
>>>
>>> While you have 2 numbers that address the same data, it is not actually aliased because only `a` can access it.
>>
>> They are aliased,
>
> Quoting Wikipedia:
>
>> two pointers A and B which have the same value, then the name A[0] aliases the name B[0]. In this case we say the pointers A and B alias each other. Note that the concept of pointer aliasing is not very well-defined – two pointers A and B may or may not alias each other, depending on what operations are performed in the function using A and B.
>
> In this case given the above: `a[0]` does not alias `b[0]` because `b[0]` is ill defined under Manu's proposal, because the memory referenced by `a` is not reachable through `b` because you can't read or write through `b`.
>
>> by code that believes it is unshared
>
> you cannot `@safe`ly modify the memory through `b`, `a`'s view of the memory is unchanged in @safe code.
And that's already a bug, because the language can't enforce threadsafe access through `a`, regardless of presence of `b`. Only the programmer can.
>> and, code that believes it is shared.
>
> you cannot have non-atomic access through `b`, `b` has no @safe view of the memory, unless it is atomic (which by definition is synchronised).
Synchronized with what? You still have `a`, which isn't `shared` and doesn't require any atomic access or synchronization. At this point it doesn't matter if it's an int or a struct. As soon as you share `a`, you can't just pretend that reading or writing `a` is safe. Encapsulate it all you want, safety only remains a contract of convention, the language can't enforce it. |
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stanislav Blinov | On Saturday, 20 October 2018 at 16:41:41 UTC, Stanislav Blinov wrote:
> On Saturday, 20 October 2018 at 16:18:53 UTC, aliak wrote:
>
>> class C {
>>     void f();
>>     void g() shared;
>> }
>>
>> void t1(shared C c) {
>>     c.g; // ok
>>     c.f; // error
>> }
>>
>> void t2(shared C c) {
>>     c.g; // ok
>>     c.f; // error
>> }
>>
>> auto c = new C();
>> spawn(&t1, c); // line 20
>> spawn(&t2, c); // line 21
>> c.f; // ok
>> c.g; // ok // line 23
>
> Those are not "ok". They're only "ok" under Manu's proposal so long as the author of C promises (via documentation) that that's indeed "ok". There can be no statically-enforced guarantees that those calls are "ok", or that issuing them in that order is "ok". Yet Manu keeps insisting that somehow there is.
Backing up a bit and making a few observations (after adding imports and wrapping the bottom code in a function):
1. the code above currently does not compile, error messages are:
i. line 20 & 21: spawn fails to instantiate because c is not shared
ii. line 23: shared method C.g is not callable using a non-shared object
iii. the lines already marked // error
2. in order to fix 1.i, one must cast c to shared at the call site, this is not @safe
3. fixing 1.ii requires doing `(cast(shared)c).g`, this is also not @safe
4. fixing 1.iii requires casting away shared, this is not only not @safe, but also wrong. c is a class so one could try locking it although I'm not sure what the implications are for doing that when another thread owns the data, probably bad.
5. the current means of dealing with shared with lock and cast away shared is also not @safe
6. under Manu's proposal reading and writing shared objects results in compilation error
7. The static guarantees we have in the language are type safety and @safe
8. under Manu's proposal to do anything one must call shared functions on said object, this implies a "@trusted" implementation at the bottom of the stack for ensuring thread safety (atomics and lock + cast (assuming it is not wrong), other sync primitives) that are not @safe, but not outright wrong either.
The question then becomes: assuming the implementation _is_ @safe, type correct, thread safe etc., can the author of C provide guarantees of @safe and type correctness? And can this guarantee be free of false positives?
Currently the answer is no: the requirement to cast to and from shared is un-@safe and that burden is on the user, which means that they must understand the inner workings of C to know if that is the case.
Manu's proposal is slightly more interesting. shared becomes a guarantee that accesses to that object will not race, assuming that the @trusted implementation at the bottom of the stack is correct. In the above if t1 and t2 took `const shared C` and `g` was also const shared, then I think that it could.
|
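Observations 1.i to 1.iii above can be written down as compile-time checks. This is a sketch against the compiler behaviour described in the post, so the assertions would need revisiting if the rules for `shared` change. The empty method bodies are an assumption added so that the class is complete.

```d
import std.concurrency : spawn;

class C
{
    void f() {}
    void g() shared {}
}

void t1(shared C c) { c.g(); }

// 1.i: spawn rejects an unshared class reference.
static assert(!__traits(compiles, { auto c = new C(); spawn(&t1, c); }));

// 1.ii: a shared method is not callable on a non-shared object.
static assert(!__traits(compiles, { auto c = new C(); c.g(); }));

// 1.iii: an unshared method is not callable on a shared object.
static assert(!__traits(compiles, { shared C c; c.f(); }));

// Points 2 and 3: explicit casts to shared make the calls compile,
// at the cost of leaving @safe code.
static assert(__traits(compiles, {
    auto c = new C();
    auto sc = cast(shared) c;
    spawn(&t1, sc);
    sc.g();
    c.f();
}));
```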
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Stanislav Blinov | On Saturday, 20 October 2018 at 17:06:22 UTC, Stanislav Blinov wrote:
> On Saturday, 20 October 2018 at 16:48:05 UTC, Nicholas Wilson wrote:
>> On Saturday, 20 October 2018 at 09:04:17 UTC, Walter Bright wrote:
>>> by code that believes it is unshared
>>
>> you cannot `@safe`ly modify the memory through `b`, `a`'s view of the memory is unchanged in @safe code.
>
> And that's already a bug, because the language can't enforce threadsafe access through `a`, regardless of presence of `b`. Only the programmer can.
Access through `a` is through the owned reference; threadsafety through `a` doesn't mean anything, all _other_ accesses must ensure that they are ordered correctly.
>>> and, code that believes it is shared.
>>
>> you cannot have non-atomic access through `b`, `b` has no @safe view of the memory, unless it is atomic (which by definition is synchronised).
>
> Synchronized with what? You still have `a`, which isn't `shared` and doesn't require any atomic access or synchronization.
Synchronized w.r.t any writes to that memory, e.g. from `a`.
> At this point it doesn't matter if it's an int or a struct.
Yes.
> As soon as you share `a`, you can't just pretend that reading or writing `a` is safe.
You can if no-one else writes to it, which is the whole point of Manu's proposal. Perhaps it should be const shared instead of shared but still. |
October 20, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Sat, Oct 20, 2018 at 2:05 AM Walter Bright via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 10/19/2018 11:18 PM, Manu wrote:
> > The reason I ask is because, by my definition, if you have:
> > int* a;
> > shared(int)* b = a;
> >
> > While you have 2 numbers that address the same data, it is not actually aliased because only `a` can access it.
>
> They are aliased, by code that believes it is unshared, and code that believes it is shared.
This situation could only occur if you do unsafe code badly. Unlike today, where you must do unsafe code to do any interaction of any kind, you will be able to do fully-safe interaction with the stack of tooling. In that world, any unsafe code and *particularly* where it interacts with shared will be an obnoxious and scary code smell. If it was possible to interact with shared safely, it would be blindingly suspicious when people are likely to be shooting themselves in the foot. The situation you describe here is *exactly* what we have right now, and I'm trying to prevent that.
> This is not going to work
This is an unfair dismissal. Have you tried it? I have. Write me the rules in a patch that I can take for a drive and demonstrate what the stack looks like.
> > Exclusively distinguishing shared and unshared data is not an interesting distinction if shared data has no access.
>
> Somehow, you still have to find a way to give the shared path access, through a gate or a cast or a lock or whatever.
I'm not sure you've understood the proposal. This is the reason for the implicit conversion. It provides safe transition. That's why I'm so insistent on it.
> And then it breaks, because two different threads are accessing the same data each thinking that data is not shared.
This can only occur if you deliberately violate your @safety, and then mess up. This is *exactly* the interaction prescribed to shared today. This is what I'm fixing by making a fully @safe path!
I think you demonstrate here that you haven't understood the reason, or the semantics of my proposal. I'm not sure how to clarify it, what can I give you? |
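To make the "implicit conversion" point concrete, the sketch below shows what handing unshared data to a threadsafe function looks like today. The function names `peek` and `handOff` are hypothetical. The `static assert` records that current compilers reject the conversion the proposal asks for, so the hand-off has to be written as an explicit cast.

```d
import core.atomic : atomicLoad;

// A function whose signature advertises "threadsafe to call with this data":
// it only ever touches the pointee atomically.
int peek(shared(int)* p) { return atomicLoad(*p); }

// Today's rules reject the implicit conversion discussed in the thread...
static assert(!__traits(compiles, { int* a; shared(int)* b = a; }));

// ...so the transition from unshared to shared is an explicit,
// non-@safe cast written by the caller.
int handOff()
{
    int value = 42;
    int* a = &value;
    shared(int)* b = cast(shared(int)*) a;
    return peek(b);
}
```

Under the proposal as described in this thread, the cast line would be replaced by a plain `shared(int)* b = a;`, with the safety argument resting on rule 1 (no direct reads or writes through `b`).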