October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On Tuesday, 16 October 2018 at 02:26:04 UTC, Manu wrote: >> I understand your point but I think the current shared (no implicit conversion) has its uses. It can be quite useful to have one interface for when an object is shared and one for when it is not (one with and without the synchronization cost). Sure, something as trivial as a counter can be re-implemented in both ways but more complex objects would easily result in extreme code duplication. > > If you can give a single 'use', I'm all ears ;) My usages are a custom ref counted template and list types (which are built on top of the former). The ref counted template will initialize naively when not shared but utilize compare-and-set and thread yielding if needed when shared (ensureInitialized can occur any time after declaration). The list types (which are not shared-compatible *yet*) will utilize a read-write mutex only when shared (no cost beyond a little unused memory to non-shared objects). As such, casting any of the non-shared versions of these types to shared would be unsafe. (Technically the types can be safely casted to shared so long as no non-shared reference to them exists and vice versa.) >> It's also easy to acknowledge that implicit conversion to shared has its uses. > I actually know of many real uses for this case. (There was a time when I wanted it as well. I've forgotten what for by now.) > That might be fine. > Like I said in OP, the first point that I think needs to be agreed on, > is that shared can not read or write members. I think that's a > pre-requisite for any interesting development. I will agree that *outsiders* should not be able to read/write members of a shared object. I'm not sure whether your design means even within a shared function such members cannot be read from or written to. If it does, the casting required may get a little annoying but isn't unworkable (especially since an unshared template can do the cast in an @trusted manner). |
October 15, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Isaac S. | On Mon, Oct 15, 2018 at 8:55 PM Isaac S. via Digitalmars-d <digitalmars-d@puremagic.com> wrote: > > On Tuesday, 16 October 2018 at 02:26:04 UTC, Manu wrote: > >> I understand your point but I think the current shared (no implicit conversion) has its uses. It can be quite useful to have one interface for when an object is shared and one for when it is not (one with and without the synchronization cost). Sure, something as trivial as a counter can be re-implemented in both ways but more complex objects would easily result in extreme code duplication. > > > > If you can give a single 'use', I'm all ears ;) > > My usages are a custom ref counted template and list types (which are built on top of the former). The ref counted template will initialize naively when not shared but utilize compare-and-set and thread yielding if needed when shared (ensureInitialized can occur any time after declaration). The list types (which are not shared-compatible *yet*) will utilize a read-write mutex only when shared (no cost beyond a little unused memory to non-shared objects). As such, casting any of the non-shared versions of these types to shared would be unsafe. > > (Technically the types can be safely casted to shared so long as > no non-shared > reference to them exists and vice versa.) Can you link to code? It doesn't sound incompatible with my proposal. Mutexes are blunt instruments, and easy to model. My whole suggestion has virtually no effect on the mutex protected case, which is what most people seem to talk about. > > That might be fine. > > Like I said in OP, the first point that I think needs to be > > agreed on, > > is that shared can not read or write members. I think that's a > > pre-requisite for any interesting development. > > I will agree that *outsiders* should not be able to read/write members of a shared object. I'm not sure whether your design means even within a shared function such members cannot be read from or written to. Absolutely. Shared references can not access members, that's the rule. It's completely unsafe to read or write to members of a shared thing. Period. Attributing a method shared doesn't change that fact. You can create a thread-local (ie, accessible) object from a shared object by acquiring a lock on it for some duration, and it would be helpful if the mechanism that fetches the lock also cast away shared as part of the lock operation. > If it does, the casting required may get a > little annoying but isn't unworkable (especially since an > unshared template can do the cast in an @trusted manner). I agree users shouldn't write casts explicitly, they should use a lock helper or something that grabs the lock and returns the un-shared reference. The existing behaviour is totally unacceptable though; shared objects can read/write arbitrarily... what's the point of that? It gives the impression that it's okay, but it's completely broken. There is no situation where a shared thing can arbitrarily access its members. |
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 2018-10-15 20:46, Manu wrote: > 1. traditional; assert that the object become thread-local by > acquiring a lock, cast shared away Instead of having to explicitly cast away shared we could leverage the synchronized statement. It could be enhanced to allow the following: shared int a; Mutex mutex; synchronized(tlsA; a, mutex) { // within this scope "tlsA" is the same as "a" but without // the "shared" qualifier. "tlsA" is not allowed to escape the block } This could also be implemented as a library function, but that would require casting away shared inside the implementation: guard(a, mutex, (tlsA){ }); void guard(T)(scope shared T value, Mutex mutex, scope void delegate (scope T) block) { mutex.lock_nothrow(); scope (exit) mutex.unlock_nothrow(); block(cast(T) value); } -- /Jacob Carlborg |
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On Monday, 15 October 2018 at 18:46:45 UTC, Manu wrote: > Current situation where you can arbitrarily access shared members > undermines any value it has. The value of shared is existence of thread-local data that's guaranteed to be not shared, so you don't need to worry about thread-local data being shared, and shared protects that value well. > Assuming this world... how do you use shared? Unique solution for each case. > If you write a lock-free queue for instance, and all the methods are > `shared` (ie, threadsafe), then under the current rules, you can't > interact with the object when it's not shared, and that's fairly > useless. Create it as shared. > Assuming the rules above: "can't read or write to members", and the understanding that `shared` methods are expected to have threadsafe implementations (because that's the whole point), what are the risks from allowing T* -> shared(T)* conversion? All data becomes possibly shared, so you can't assume it's unshared, effectively C-style sharing. BTW D supports the latter already. |
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 15.10.2018 20:46, Manu wrote: > > Assuming the rules above: "can't read or write to members", and the > understanding that `shared` methods are expected to have threadsafe > implementations (because that's the whole point), what are the risks > from allowing T* -> shared(T)* conversion? > Unshared becomes useless, and in turn, shared becomes useless. You can't have unshared/shared aliasing. > All the risks that I think have been identified previously assume that > you can arbitrarily modify the data. That's insanity... assume we fix > that... I think the promotion actually becomes safe now...? But useless, because there is no way to ensure thread safety of reads and writes if only one party to the shared state knows about the sharing. |
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On Tuesday, 16 October 2018 at 10:15:51 UTC, Timon Gehr wrote: > On 15.10.2018 20:46, Manu wrote: >> >> Assuming the rules above: "can't read or write to members", and the >> understanding that `shared` methods are expected to have threadsafe >> implementations (because that's the whole point), what are the risks >> from allowing T* -> shared(T)* conversion? >> > > Unshared becomes useless, and in turn, shared becomes useless. why is unshared useless? Unshared means you can read an write to it. If you give it to a function that expect something shared, the function you had given it to can't read or write it, so it can't do any harm. Of course it can handle it threadsave, but as it is local, that is only overhead - reading or changing the value can't do any harm either. I like the idea. > But useless, because there is no way to ensure thread safety of reads and writes if only one party to the shared state knows about the sharing. Of course there is. Giving an unshared value to a function that even can handle shared values may create some overhead, but is indeed threadsave. |
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On Tuesday, 16 October 2018 at 03:00:21 UTC, Manu wrote: >> I don't see how an *implicit* cast can be a restriction. At all. > Because a shared pointer can't access anything. > You can't do anything with a shared instance, so the can be no harm done. That just doesn't compute. You obviously *can* do "anything" with a shared instance, per your own requirements all you need is methods. But at this point, the caller has an unshared reference, and your API possibliy stole away a shared one without the caller ever knowing of it happening. >> It's like we're talking about wholly different things here. Casting should be done by the caller, i.e. a programmer that uses some API. If that API expects shared arguments, the caller better make sure they pass shared values. Implicit conversion destroys any obligations between the caller and the API. > > Why? What could a function do with shared arguments? How, exactly, do you propose getting shared state from one thread to another? Exclusively through globals? >> The problem that I'm suggesting is exactly that: an `int*` is not, and can not, be a `shared int*` at the same time. Substitute int for any type. But D is not Rust and it can't statically prevent that, except for disallowing trivial programming mistakes, which, with implicit conversion introduced, would also go away. > > Why not? The guy who receives the argument receives an argument that > *may be shared*, and as such, he's restricted access to it appropriately. But the guy that *provides* that argument may have no idea of this happening. This is unshared aliasing. module yourapi; // My code does not know about this function at all void giveToThread(shared int* ptr) { /* ... */ } void yourAPI(int* ptr) { giveToThread(ptr); } module mycode; int x; void main() { yourAPI(&x); } > Just like if you receive a const thing, you can't write to it, even if the caller's thing isn't const. You do know why those Rust guys disallowed mutable aliasing? That is, having both mutable and immutable references at the same time. That's a long known problem that in C++ and D is only "solved" by programmer discipline and not much else. > If you receive a shared thing, you can't read or write to it. You can, per your model, through methods. All the while the original is being read and written freely. >> > a good chance they don't need it though, they might not interact with a thread-unsafe portion of the class. >> >> Or they might. > > Then you will implement synchronisation, or have violated your thread-safety promise. Such a good promise it is when it's simply ignored by an implicit cast. >> Because that is exactly the code that a good amount of "developers" will write. Especially those of the "don't think about it" variety. Don't be mistaken for a second: if the language allows it, they'll write it. > > This is not even an argument. > Atomic!int must be used with care. Any threading of ANY KIND must be > handled with care. Implicit casts are in another galaxy as far as handling with care is concerned. > Saying we shouldn't make shared useful because someone can do something wrong is like saying we shouldn't have atomic int's and we shouldn't have spawn(). They're simply too dangerous to give to users... We should make `shared` useful. We shouldn't allow unshared aliasing. >> Can you actually provide an example of a mixed shared/unshared class that even makes sense then? As I said, at this point I'd rather see such definitions prohibited entirely. > > I think this is a typical sort of construction: > > struct ThreadsafeQueue(T) > { > void QueueItem(T*) shared; > T* UnqueueItem() shared; > } That's an interface for a multi-producer multi-consumer, yet you're using it as a multi-producer single-consumer. Those would have very different implementations. > struct SpecialWorkList > { > struct Job { ... } > > void MakeJob(int x, float y, string z) shared // <- any thread may produce a job > { > Job* job = new Job; // <- this is thread-local > PopulateJob(job, x, y, z); // <- preparation of a job might be complex, and worthy of the SpecialWorkList implementation ...except you can't call PopulateJob from MakeJob. The two methods after compiler rewrite are (pseudocode): void struct_method_SpecialWorkList_MakeJob(shared SpecialWorkList* this, int x, float y, string z); void struct_method_SpecialWorkList_PopulateJob(SpecialWorkList* this, Job* job, ...); Or are you also suggesting to allow implicitly demoting shared(T)* to T* ?!? Then you can just throw away `shared`. > jobList.QueueItem(job); // <- QueueItem encapsulates thread-safety, no need for blunt casts None were needed here regardless. > void Flush() // <- not shared, thread-local consumer > { I.e. this can only be called by the thread that owns this instance, because only that thread can have an unshared reference to it. > void GetSpecialSystemState() // <- this has NOTHING to do with the threadsafe part of SpecialWorkList > { > return os.functionThatChecksSystemState(); > } Therefore it's either static or out of SpecialWorkList. As someone in strong opposition to conflation you should've seen that. > // there may be any number of utility functions that don't interact with jobList. So have a separate interface for them, why put them in the same namespace then? > private: > void PopulateJob(ref Job job, ...) > { > // expensive function; not thread-safe, and doesn't have any > interaction with threading. And cannot be called from `shared` functions without at least casting away `shared`. > This isn't an amazing example, but it's typical of a thing that's > mostly thread-local, and only a small controlled part of it's > functionality is thread-safe. > The thread-local method Flush() also deals with thread-safety > internally... because it flushes a thread-safe queue. > All thread-safety concerns are composed by a utility object, so there's no need for locks, magic, or casts here. > // Casts: auto assumeUnshared(T)(ref shared(T) x) { /* ... */ } struct MPSCQueue(T) { void put(T*) shared; T* remove(); // perhaps also this one: T* steal() shared; } struct Job { /* ... */ } struct JobList { void addJob(Args...)(Args args) shared { auto job = new Job; this.assumeUnshared.populateJob(*job, forward!args); jobs.put(job); } Job* removeJob() { return jobs.remove(); } Job* stealJob() shared { return jobs.steal(); } private: void populateJob(ref Job job, ...) { /* ... */ } MPSCQueue!Job jobs; } struct SpecialWorkList { shared JobList* getJobList() { return jobList; } // here's an implicit cast to the actually "shared" data if you need one: //alias getJobList this; void flush() { auto list = jobList.assumeUnshared; Job* job; while ((job = list.removeJob) !is null) { // ... } } // put anything non-shared here... private: // Even though your 'SpecialWorkList' is unshared, // you're sharing at least this reference with other threads: shared JobList* jobList; } SpecialWorkList makeSpecialWorkList() { SpecialWorkList result; // whatever, allocate the jobList, initialize other state, etc. return result; } void worker(shared int* keepMeltingCPU, shared JobList* jobList) { while (true) { if (!keepMeltingCPU.atomicLoad()) break; // Do some heavy processing... jobList.addJob(/* arguments */); jobList.addJob(/* arguments */); } } void thief(shared int* keepMeltingCPU, shared JobList* jobList) { while (true) { if (!keepMeltingCPU.atomicLoad()) break; jobList.addJob(/* arguments */); jobList.addJob(/* arguments */); /* ... */ // help with jobs Job* job; while ((job = jobList.steal) !is null) { // ... } } } bool readInput() { /* ... */ } void main() { Thread[2] threads; shared int[2] flags = [ 1, 1 ]; SpecialWorkList list = makeSpecialWorkList(); threads[0] = someFunctionThatMakesAThread(&worker, &flags[0], list.getJobList); threads[1] = someFunctionThatMakesAThread(&thief, &flags[1], list.getJobList); while (true) { list.flush(); if (!readInput()) break; } foreach (ref flag; flags) { flag.atomicStore(0); } foreach (t; threads) t.join(); } As you can see, the amount of casting is minimal. It is explicit and documenting the intent. And all the casts are actually the opposite way of what you're proposing. That's because the design lends to just not letting you share thread-local data. The "problem" here is that you can't add jobs from main thread without adding an unshared overload. Which is trivial, but would require a cast. There are, however, deeper problems that don't have anything to do with the `shared` keyword, and start with the call "new Job". I could assume it's just for the sake of example, and you'd actually store the jobs: a) not as pointers b) in aligned and padded storage but since we're nitpicking, and you're very specific in that approach, I'm not sure if I'm safe making such an assumption. >> missing the point altogether. The way `shared` is "implemented" today, the API (`thread` function) *requires* the caller to pass a `shared int*`. Implicit conversion breaks that contract. > > So? > What does it mean to pass a `shared int*`? Exactly what it says: passing a pointer to a shared int. Which is what int* is not. Adding `shared` to a type is not the same thing as "promoting" a mutable to a const, which, the way that's done in D, isn't that awesome a deal to begin with. >> At the highest level, the only reason for taking a `shared` argument is to pass that argument to another thread. > > Not even. This is the most un-useful application I can think of. It doesn't really model the problem at all. There's no feasible alternative in the current type system. > Transfer of ownership is a job for move semantics. There is no transfer of ownership when you're *sharing* data. > shared is for interacting with objects that are *already* owned by many threads. There's only one owner, usually a thread that instantiated the thing in the first place. Ergo, there needs to be a way to "share" that thing. Explicitly. > shared needs to model mechanics to do a limited set of thread-safe > interactions with shared objects that are shared. That would make > shared a useful thing, rather than a giant stain. Agreed. >> That is the *only* way to communicate that intent via the type system for the >> time being. > > ...but that's shit. And it doesn't communicate that intent at all. Bluntly casting attributes on things is a terrible solution to that proposed problem. It's not a solution, it's a crutch. Presumably the programmer would know whether or not that cast is safe in that particular context. Can you propose a better way to distinguish shared and thread-local instances without introducing "thread" built-in type (at which point we should all probably just stop using D)? So long as threading is purely library code, there isn't exactly much room for that. > Yes; everything we think about shared today is completely worthless. > Under my proposal, some existing applications might remain untouched > (they do), but they're not worth worrying about from a design point of view, because they're not really 'designs'. > Focus on making shared a useful thing, and then see where we're at. I quite like the idea of disallowing member access wholesale. For example, things like assignment, copy/postblit should be @disabled for anything `shared` by default, shared destructors should not exist. But this doesn't *really* solve the problem. >> `shared` was >> supposed to protect from unshared aliasing, not silently allow it. > > Inhibiting all access satisfies that protection. It doesn't matter if > a pointer is distributed if you can't access the contents. > Now from there, we need a way to make interacting with guaranteed > thread-safe API's interesting and useful, and I'm describing how to do that. An API cannot be guaranteed thread-safe without some kind of contract in it's signature. >> If you allow implicit conversion, there would literally be no way of knowing whether some API will access your data concurrently, other than plain old documentation (or sifting through it's code, which may not be available). This makes `shared` useless as a type qualifier. > That's the whole point though. > A thread-safe think couldn't care less if the data is shared or not, > because it's threadsafe. > Now we're able to describe what's thread-safe, and what's not. This makes shared *useful* as a type qualifier. You're looking at it backwards, I assume because you keep thinking about `shared` in terms of `const`. `shared` thing cares if the data is shared or not, to which with implicit casting in place there is no compile-time check. >> I think you will agree that passing a pointer to a thread-local variable to another thread is not always a safe thing to do. > That's the problem I'm trying to resolve by removing all access. > I'm trying to make that interaction safe, and that's the key to moving forward as I see it. > If the object is thread-local, then no other thread can access the object in any way, and it's just a fancy int. Not true if you allow to silently cast a pointer to that fancy int to a pointer to a shared int. >> Conditions do apply, which are on you (the programmer) to uphold, and the compiler can't help you with that. The only way the compiler *can* help you here is make sure you don't do that unintentionally. Which it won't be able to do if you allow such implicit conversion. > You need to demonstrate how the implicit conversion may lead to chaos. The conversion is immensely useful, and I haven't thought how it's a problem yet. I already have. With your implicit promotions, you can literally pass off anything as shared. No amount of DIP1000s will safeguard from this, because the only ways to pass data to threads is to either only use globals, or escape it. Therefore the language needs some way of annotating that intent. Currently, `shared` does that. If you don't like that, there needs to be some alternative. |
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dominikus Dittes Scherkl | On 16.10.2018 13:04, Dominikus Dittes Scherkl wrote: > On Tuesday, 16 October 2018 at 10:15:51 UTC, Timon Gehr wrote: >> On 15.10.2018 20:46, Manu wrote: >>> >>> Assuming the rules above: "can't read or write to members", and the >>> understanding that `shared` methods are expected to have threadsafe >>> implementations (because that's the whole point), what are the risks >>> from allowing T* -> shared(T)* conversion? >>> >> >> Unshared becomes useless, and in turn, shared becomes useless. > why is unshared useless? > Unshared means you can read an write to it. > If you give it to a function that expect something shared, > the function you had given it to can't read or write it, so it > can't do any harm. It can do harm to others who hold an unshared alias to the same data and are operating on it concurrently. > Of course it can handle it threadsave, but as it is local, > that is only overhead - reading or changing the value can't do > any harm either. I like the idea. > >> But useless, because there is no way to ensure thread safety of reads and writes if only one party to the shared state knows about the sharing. > Of course there is. Please do enlighten me. You have two processors operating (reading/writing) on the same address space on a modern computer architecture with a weak memory model, and you are using an optimizing compiler. How do you ensure sensible results without cooperation from both of them? (Hint: you don't.) > Giving an unshared value to a function that > even can handle shared values may create some overhead, but is > indeed threadsave. > Yes, if you give it to one function only, that is the case. However, as you may know, concurrency means that there may be multiple functions operating on the data _at the same time_. If one of them operates on the data as if it was not shared, you will run into trouble. You are arguing as if there was either no concurrency or no mutable aliasing. |
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 10/15/18 2:46 PM, Manu wrote:
> Okay, so I've been thinking on this for a while... I think I have a
> pretty good feel for how shared is meant to be.
>
> 1. shared should behave exactly like const, except in addition to
> inhibiting write access, it also inhibits read access.
>
> I think this is the foundation for a useful definition for shared, and
> it's REALLY easy to understand and explain.
>
> Current situation where you can arbitrarily access shared members
> undermines any value it has. Shared must assure you don't access
> members unsafely, and the only way to do that with respect to data
> members, is to inhibit access completely.
> I think shared is just const without read access.
>
> Assuming this world... how do you use shared?
>
> 1. traditional; assert that the object become thread-local by
> acquiring a lock, cast shared away
> 2. object may have shared methods; such methods CAN be called on
> shared instances. such methods may internally implement
> synchronisation to perform their function. perhaps methods of a
> lock-free queue structure for instance, or operator overloads on
> `Atomic!int`, etc.
>
> In practise, there is no functional change in usage from the current
> implementation, except we disallow unsafe accesses (which will make
> the thing useful).
>
>>From there, it opens up another critical opportunity; T* -> shared(T)*
> promotion.
> Const would be useless without T* -> const(T)* promotion. Shared
> suffers a similar problem.
> If you write a lock-free queue for instance, and all the methods are
> `shared` (ie, threadsafe), then under the current rules, you can't
> interact with the object when it's not shared, and that's fairly
> useless.
>
> Assuming the rules above: "can't read or write to members", and the
> understanding that `shared` methods are expected to have threadsafe
> implementations (because that's the whole point), what are the risks
> from allowing T* -> shared(T)* conversion?
>
> All the risks that I think have been identified previously assume that
> you can arbitrarily modify the data. That's insanity... assume we fix
> that... I think the promotion actually becomes safe now...?
>
> Destroy...
>
This is a step in the right direction. But there is still one problem -- shared is inherently transitive.
So casting away shared is super-dangerous, even if you lock the shared data, because any of the subreferences will become unshared and read/writable.
For instance:
struct S
{
int x;
int *y;
}
shared int z;
auto s1 = shared(S)(1, &z);
auto s2 = shared(S)(2, &z);
S* s1locked = s1.lock;
Now I have access to z via s1locked as an unshared int, and I never locked z. Potentially one could do the same thing via s2, and now there are 2 mutable references, potentially in 2 threads.
All of this, of course, is manual. So technically we could manually implement it properly inside S. But this means shared doesn't help us much.
We really need on top of shared, a way to specify something is tail-shared. That is, all the data in S is unshared, but anything it points to is still shared. That at least helps the person implementing the manual locking from doing stupid things himself.
-Steve
|
October 16, 2018 Re: shared - i need it to be useful | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On 10/16/18 9:25 AM, Steven Schveighoffer wrote: > On 10/15/18 2:46 PM, Manu wrote: >>> From there, it opens up another critical opportunity; T* -> shared(T)* >> promotion. >> Const would be useless without T* -> const(T)* promotion. Shared >> suffers a similar problem. >> If you write a lock-free queue for instance, and all the methods are >> `shared` (ie, threadsafe), then under the current rules, you can't >> interact with the object when it's not shared, and that's fairly >> useless. >> Oh, I didn't see this part. Completely agree with Timon on this, no implicit conversions should be allowed. If you want to have a lock-free implementation of something, you can abstract the assignments and reads behind the proper mechanisms anyway, and still avoid locking (casting is not locking). -Steve |
Copyright © 1999-2021 by the D Language Foundation