November 16, 2012
On 15/11/2012 17:33, Sean Kelly wrote:
> On Nov 15, 2012, at 4:54 AM, deadalnix<deadalnix@gmail.com>  wrote:
>
>> On 14/11/2012 21:01, Sean Kelly wrote:
>>> On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu<SeeWebsiteForEmail@erdani.org>   wrote:
>>>>
>>>> This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generate sequentially consistent code with them (i.e. not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
>>>
>>> No.  These functions all contain volatile asm blocks.  If the compiler respected the "volatile" it would be enough.
>>
>> It is sufficient for single-core and mostly correct on x86, but it isn't enough.
>>
>> volatile isn't for concurrency, but for memory mapping.
>
> Traditionally, the term "volatile" is for memory mapping.  The description of "volatile" for D1, though, would have worked for concurrency.  Or is there some example you can provide where this isn't true?

I'm not aware of the D1 compiler inserting memory barriers, so any memory operation reordering done by the CPU would have screwed things up.
November 16, 2012
On 11/16/2012 5:17 PM, Michel Fortin wrote:
> On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh@gmail.com>
> said:
>> While the rest of the proposal was more or less fine, I don't get why we
>> need escape control of the mutex at all - in any case it just opens up a
>> possibility to shoot yourself in the foot.
>
> In case you want to protect two variables (or more) with the same mutex.
> For instance:
>
>      Mutex m;
>      synchronized(m) int next_id;
>      synchronized(m) Object[int] objects_by_id;
>

Wrap it in a struct and it would be even clearer and safer.
struct ObjectRepository {
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;


>      int addObject(Object o)
>      {
>          synchronized(next_id, objects_by_id)

...synchronized(objRepo) with(objRepo)...
Though I'd rather use it as a struct directly.

>              return objects_by_id[next_id++] = o;
>      }
>
> Here it doesn't make sense and is less efficient to have two mutexes,
> since every time you need to lock on next_id you'll also want to lock on
> objects_by_id.
>

Yes. But we shouldn't close our eyes to the rest of the language when deciding how to implement this. Moreover, it makes more sense to pack related stuff (everything under a single lock) into a separate entity.

> I'm not sure how you could shoot yourself in the foot with this. You
> might get worse performance if you reuse the same mutex for too many
> things, just like you might get better performance if you use it wisely.
>

Easily - now the mutex is a separate entity and there is no guarantee that it won't get used for something other than intended. The declaration implies the connection, but I don't see anything preventing abuse.

>
>> But anyway we can make it in the library right about now.
>>
>> synchronized T ---> Synchronized!T
>> synchronized(i){ ... } --->
>>
>> i.access((x){
>> //will lock & cast away shared T inside of it
>>     ...
>> });
>>
>> I fail to see what it doesn't solve (aside of syntactic sugar).
>
> It solves the problem too. But it's significantly more inconvenient to
> use. Here's my example above redone using Synchronized!T:
>
>      Synchronized!(Tuple!(int, Object[int])) objects_by_id;
>
>      int addObject(Object o)
>      {
>          int id;
>          objects_by_id.access((obj_by_id){
>              id = obj_by_id[1][obj_by_id[0]++] = o;
>          });
>          return id;
>      }
>
> I'm not sure if I have to explain why I prefer the first one or not, to
> me it's pretty obvious.

If we made a tiny change in the language to allow a different syntax for passing delegates, mine would shine. Such a change would at the same time enable a nicer way to abstract away control flow.

Imagine:

access(object_by_id){
	...	
};

to be convertible to:

(x){with(x){
	...
}}(access(object_by_id));

More generally speaking a lowering:

expression { ... }
-->
(x){with(x){ ... }}(expression);

AFAIK it doesn't conflict with anything.

Or wait a sec - here's an even simpler idiom that needs no extra features.
Drop the idea of 'access' taking a delegate. The other library idiom is to return an RAII proxy that locks/unlocks an object on construction/destruction.

with(lock(object_by_id))
{
	... do what you like
}


Fine by me. And C++ can't do it ;)


>> The key point is that Synchronized!T is otherwise an opaque type.
>> We could pack a few other simple primitives like 'load', 'store' etc.
>> All of them will go through lock-unlock.
>
> Our proposals are pretty much identical. Yours works by wrapping a
> variable in a struct template, mine is done with a policy object/struct
> associated with a variable. They'll produce the same code and impose the
> same restrictions.

I kind of wanted to point out this disturbing thought about your proposal: a lot of added syntax and rules buys us only a very small gain - prettier syntax.

>
>> Even escaping a reference can be solved by passing inside of 'access'
>> a proxy of T. It could even assert that the lock is indeed held.
>
> Only if you can make a proxy object that cannot leak a reference. It's
> already not obvious how to not leak the top-level reference, but we must
> also consider the case where you're protecting a data structure with the
> mutex and get a pointer to one of its part, like if you slice a container.
>
> This is a hard problem. The language doesn't have a solution to that
> yet. However, having the link between the access policy and the variable
> known by the compiler makes it easier to patch the hole later.
>

It need not be 100% proof against malicious or dumb code. Basic foolproofness is OK.
See my sketch, it could be vastly improved:
https://gist.github.com/4089706

See also Ludwig's work, though he is focused on classes and their monitor mutex.

> What bothers me currently is that because we want to patch all the holes
> while not having all the necessary tools in the language to avoid
> escaping references, we just make using mutexes and things alike
> impossible without casts at every corner, which makes things even more
> bug prone than being able to escape references in the first place.
Well, it's kind of double-edged.

However I do think we need more general tools in the language and niche ones in the library. Precisely because you can pack tons of niche and miscellaneous stuff on the bookshelf ;)

Locks and the like are niche stuff that enables a lot of more common things.

> There are many perils in concurrency, and the compiler cannot protect
> you from them all. It is of the uttermost importance that code dealing
> with mutexes be both readable and clear about what it is doing. Casts in
> this context are an obfuscator.
>

See below about high-level primitives. The code dealing with mutexes has to be small and isolated anyway. Encouraging a pattern of 'just grab the lock and you are golden' is even worse (because it won't break as fast and hard as, e.g., naive atomics will).


>> That and clarifying explicitly what guarantees (aside from being
>> well.. being shared) it provides w.r.t. memory model.
>>
>> Until reaching this thread I was under impression that shared means:
>> - globally visible
>> - atomic operations for stuff that fits in one word
>> - sequentially consistent guarantee
>> - any other forms of access are disallowed except via casts
>
> Built-in shared(T) atomicity (sequential consistency) is a subject of
> debate in this thread. It is not clear to me what will be the
> conclusion, but the way I see things atomicity is just one of the many
> policies you may want to use for keeping consistency when sharing data
> between threads.
>
> I'm not thrilled by the idea of making everything atomic by default.
> That'll lure users to the bug-prone expert-only path while relegating
> the more generally applicable protection systems (mutexes) as a
> second-class citizen.

That's why I think people shouldn't have to use mutexes at all.
Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash maps) and whatnot. Even Java has some useful incarnations of these.

> I think it's better that you just can't do
> anything with shared, or that shared simply disappear, and that those
> variables that must be shared be accessible only through some kind of
> access policy. Atomic access should be one of those access policies, on
> an equal footing with other ones.

This is where casts would be a most unwelcome obfuscator, and there is no sensible way to de-obscure them by using higher-level primitives. Having to say Atomic!X is workable, though.

>
> But if D2 is still "frozen" -- as it was meant to be when TDPL got out
> -- and only minor changes can be made to it now, I don't see much hope
> for its concurrency model. Your Synchronized!T and Atomic!T wrappers
> might be the best thing we can hope for, but they're nothing to set D
> apart from its rivals (I could implement that easily in C++ for instance).

Yeah, but we may tweak some syntax in terms of a lowering or two. I'm of the strong opinion that lock-based multi-threading needs no _specific_ built-in support in the language.

The case is niche and hardly useful outside of helping to build safe high-level primitives in the library. Client code doesn't care that much.
Compared to C++ there is one big thing: no-shared by default. This alone should be immensely helpful, especially when dealing with 3rd-party libraries that 'try hard to be thread-safe' except that they usually are not.

-- 
Dmitry Olshansky
November 17, 2012
On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh@gmail.com> said:

> On 11/16/2012 5:17 PM, Michel Fortin wrote:
>> In case you want to protect two variables (or more) with the same mutex.
>> For instance:
>> 
>>      Mutex m;
>>      synchronized(m) int next_id;
>>      synchronized(m) Object[int] objects_by_id;
>> 
> 
> Wrap in a struct and it would be even much clearer and safer.
> struct ObjectRepository {
> 	int next_id;
> 	Object[int] objects_by_id;
> }
> //or whatever that combination indicates anyway
> synchronized ObjectRepository objRepo;

I guess that'd be fine too.


> If we made a tiny change in the language that would allow different syntax for passing delegates mine would shine. Such a change at the same time enables more nice way to abstract away control flow.
> 
> Imagine:
> 
> access(object_by_id){
> 	...	
> };
> 
> to be convertible to:
> 
> (x){with(x){
> 	...
> }}(access(object_by_id));
> 
> More generally speaking a lowering:
> 
> expression { ... }
> -->
> (x){with(x){ ... }}(expression);
> 
> AFAIK it doesn't conflict with anything.
> 
> Or wait a sec - here's an even simpler idiom that needs no extra features.
> Drop the idea of 'access' taking a delegate. The other library idiom is to return an RAII proxy that locks/unlocks an object on construction/destruction.
> 
> with(lock(object_by_id))
> {
> 	... do what you like
> }
> 
> Fine by me. And C++ can't do it ;)

Clever. But you forgot to access the variable somewhere. What's its name within the with block? Your code would be clearer this way:

	{
		auto locked_object_by_id = lock(object_by_id);
		// … do what you like
	}

And yes you can definitely do that in C++.

I maintain that the "synchronized (var)" syntax is still much clearer, and greppable too. That could be achieved with an appropriate lowering.


>>> The key point is that Synchronized!T is otherwise an opaque type.
>>> We could pack a few other simple primitives like 'load', 'store' etc.
>>> All of them will go through lock-unlock.
>> 
>> Our proposals are pretty much identical. Yours works by wrapping a
>> variable in a struct template, mine is done with a policy object/struct
>> associated with a variable. They'll produce the same code and impose the
>> same restrictions.
> 
> I kind of wanted to point out this disturbing thought about your proposal: a lot of added syntax and rules buys us only a very small gain - prettier syntax.

Sometimes having something built into the language is important: it gives first-class status to some constructs. For instance: arrays. We don't need language-level arrays in D; we could just use a struct template that does the same thing. By integrating a feature into the language we're sending the message that this is *the* way to do it, as no other way can stand on equal footing, preventing infinite reimplementation of the concept within various libraries.

You might be right, however, that mutex-protected variables do not deserve this first-class status.


>> Built-in shared(T) atomicity (sequential consistency) is a subject of
>> debate in this thread. It is not clear to me what will be the
>> conclusion, but the way I see things atomicity is just one of the many
>> policies you may want to use for keeping consistency when sharing data
>> between threads.
>> 
>> I'm not thrilled by the idea of making everything atomic by default.
>> That'll lure users to the bug-prone expert-only path while relegating
>> the more generally applicable protection systems (mutexes) as a
>> second-class citizen.
> 
> That's why I think people shouldn't have to use mutexes at all.
> Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash maps) and whatnot. Even Java has some useful incarnations of these.

I wouldn't say they shouldn't use mutexes at all, but perhaps you're right that they don't deserve first-class treatment. I still maintain that "synchronized (var)" should work, for clarity and consistency reasons, but using a template such as Synchronized!T when declaring the variable might be the best solution.


-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca/

November 17, 2012
On 2012-11-16 15:23:37 +0000, Sean Kelly <sean@invisibleduck.org> said:

> On Nov 16, 2012, at 5:17 AM, Michel Fortin <michel.fortin@michelf.ca> wrote:
> 
>> On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh@gmail.com> said:
>> 
>>> While the rest of the proposal was more or less fine, I don't get why we
>>> need escape control of the mutex at all - in any case it just opens up a
>>> possibility to shoot yourself in the foot.
>> 
>> In case you want to protect two variables (or more) with the same
>> mutex.
> 
> This is what setSameMutex was intended for in Druntime.  Except that no
> one uses it and people have requested that it be removed.  Perhaps
> that's because the semantics aren't great though.

Perhaps it's just my style of coding, but when designing a class that needs to be shared in C++, I usually use one mutex to protect only a couple of variables inside the object. That might mean I have two mutexes in one class for two sets of variables if it fits the access pattern. I also make the mutex private so that derived classes cannot access it. The idea is to strictly control what happens when each mutex is locked so that I can make sure I never have two mutexes locked at the same time without looking at the whole code base. This is to avoid deadlocks, and also it removes the need for recursive mutexes.

I'd like the language to help me enforce this pattern, and what I'm proposing goes in that direction.

Regarding setSameMutex, I'd argue that the semantics of having one mutex for a whole object isn't great. Mutexes shouldn't protect types, they should protect variables. Whether a class needs to protect its variables and how it does it is an implementation detail that shouldn't be leaked to the outside world. What the outside world should know is whether the object is thread-safe or not.

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca/

November 17, 2012
On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
> On Nov 11, 2012, at 6:30 PM, Walter Bright <newshound2@digitalmars.com> wrote:
>> 
>> To make a shared type work in an algorithm, you have to:
>> 
>> 1. ensure single threaded access by acquiring a mutex
>> 2. cast away shared
>> 3. operate on the data
>> 4. cast back to shared
>> 5. release the mutex
>
>
> So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it?  Half the point of the attribute is to protect us from accidents like this.

The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
November 17, 2012
On 2012-11-17 14:22, Michel Fortin wrote:

> Sometime having something built in the language is important: it gives
> first-class status to some constructs. For instance: arrays. We don't
> need language-level arrays in D, we could just use a struct template
> that does the same thing. By integrating a feature into the language
> we're sending the message that this is *the* way to do it, as no other
> way can stand on equal footing, preventing infinite reimplementation of
> the concept within various libraries.

If a feature can be implemented in a library with the same syntax, semantic and performance I see no reason to put it in the language.

-- 
/Jacob Carlborg
November 17, 2012
On 11/17/2012 5:22 PM, Michel Fortin wrote:
> On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh@gmail.com>
>> Or wait a sec. Even simpler idiom and no extra features.
>> Drop the idea of 'access' taking a delegate. The other library idiom
>> is to return a RAII proxy that locks/unlocks an object on
>> construction/destroy.
>>
>> with(lock(object_by_id))
>> {
>>     ... do what you like
>> }
>>
>> Fine by me. And C++ can't do it ;)
>
> Clever. But you forgot to access the variable somewhere. What's its name
> within the with block?

Not having the name would imply you can't escape it :) But I agree it's not always clear where the writes go when doing things inside the with block.

>Your code would be clearer this way:
>
>      {
>          auto locked_object_by_id = lock(object_by_id);
>          // … do what you like
>      }
>
> And yes you can definitely do that in C++.

Well, I actually did this in the past, when C++0x was relatively new.
I just thought 'with' makes it more interesting. As to how to access the variable - that depends on what it is.

>
> I maintain that the "synchronized (var)" syntax is still much clearer,
> and greppable too. That could be achieved with an appropriate lowering.

Yes! If we could make synchronized user-hookable, this would all be clearer and generally useful. There was a discussion about providing user-defined semantics for synchronized blocks. It was clear and useful and a lot of folks were in favor of it, yet it was never submitted as a proposal.

All other things being equal, I believe we should go in this direction - amend a couple of things (say, add a user-hookable synchronized) and start laying bricks for std.sharing.


-- 
Dmitry Olshansky
November 19, 2012
On 15/11/2012 15:22, Sean Kelly wrote:
> On Nov 15, 2012, at 3:05 PM, David Nadlinger<see@klickverbot.at>  wrote:
>
>> On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
>>> On 11/15/12 1:29 PM, David Nadlinger wrote:
>>>> On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
>>>>> That is correct. My point is that compiler implementers would follow
>>>>> some specification. That specification would contain informationt hat
>>>>> atomicLoad and atomicStore must have special properties that put them
>>>>> apart from any other functions.
>>>>
>>>> What are these special properties? Sorry, it seems like we are talking
>>>> past each other…
>>>
>>> For example you can't hoist a memory operation before a shared load or after a shared store.
>>
>> Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.
>
> Usually not, really.  Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release.  However, it's safe to move additional operations into the block where the mutex is held.

If it is known that the memory read/write is thread-local, this is safe, even in the case of a mutex.
November 19, 2012
On 17/11/2012 05:49, Jason House wrote:
> On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
>> On Nov 11, 2012, at 6:30 PM, Walter Bright
>> <newshound2@digitalmars.com> wrote:
>>>
>>> To make a shared type work in an algorithm, you have to:
>>>
>>> 1. ensure single threaded access by acquiring a mutex
>>> 2. cast away shared
>>> 3. operate on the data
>>> 4. cast back to shared
>>> 5. release the mutex
>>
>>
>> So what happens if you pass a reference to the now non-shared object
>> to a function that caches a local reference to it? Half the point of
>> the attribute is to protect us from accidents like this.
>
> The constructive thing to do may be to try and figure out what users
> should be allowed to do with locked shared data... I think the basic idea
> is that no references can be escaped; SafeD rules could probably help
> with that. Non-shared member functions might also need to be tagged with
> their ability to be called on locked, shared data.

Nothing is safe if ownership cannot be statically proven. This is completely useless.
November 19, 2012
On 19.11.2012 05:57, deadalnix wrote:
> Le 17/11/2012 05:49, Jason House a écrit :
>> On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
>>> On Nov 11, 2012, at 6:30 PM, Walter Bright <newshound2@digitalmars.com> wrote:
>>>>
>>>> To make a shared type work in an algorithm, you have to:
>>>>
>>>> 1. ensure single threaded access by acquiring a mutex
>>>> 2. cast away shared
>>>> 3. operate on the data
>>>> 4. cast back to shared
>>>> 5. release the mutex
>>>
>>>
>>> So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
>>
>> The constructive thing to do may be to try and figure out what should users be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
> 
> Nothing is safe if ownership cannot be statically proven. This is completely useless.

But you can at least prove ownership under some limited circumstances. Limited, but (without having tested it on a large scale) still practical.

Interest seems to be far more limited than those circumstances, but anyway: http://forum.dlang.org/thread/k831b6$1368$1@digitalmars.com

(the same approach that I already posted in this thread, but in a state that should be more or less bulletproof)