June 11, 2012
On Mon, 11 Jun 2012 09:39:40 -0400, Artur Skawina <art.08.09@gmail.com> wrote:

> On 06/11/12 14:07, Steven Schveighoffer wrote:
>> However, allocating another heap block to do sharing, in my opinion, is worth the extra cost.  This way, you have clearly separated what is shared and what isn't.
>>
>> You can always cast to get around the limitations.
>
> "clearly separating what is shared and what isn't" *is* exactly what
> tagging the data with 'shared' does.


I posted a response, it showed up in the online forums, but for some reason didn't show up in my nntp client...

If you missed it, it is here.

http://forum.dlang.org/post/op.wfqtz5u0eav7ka@steves-laptop

-Steve
June 11, 2012
On 06/11/12 16:57, Steven Schveighoffer wrote:
> On Mon, 11 Jun 2012 09:39:40 -0400, Artur Skawina <art.08.09@gmail.com> wrote:
> 
>> On 06/11/12 14:07, Steven Schveighoffer wrote:
>>> However, allocating another heap block to do sharing, in my opinion, is worth the extra cost.  This way, you have clearly separated what is shared and what isn't.
>>>
>>> You can always cast to get around the limitations.
>>
>> "clearly separating what is shared and what isn't" *is* exactly what tagging the data with 'shared' does.
> 
> 
> I posted a response, it showed up in the online forums, but for some reason didn't show up in my nntp client...
> 
> If you missed it, it is here.
> 
> http://forum.dlang.org/post/op.wfqtz5u0eav7ka@steves-laptop

The mailing list delivered it too.

I'm against disallowing things that are not unsafe as such and have valid use cases, so we will probably not agree about that.

I considered the GC/mempool implications before arguing for allowing 'shared' fields inside unshared aggregates - the compiler has enough knowledge to pick the right pool, if it ever decides to treat "local" data differently. I'm not sure doing that would be a good idea in cases where the lifetime of an object cannot be determined statically. But deciding to use a global pool can always be done by checking if a shared field exists.

artur
June 11, 2012
On Mon, 11 Jun 2012 09:41:37 -0400, Artur Skawina <art.08.09@gmail.com> wrote:

> On 06/11/12 14:11, Steven Schveighoffer wrote:
>> On Mon, 11 Jun 2012 07:56:12 -0400, Artur Skawina <art.08.09@gmail.com> wrote:
>>
>>> On 06/11/12 12:35, Steven Schveighoffer wrote:
>>
>>>> I wholly disagree.  In fact, keeping the full qualifier intact *enforces* incorrect code, because you are forcing shared semantics on literally unshared data.
>>>>
>>>> Never would this start ignoring shared on data that is truly shared.  This is why I don't really get your argument.
>>>>
>>>> If you could perhaps explain with an example, it might be helpful.
>>>
>>> *The programmer* can then treat shared data just like unshared. Because every
>>> load and every store will "magically" work. I'm afraid that after more than
>>> two or three people touch the code, the chances of it being correct would be
>>> less than 50%...
>>> The fact that you can not (or shouldn't be able to) mix shared and unshared
>>> freely is one of the main advantages of shared-annotation.
>>
>> If shared variables aren't doing the right thing with loads and stores, then we should fix that.
>
> Where do you draw the line?
>
> shared struct S {
>    int i;
>    void* p;
>    SomeStruct s;
>    ubyte[256] a;
> }
>
> shared(S)* p = ... ;
>
> auto v1 = p.i;
> auto v2 = p.p;
> auto v3 = p.s;
> auto v4 = p.a;
> auto v5 = p.i++;
>
> Are these operations on shared data all safe? Note that if these
> accesses would be protected by some lock, then the 'shared' qualifier
> wouldn't really be needed - compiler barriers, that make sure it all
> happens while this thread holds the lock, would be enough. (even the
> order of operations doesn't usually matter in that case and enforcing
> one would in fact add overhead)

No, they should not be all safe, I never suggested that.  It's impossible to engineer a one-size-fits-all for accessing shared variables, because it doesn't know what mechanism you are going to use to protect it.  As you say, once this data is protected by a lock, memory barriers aren't needed.  But requiring a lock is too heavy handed for all cases.  This is a good point to make about the current memory-barrier attempts, they just aren't comprehensive enough, nor do they guarantee pretty much anything except simple loads and stores.

Perhaps the correct way to implement shared semantics is to not allow access *whatsoever* (except taking the address of a shared piece of data), unless you:

a) lock the block that contains it
b) use some library feature that uses casting-away of shared to accomplish the correct thing.  For example, atomicOp.

None of this can prevent deadlocks, but it does create a way to prevent deadlocks.

If this was the case, stack data would be able to be marked shared, and you'd have to use option b (it would not be in a block).  Perhaps for simple data types, when memory barriers truly are enough, and a shared(int) is on the stack (and not part of a container), straight loads and stores would be allowed.

Now, would you agree that:

auto v1 = synchronized p.i;

might be a valid mechanism?  In other words, assuming p is lockable, synchronized p.i locks p, then reads i, then unlocks p, and the result type is unshared?

Also, inside synchronized(p), p becomes tail-shared, meaning all data contained in p is unshared, all data referred to by p remains shared.

In this case, we'd need a new type constructor (e.g. locked) to formalize the type.

Make sense?

-Steve
June 11, 2012
>> Are these operations on shared data all safe? Note that if these
>> accesses would be protected by some lock, then the 'shared' qualifier
>> wouldn't really be needed - compiler barriers, that make sure it all
>> happens while this thread holds the lock, would be enough. (even the
>> order of operations doesn't usually matter in that case and enforcing
>> one would in fact add overhead)
>
> No, they should not be all safe, I never suggested that. It's impossible
> to engineer a one-size-fits-all for accessing shared variables, because
> it doesn't know what mechanism you are going to use to protect it. As
> you say, once this data is protected by a lock, memory barriers aren't
> needed. But requiring a lock is too heavy handed for all cases. This is
> a good point to make about the current memory-barrier attempts, they
> just aren't comprehensive enough, nor do they guarantee pretty much
> anything except simple loads and stores.
>
> Perhaps the correct way to implement shared semantics is to not allow
> access *whatsoever* (except taking the address of a shared piece of
> data), unless you:
>
> a) lock the block that contains it
> b) use some library feature that uses casting-away of shared to
> accomplish the correct thing. For example, atomicOp.
>
It may be a good idea. Though I half-expect reads and writes to be atomic. Yet things like this are a funky trap:
shared int x; //global
...
x = x + func();
//Booom! read-modify-write and not atomic, should have used x += func()

So the a-b set of rules could be more reasonable than it seems.
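
For illustration, a minimal sketch of how option (b) looks with druntime's core.atomic as it exists today. The func() here is a hypothetical stand-in that just returns 1; only atomicOp and atomicLoad are real library calls.

```d
import core.atomic : atomicLoad, atomicOp;

shared int x; // global, visible to all threads

// Hypothetical stand-in for the func() in the snippet above.
int func() { return 1; }

void main()
{
    // x = x + func() is a separate load and a separate store.
    // atomicOp performs the whole update as one atomic operation.
    atomicOp!"+="(x, func());
    assert(atomicLoad(x) == 1);
}
```

Note that func() is still evaluated before the atomic update, so only the addition itself is protected.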

> None of this can prevent deadlocks, but it does create a way to prevent
> deadlocks.
>
> If this was the case, stack data would be able to be marked shared, and
> you'd have to use option b (it would not be in a block). Perhaps for
> simple data types, when memory barriers truly are enough, and a
> shared(int) is on the stack (and not part of a container), straight
> loads and stores would be allowed.
>
> Now, would you agree that:
>
> auto v1 = synchronized p.i;
>
> might be a valid mechanism? In other words, assuming p is lockable,
> synchronized p.i locks p, then reads i, then unlocks p, and the result
> type is unshared?
>
> Also, inside synchronized(p), p becomes tail-shared, meaning all data
> contained in p is unshared, all data referred to by p remains shared.
>
> In this case, we'd need a new type constructor (e.g. locked) to
> formalize the type.
>
> Make sense?
>

While I've missed a good portion of this thread I think we should explore this direction. Shared has to be connected with locks/synchronized.

-- 
Dmitry Olshansky
June 11, 2012
On Mon, 11 Jun 2012 13:42:37 -0400, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

>> a) lock the block that contains it
>> b) use some library feature that uses casting-away of shared to
>> accomplish the correct thing. For example, atomicOp.
>>
> It may be a good idea. Though I half-expect reads and writes to be atomic. Yet things like this are a funky trap:
> shared int x; //global
> ...
> x = x + func();
> //Booom! read-modify-write and not atomic, should have used x += func()

We cannot prevent data races such as these (though we may be able to disable specific cases like this), since you can always split out this expression into multiple valid ones.  Also, you can hide details in functions:

x = func(x);

But we can say that you cannot *read or write* a shared variable non-atomically.  That is a goal I think is achievable by the type system and the language.  A non-atomic read or write of shared data arguably has no real-world value, ever, whereas the above may be valid in some cases (maybe you know more semantically about the application than the compiler can glean).

>
> While I've missed a good portion of this thread I think we should explore this direction. Shared has to be connected with locks/synchronized.

Yes, I agree.  If shared and synchronized are not connected somehow, the point of both seems rather lost.

As this was mostly a brainstorming post, I'll restate what I think as a reply to the original post, since my views have definitely changed.

-Steve
June 11, 2012
On 06/11/12 19:27, Steven Schveighoffer wrote:
> On Mon, 11 Jun 2012 09:41:37 -0400, Artur Skawina <art.08.09@gmail.com> wrote:
> 
>> On 06/11/12 14:11, Steven Schveighoffer wrote:
>>> On Mon, 11 Jun 2012 07:56:12 -0400, Artur Skawina <art.08.09@gmail.com> wrote:
>>>
>>>> On 06/11/12 12:35, Steven Schveighoffer wrote:
>>>
>>>>> I wholly disagree.  In fact, keeping the full qualifier intact *enforces* incorrect code, because you are forcing shared semantics on literally unshared data.
>>>>>
>>>>> Never would this start ignoring shared on data that is truly shared.  This is why I don't really get your argument.
>>>>>
>>>>> If you could perhaps explain with an example, it might be helpful.
>>>>
>>>> *The programmer* can then treat shared data just like unshared. Because every
>>>> load and every store will "magically" work. I'm afraid that after more than
>>>> two or three people touch the code, the chances of it being correct would be
>>>> less than 50%...
>>>> The fact that you can not (or shouldn't be able to) mix shared and unshared
>>>> freely is one of the main advantages of shared-annotation.
>>>
>>> If shared variables aren't doing the right thing with loads and stores, then we should fix that.
>>
>> Where do you draw the line?
>>
>> shared struct S {
>>    int i;
>>    void* p;
>>    SomeStruct s;
>>    ubyte[256] a;
>> }
>>
>> shared(S)* p = ... ;
>>
>> auto v1 = p.i;
>> auto v2 = p.p;
>> auto v3 = p.s;
>> auto v4 = p.a;
>> auto v5 = p.i++;
>>
>> Are these operations on shared data all safe? Note that if these accesses would be protected by some lock, then the 'shared' qualifier wouldn't really be needed - compiler barriers, that make sure it all happens while this thread holds the lock, would be enough. (even the order of operations doesn't usually matter in that case and enforcing one would in fact add overhead)
> 
> No, they should not be all safe, I never suggested that.  It's impossible to engineer a one-size-fits-all for accessing shared variables, because it doesn't know what mechanism you are going to use to protect it.  As you say, once this data is protected by a lock, memory barriers aren't needed.  But requiring a lock is too heavy handed for all cases.  This is a good point to make about the current memory-barrier attempts, they just aren't comprehensive enough, nor do they guarantee pretty much anything except simple loads and stores.
> 
> Perhaps the correct way to implement shared semantics is to not allow access *whatsoever* (except taking the address of a shared piece of data), unless you:
> 
> a) lock the block that contains it
> b) use some library feature that uses casting-away of shared to accomplish the correct thing.  For example, atomicOp.

Exactly; this is what I've been after the whole time. And I think it can be done in most cases without casting away shared. For example, by allowing safe conversions from/to shared of the results of expressions involving shared data, but only under certain circumstances, e.g. in methods with a shared 'this'.


> None of this can prevent deadlocks, but it does create a way to prevent deadlocks.
> 
> If this was the case, stack data would be able to be marked shared, and you'd have to use option b (it would not be in a block).  Perhaps for simple data types, when memory barriers truly are enough, and a shared(int) is on the stack (and not part of a container), straight loads and stores would be allowed.

Why? Consider the case of a function that directly or indirectly launches a few threads and gives them the address of some local shared object. If the current thread also accesses this object, which has to be possible, then it must obey the same rules.


> Now, would you agree that:
> 
> auto v1 = synchronized p.i;
> 
> might be a valid mechanism?  In other words, assuming p is lockable, synchronized p.i locks p, then reads i, then unlocks p, and the result type is unshared?

I think I would prefer

   auto v1 = synchronized(p).i;

ie for the synchronized expression to lock the object, return an unshared reference, and the object be unlocked once this ref goes away. RLII. ;)

Which would then also allow for

   {
      auto unshared_p = synchronized(p);
      auto v1 = unshared_p.i;
      auto v2 = unshared_p.p;
      // etc
   }

and with a little more syntax sugar it could turn into

   synchronized (unshared_p = p) {
      auto v1 = unshared_p.i;
      auto v2 = unshared_p.p;
      // etc
   }


The problem with this is that it only unshares the head, which I think isn't enough. Hmm. One approach would be to allow

   shared struct S {
      ubyte* data;
      AStruct *s1;
      shared AnotherStruct *s2;
      shared S* next;
   }

and for synchronized(s){} to drop 'shared' from any field that isn't also marked as shared. IOW treat any 'unshared' field as owned by the object. (an alternative could be to tag the fields that should be unshared instead)

> Also, inside synchronized(p), p becomes tail-shared, meaning all data contained in p is unshared, all data referred to by p remains shared.
> 
> In this case, we'd need a new type constructor (e.g. locked) to formalize the type.

I should have read to the end, I guess. :)

You mean something like I described above, only done by mutating the type of 'p'? That might work too.

But I need to think about this some more.

Why would we need 'locked'?


> Make sense?

More and more.

artur
June 11, 2012
On Mon, 11 Jun 2012 15:23:56 -0400, Artur Skawina <art.08.09@gmail.com> wrote:

> On 06/11/12 19:27, Steven Schveighoffer wrote:

>> Perhaps the correct way to implement shared semantics is to not allow access *whatsoever* (except taking the address of a shared piece of data), unless you:
>>
>> a) lock the block that contains it
>> b) use some library feature that uses casting-away of shared to accomplish the correct thing.  For example, atomicOp.
>
> Exactly; this is what I've been after the whole time. And I think it can be done
> in most cases without casting away shared. For example, by allowing safe
> conversions from/to shared of the results of expressions involving shared data,
> but only under certain circumstances, e.g. in methods with a shared 'this'.

Good, I'm glad we are starting to come together.

>> None of this can prevent deadlocks, but it does create a way to prevent deadlocks.
>>
>> If this was the case, stack data would be able to be marked shared, and you'd have to use option b (it would not be in a block).  Perhaps for simple data types, when memory barriers truly are enough, and a shared(int) is on the stack (and not part of a container), straight loads and stores would be allowed.
>
> Why? Consider the case of a function that directly or indirectly launches a few
> threads and gives them the address of some local shared object. If the current
> thread also accesses this object, which has to be possible, then it must obey
> the same rules.

I think this is possible for what I prescribed.  You need a special construct for locking and using shared data on the stack (for instance Lockable!S).

Another possible option is to consider the stack frame as the "container", and if it contains any shared data, put in a hidden mutex.

In order to do this correctly, we need a way to hook synchronized properly from library code.

>> Now, would you agree that:
>>
>> auto v1 = synchronized p.i;
>>
>> might be a valid mechanism?  In other words, assuming p is lockable, synchronized p.i locks p, then reads i, then unlocks p, and the result type is unshared?
>
> I think I would prefer
>
>    auto v1 = synchronized(p).i;

This kind of makes synchronized a type constructor, which it is not.

> ie for the synchronized expression to lock the object, return an unshared
> reference, and the object be unlocked once this ref goes away. RLII. ;)
>
> Which would then also allow for
>
>    {
>       auto unshared_p = synchronized(p);
>       auto v1 = unshared_p.i;
>       auto v2 = unshared_p.p;
>       // etc
>    }

I think this can be done, but I would not want to use synchronized.  One of the main benefits of synchronized is it's a block attribute, not a type attribute.  So you can't actually abuse it.

The locked type I specify below might fit the bill.  But it would have to be hard-tied to the block.  In other words, we would have to make *very* certain it would not escape the block.  Kind of like inout.

>> Also, inside synchronized(p), p becomes tail-shared, meaning all data contained in p is unshared, all data referred to by p remains shared.
>>
>> In this case, we'd need a new type constructor (e.g. locked) to formalize the type.
>
> I should have read to the end, I guess. :)
>
> You mean something like I described above, only done by mutating
> the type of 'p'? That might work too.

Right, any accesses to p *inside* the block "magically" become locked(S) instead of shared(S).  We have to make certain locked(S) instances cannot escape, and we already do something like this with inout -- just don't allow members or static variables to be typed as locked(T).

I like replacing the symbol because then it doesn't allow you access to the outer symbol (although you can get around this, it should be made difficult).  As long as the locks are reentrant, it shouldn't pose a large problem, but obviously you should try and avoid locking the same data over and over again.

One interesting thing: synchronized methods now would mark this as locked(typeof(this)) instead of typeof(this).  So you can *avoid* the locking and unlocking code while calling member functions, while preserving it for the first call.

This is important -- you don't want to escape a reference to the unlocked type somewhere.

-Steve
June 11, 2012
On 06/11/12 22:21, Steven Schveighoffer wrote:
>>> Now, would you agree that:
>>>
>>> auto v1 = synchronized p.i;
>>>
>>> might be a valid mechanism?  In other words, assuming p is lockable, synchronized p.i locks p, then reads i, then unlocks p, and the result type is unshared?
>>
>> I think I would prefer
>>
>>    auto v1 = synchronized(p).i;
> 
> This kind of makes synchronized a type constructor, which it is not.

Yes; the suggestion was to also allow synchronized /expressions/, in addition to statements.


>> ie for the synchronized expression to lock the object, return an unshared reference, and the object be unlocked once this ref goes away. RLII. ;)
>>
>> Which would then also allow for
>>
>>    {
>>       auto unshared_p = synchronized(p);
>>       auto v1 = unshared_p.i;
>>       auto v2 = unshared_p.p;
>>       // etc
>>    }
> 
> I think this can be done, but I would not want to use synchronized.  One of the main benefits of synchronized is it's a block attribute, not a type attribute.  So you can't actually abuse it.

There's a precedent, mixin expressions.

However, there's no need to invent new constructs, as this already works:

    {
       auto unshared_p = p.locked;
       auto v1 = unshared_p.i;
       auto v2 = unshared_p.p;
       // etc
    }

and does not require compiler or language changes.

I'm using this idiom with mutexes and semaphores; the 'locked' implementation is *extremely* fragile, it's very easy to confuse the compiler, which then spits out nonsensical error messages and refuses to cooperate. But the above should already be possible, only the return type could be problematic; keeping 'p' opaque would be best. I'll play with this when I find some time.

But 'synchronized' and 'shared' are really two different things. I probably shouldn't have used your original example as a base, as it only added to the confusion; sorry.

'synchronized' allows you to implement critical sections.
'shared' is just a way to mark some data as needing special treatment.

If all accesses to an object are protected by 'synchronized', either explicitly or implicitly (by using a struct or class marked as synchronized) then you don't need to mark the data as 'shared' at all. It would be pointless - the thread that owns the lock also owns the data.
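
A minimal sketch of that case, under my own assumptions: the names counter and lock are made up for illustration, and __gshared is used only so the data is visible to every thread without being typed as 'shared'. Every access happens inside a synchronized statement.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

__gshared int counter; // not typed 'shared'; protected only by the lock
__gshared Mutex lock;  // illustrative name

void main()
{
    lock = new Mutex;
    auto threads = new Thread[4];
    foreach (ref t; threads)
    {
        t = new Thread({
            foreach (_; 0 .. 1000)
                synchronized (lock) { ++counter; } // critical section
        });
        t.start();
    }
    foreach (t; threads)
        t.join();
    assert(counter == 4000);
}
```

The thread that holds the lock effectively owns counter for the duration of the block, so no atomic operations are involved.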

'shared' is what lets you implement the locking primitives used by synchronized and various lock-free schemes. (right now 'shared' alone isn't powerful enough, yes)

You can use one or the other, sometimes even both, but they are not directly tied to each other. So there's no need for 'synchronized' to unshare anything, at least not in the simple mutex case. Accessing objects both with and without holding a lock is extremely rare.


> The locked type I specify below might fit the bill.  But it would have to be hard-tied to the block.  In other words, we would have to make *very* certain it would not escape the block.  Kind of like inout.

    void f(scope S*);
    ...
    {
       auto locked_p = p.locked;
       f(locked_p.s);
    }

Requiring the signature to be 'void f(locked S*);' would not be a good idea; this must continue to work and introducing another type would exclude all code not specifically written with it in mind, like practically all libraries.


> This is important -- you don't want to escape a reference to the unlocked type somewhere.

Yes, but it needs another solution. 'scope' might be enough, but right now we'd have to trust the programmer completely...

(It's about not leaking refs to *inside* the locked object, not just
'p' (or 'locked_p') itself)

artur
June 11, 2012
On 06/12/12 00:00, Artur Skawina wrote:
> On 06/11/12 22:21, Steven Schveighoffer wrote:
>>>> Now, would you agree that:
>>>>
>>>> auto v1 = synchronized p.i;
>>>>
>>>> might be a valid mechanism?  In other words, assuming p is lockable, synchronized p.i locks p, then reads i, then unlocks p, and the result type is unshared?

What I think you want is relatively simple, something like this:

   struct synchronized(m) S {
      int i;
      void *p;
      Mutex m;
   }

and then for S to be completely opaque, unless inside a synchronized statement. So

   S* s = ...;
   auto v1 = s.i; // "Error: access to 's.i' requires synchronization"
   synchronized (s) {
      auto v2 = s.i;
      // ...
   }
   auto v3 = s.p; // "Error: access to 's.p' requires synchronization"

and there's no 'shared' involved at all.

Provided that no reference to a locked 's' can escape this should be enough to solve this problem.

Preventing the leaks while not unnecessarily restricting what can be done inside the synchronized block would be a different problem. The obvious solution would be to treat all refs gotten from or via 's' as scoped (and trust the programmer; with time the enforcing can be improved), but sometimes you will actually want to remove objects from a synchronized container - so that must be possible too.

artur
June 12, 2012
On 08/06/2012 01:51, Steven Schveighoffer wrote:
> 2. shared value types.

2. You can have value types on the heap. Or value types that point to shared data.