July 10, 2020
On Friday, July 10, 2020 12:30:16 PM MDT mw via Digitalmars-d-learn wrote:
> On Friday, 10 July 2020 at 17:35:56 UTC, Steven Schveighoffer wrote:
> > Mark your setTime as shared, then cast away shared (as you don't need atomics once it's locked), and assign:
> >
> > synchronized setTime(ref SysTime t) shared {
> >     (cast()this).time = t;
> > }
>
> I know I can make it work by casting, my question is:
>
> we already had a lock on the owning shared object, so WHY do we still need the cast to make it compile?

Because the type system has no way of knowing that access to that shared object is currently protected, and baking that into the type system is actually very difficult - especially if you don't want to be super restrictive about what is allowed.

The only scheme that anyone has come up with thus far that would work is TDPL's synchronized classes (which have never been implemented), but in order for them to work, they would have to be restrictive about what you do with the member variables, and ultimately, the compiler would still only be able to implicitly remove the outer layer of shared (i.e. the layer sitting directly in the class object itself), since that's the only layer that the compiler could prove hadn't had any references to it escape. So, you'd have to create a class just to be able to avoid casting, and it wouldn't implicitly remove enough of shared to be useful in anything but simple cases.

Sure, it would be great if we could have shared be implicitly removed when the object in question is protected by a mutex, but the type system would have to know that that mutex was associated with that object and be able to prove not only that that mutex was locked but that no other piece of code could possibly access that shared object without locking that mutex. It would also have to be able to prove that no thread-local references escaped from the code where shared was implicitly removed. It's incredibly difficult to bake the required information into the type system even while being very restrictive about what's allowed, let alone while allowing code to be as flexible as code generally needs to be - especially in a systems language like D.

If someone actually manages to come up with an appropriate scheme that lets us implicitly remove shared under some set of circumstances, then we may very well get that ability at some point in the future, but it seems very unlikely as things stand, and even if someone did manage it, it's even less likely that it would work outside of a limited set of use cases, since there are a variety of ways of dealing with safely accessing data across threads.

So, for the foreseeable future, explicit casts are generally going to be required when dealing with shared.
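
To make the point concrete, here is a minimal sketch of data guarded by a separate mutex. The names `samples`, `samplesLock` and `addSample` are hypothetical and not from the thread. Nothing in the types says which mutex protects which data, so the cast is where the programmer vouches for the locking.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

shared int[] samples;        // hypothetical shared data
shared Mutex samplesLock;    // guards `samples` by convention only

shared static this()
{
    samplesLock = new shared Mutex;
}

void addSample(int value)
{
    samplesLock.lock();
    scope (exit) samplesLock.unlock();

    // The type system has no link between `samplesLock` and `samples`,
    // so it cannot drop `shared` for us. The cast states that we have
    // checked the locking ourselves.
    auto plain = cast(int[]*) &samples;
    *plain ~= value;
    // `plain` must not be stored anywhere that outlives the lock.
}

void main()
{
    Thread[4] workers;
    foreach (ref w; workers)
    {
        w = new Thread({ foreach (i; 0 .. 100) addSample(i); });
        w.start();
    }
    foreach (w; workers)
        w.join();

    assert((cast(int[]) samples).length == 400);
}
```

If another function touched `samples` without taking `samplesLock`, this would still compile, which is exactly the kind of thing the compiler cannot rule out.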

- Jonathan M Davis



July 11, 2020
On Friday, 10 July 2020 at 17:18:25 UTC, mw wrote:
> On Friday, 10 July 2020 at 08:48:38 UTC, Kagamin wrote:
>> On Friday, 10 July 2020 at 05:12:06 UTC, mw wrote:
>>> looks like we still have to cast:
>>> as of 2020, sigh.
>>
>> Why not?
>
> Because cast is ugly.

Implicitly escaping thread-local data into a shared context is much uglier than a cast. D disallows such implicit sharing and thus guarantees the existence of thread-local data at the language level. SysTime wasn't designed to be shared, so by default it is incompatible with sharing, which enforces the promise that a SysTime is thread local.

synchronized setTime(ref SysTime t) shared {
    (cast()this).time = t;
}

Steven's solution isn't good in the general case, because it still puts thread-local data into a shared context. That makes the thread-local data implicitly shared, and once that can happen you can no longer assume that such data is thread local, since it might have escaped into a shared context. In this case the language prevented the implicit sharing of thread-local data (this is what shared does, and it does it well, contrary to the popular myth that shared is broken).
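
The rule described here can be checked at compile time. The following is a small sketch (variable names are illustrative only) showing that references do not cross the thread-local/shared boundary implicitly, while plain value copies do.

```d
void main()
{
    int local = 42;                 // thread-local by default in D

    // A pointer to thread-local data cannot silently become a pointer
    // to shared data, and the reverse is rejected as well.
    static assert(!is(int* : shared(int)*));
    static assert(!is(shared(int)* : int*));

    // A plain value without indirections may be copied across the
    // boundary, because the copy shares no memory with the original.
    static assert(is(int : shared(int)));
    shared int published = local;   // a copy, not an alias of `local`

    // Crossing the boundary by reference requires an explicit cast,
    // which is the programmer taking responsibility for the aliasing.
    shared(int)* aliased = cast(shared(int)*) &local;
}
```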
July 11, 2020
On 10/7/20 20:30, mw wrote:
> On Friday, 10 July 2020 at 17:35:56 UTC, Steven Schveighoffer wrote:
>> Mark your setTime as shared, then cast away shared (as you don't need atomics once it's locked), and assign:
>>
>> synchronized setTime(ref SysTime t) shared {
>>     (cast()this).time = t;
>> }
> 
> I know I can make it work by casting, my question is:
> 
> we already had a lock on the owning shared object, so WHY do we still need the cast to make it compile?
> 

Because the system doesn't know if just this lock is enough to protect this specific access. When you have multiple locks protecting multiple pieces of data, things can become messy.

What I really miss is some way of telling the compiler "OK, I know what I'm doing, I'm already in a critical section, and that all the synchronization issues have been already managed by me".

Within this block, shared would implicitly convert to non-shared, and the other way round, like this (in a more complex setup with a RWlock):

```
setTime(ref SysTime t) shared {
	synchronized(myRWMutex.writer) critical_section {  // From this point I can forget about shared
		time = t;
	}
}
```

As a workaround, I have implemented the following trivial helpers:

```
mixin template unshareThis() {
    alias S = typeof(this);
    static if (is(S C == shared C)) {}
    static if (is(S == class) || is(S == interface)) {
        C unshared = cast(C) this;
    } else static if (is(S == struct)) {
        C* unshared = cast(C*) &this;
    } else {
        static assert(0, "Only classes, interfaces and structs can be unshared");
    }
}


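// Returns a non-shared reference to `s`; call it only while holding the lock that guards `s`.
pragma(inline, true)
ref unshare(S)(return ref S s) {
    static if (is(S C == shared C)) {}  // binds C to S with `shared` stripped
    return *(cast(C*) &s);
}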
```

With them you should be able to do either:

```
synchronized setTime(ref SysTime t) shared {
	mixin unshareThis;
	unshared.time = t;
}
```
(useful if you need multiple access), or:

```
synchronized setTime(ref SysTime t) shared {
	time.unshare = t;
}
```
July 12, 2020
On 7/11/20 6:15 AM, Arafel wrote:
> 
> Because the system doesn't know if just this lock is enough to protect this specific access. When you have multiple locks protecting multiple pieces of data, things can become messy.

Yes.

> 
> What I really miss is some way of telling the compiler "OK, I know what I'm doing, I'm already in a critical section, and that all the synchronization issues have been already managed by me".

You do. It's a cast.

> Within this block, shared would implicitly convert to non-shared, and the other way round, like this (in a more complex setup with a RWlock):
> 
> ```
> setTime(ref SysTime t) shared {
>      synchronized(myRWMutex.writer) critical_section {  // From this point I can forget about shared
>          time = t;
>      }
> }
> ```

This isn't checkable by the compiler.

You could accidentally end up referencing shared things as unshared when the lock is unlocked. If you remove shared, you need to know and understand the consequences, and the compiler can't help there, because the type qualifier has been removed, so it's not aware of which things are going to become shared after the lock is gone.

-Steve
July 12, 2020
On 7/11/20 1:03 AM, Kagamin wrote:
> Steven's solution isn't good in the general case

Right, you need to know that SysTime is actually a value type, and so it can be implicitly copied without problems with aliasing.

In fact, the cast isn't needed to ensure there is no lingering aliasing. I can tell it's a value type because:

const SysTime x;
SysTime y;
y = x; // ok.

Likewise, I technically could just copy to a shared one, but the problem is that the actual act of writing the field is subject to memory problems. It has nothing to do with the SysTime internals.

To make the solution more "correct" you could mark the incoming SysTime as const.
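
A sketch of both points follows, assuming a hypothetical `Holder` class and two stand-in structs (`Stamp`, `Node`) that exist only to illustrate the value-type test. None of these names come from the thread.

```d
import std.datetime.systime : Clock, SysTime;

// Hypothetical stand-ins used only to illustrate the value-type test.
struct Stamp { long ticks; }        // no indirections
struct Node  { int* next; }         // holds a mutable pointer

// A const value converts to mutable only when the copy cannot alias
// mutable data, which is the same check as `y = x` above.
static assert( is(const(Stamp) : Stamp));
static assert(!is(const(Node)  : Node));

class Holder   // hypothetical owner of the shared field
{
    private SysTime time;

    // `const` documents that the argument is only read and copied.
    synchronized void setTime(ref const SysTime t) shared
    {
        (cast() this).time = t;     // acceptable only because the monitor is held
    }

    synchronized SysTime getTime() shared
    {
        return (cast() this).time;  // hands back a copy
    }
}

void main()
{
    auto holder = new shared Holder;
    const now = Clock.currTime;
    holder.setTime(now);
    assert(holder.getTime() == now);
}
```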

-Steve
July 13, 2020
On 13/7/20 3:46, Steven Schveighoffer wrote:
> On 7/11/20 6:15 AM, Arafel wrote:
>>
>> What I really miss is some way of telling the compiler "OK, I know what I'm doing, I'm already in a critical section, and that all the synchronization issues have been already managed by me".
> 
> You do. It's a cast.
> 

Yes, and that's what I'm doing (although with some helper function to make it look slightly less ugly), but for non-reference types I have to do it every single time I use the variables, and it's annoying for anything beyond the trivial.

There's no way to avoid it, because at best you can get a pointer that will be enough for most things, but it will show for instance if you want to use it as a parameter to another function.

Also, with more complex data types like structs and AAs, where not only the AA itself but also the members, keys and values become shared, it's *really* annoying, because there's no easy way to get a "fully" non-shared reference: `cast()` will *often* only remove the outer shared layer (I'm not sure it's always the case; it has happened semi-randomly to me, and what's worse, I don't know the rules for that).

Also, it becomes a real pain when you have to send those types to generic code that is not "share-aware".

And for basic types, you'll be forced to use atomicOp all the time, or again resort to pointers.

So yes, it's not impossible, but it's really, really inconvenient, to the point of making `shared` almost unusable beyond the most simple cases. In fact, I would be happy if it had to take a list of variables, and ignore `shared` just for them (and their members):

>> Within this block, shared would implicitly convert to non-shared, and the other way round, like this (in a more complex setup with a RWlock):
>>
>> ```
>> setTime(ref SysTime t) shared {
>>      synchronized(myRWMutex.writer) critical_section {  // From this point I can forget about shared
>>          time = t;
>>      }
>> }
>> ```
> 
> This isn't checkable by the compiler.
> 

That's exactly why what I propose is a way to *explicitly* tell the compiler about it, like @system does for safety. I used `critical_section`, but perhaps `@critical_section` would have been clearer. Here is a more explicit version specifying the variables to which it applies (note that you'd be able to use "this", or leave it empty and have it apply to everything):

```
void setTime(ref SysTime t) shared {
    synchronized(myRWMutex.writer) {
        @critical_section(time) {  // From this point I can forget about shared
            time = t;
        }
    }
}
```

Here it doesn't make a difference because the critical section is a single line (so it's even longer), but if you had to use multiple variables like that in a large expression, it'd become pretty much impossible to understand without it:

```
import std;

synchronized shared class TimeCount { // It's a synchronized class, so automatically locking
	public:
	void startClock() {
		cast() startTime = Clock.currTime; // Here I have to cast the lvalue
        // startTime = cast(shared) Clock.currTime; // Fails because opAssign is not defined for shared
	}
	void endClock() {
		cast() endTime = Clock.currTime; // Again unintuitively casting the lvalue
	}
	void calculateDuration() {
        timeEllapsed = cast (shared) (cast() endTime - cast() startTime); // Here I can also cast the rvalue, which looks more natural
	}

	private:
	SysTime startTime;
	SysTime endTime;
	Duration timeEllapsed;
}
```

Non-obvious lvalue-casts all over the place, and even `timeEllapsed = cast (shared) (cast() endTime - cast() startTime);`.

And that one is not even too complex... I know in this case you can reorganize things, but it was just an example of what happens when you have to use multiple shared variables in an expression.

> You could accidentally end up referencing shared things as unshared when the lock is unlocked. If you remove shared, you need to know and understand the consequences, and the compiler can't help there, because the type qualifier has been removed, so it's not aware of which things are going to become shared after the lock is gone.
> 
> -Steve

Well, it's meant as a low level tool, similar to what @system does for memory safety. You can't blame the compiler if you end up doing something wrong with your pointer arithmetic or with your casts from and to void* in your @system code, can you?
July 13, 2020
On 7/13/20 3:26 AM, Arafel wrote:
> On 13/7/20 3:46, Steven Schveighoffer wrote:
>> On 7/11/20 6:15 AM, Arafel wrote:
>>>
>>> What I really miss is some way of telling the compiler "OK, I know what I'm doing, I'm already in a critical section, and that all the synchronization issues have been already managed by me".
>>
>> You do. It's a cast.
>>
> 
> Yes, and that's what I'm doing (although with some helper function to make it look slightly less ugly), but for non-reference types I have to do it every single time I use the variables, and it's annoying for anything beyond the trivial.
> 
> There's no way to avoid it, because at best you can get a pointer that will be enough for most things, but it will show for instance if you want to use it as a parameter to another function.
> 
> Also, with more complex data types like structs and AAs, where not only the AA itself but also the members, keys and values become shared, it's *really* annoying, because there's no easy way to get a "fully" non-shared reference: `cast()` will *often* only remove the outer shared layer (I'm not sure it's always the case; it has happened semi-randomly to me, and what's worse, I don't know the rules for that).

cast() will remove as little as possible, but for most cases, including classes and structs, this means the entire tree referenced is now unshared.

An AA does something really useless, which I didn't realize.

If you have a shared int[int], and use cast() on it, it becomes shared(int)[int]. Which I don't really understand the point of.

But in any case, casting away shared is doable, even if you need to type a bit more.
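
A quick way to see what a bare `cast()` produces is to inspect the resulting types at compile time. This is only a sketch with hypothetical types (`Account`, `Point`); the AA result is printed rather than asserted, so you can check it against your own compiler.

```d
class Account { int balance; }      // hypothetical types for illustration
struct Point  { int x, y; }

void main()
{
    shared Account acct = new shared Account;
    shared Point   pt;
    shared int[int] table;

    // For a class reference or a struct, a bare cast() yields the plain type.
    static assert(is(typeof(cast() acct) == Account));
    static assert(is(typeof(cast() pt)   == Point));

    // For the AA, print what the bare cast() produces at compile time...
    pragma(msg, typeof(cast() table));

    // ...and name the target type explicitly to strip every layer.
    auto plain = cast(int[int]) table;
    static assert(is(typeof(plain) == int[int]));
}
```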

> 
> Also, it becomes a real pain when you have to send those types to generic code that is not "share-aware".
> 
> And for basic types, you'll be forced to use atomicOp all the time, or again resort to pointers.

The intent is to cast away shared on the ENTIRE aggregate, and then use everything in the aggregate as unshared.

I can imagine something like this:

ref T unshared(T)(return ref shared(T) item) { return *(cast(T*)&item); }

with(unshared(this)) {
    // implementation using unshared things
}

I wasn't suggesting that for each time you access anything in a shared object, you need to do casting. In essence, it's what you are looking for, but just opt-in instead of automatic.
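
For a class, where `this` is already a reference, the same idea can be sketched with `with (cast() this)` directly. The `Stats` class below is a hypothetical example, not code from this thread.

```d
class Stats   // hypothetical example type
{
    private int hits;
    private int total;

    synchronized void record(bool hit) shared
    {
        // Opt in once: inside this block the members are plain ints.
        with (cast() this)
        {
            ++total;
            if (hit) ++hits;
        }
    }

    synchronized double ratio() shared
    {
        with (cast() this)
            return total ? cast(double) hits / total : 0.0;
    }
}

void main()
{
    auto stats = new shared Stats;
    stats.record(true);
    stats.record(false);
    assert(stats.ratio() == 0.5);
}
```

The cast appears once per method instead of once per member access, and it is only as safe as the lock that the method holds.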

>>> Within this block, shared would implicitly convert to non-shared, and the other way round, like this (in a more complex setup with a RWlock):
>>>
>>> ```
>>> setTime(ref SysTime t) shared {
>>>      synchronized(myRWMutex.writer) critical_section {  // From this point I can forget about shared
>>>          time = t;
>>>      }
>>> }
>>> ```
>>
>> This isn't checkable by the compiler.
>>
> 
> That's exactly why what I propose is a way to *explicitly* tell the compiler about it, like @system does for safety. I used `critical_section`, but perhaps `@critical_section` would have been clearer. Here is a more explicit version specifying the variables to which it applies (note that you'd be able to use "this", or leave it empty and have it apply to everything):
> 
> ```
> void setTime(ref SysTime t) shared {
>      synchronized(myRWMutex.writer) {
>          @critical_section(time) {  // From this point I can forget about shared
>              time = t;
>          }
>      }
> }
> ```

Yeah, this looks suspiciously like the with statement above. We seem to be on the same page, even if having different visions of who should implement it.

> Here it doesn't make a difference because the critical section is a single line (so it's even longer), but if you had to use multiple variables like that in a large expression, it'd become pretty much impossible to understand without it:
> 
> ```
> import std;
> 
> synchronized shared class TimeCount { // It's a synchronized class, so automatically locking
>      public:
>      void startClock() {
>          cast() startTime = Clock.currTime; // Here I have to cast the lvalue
>          // startTime = cast(shared) Clock.currTime; // Fails because opAssign is not defined for shared
>      }
>      void endClock() {
>          cast() endTime = Clock.currTime; // Again unintuitively casting the lvalue
>      }
>      void calculateDuration() {
>          timeEllapsed = cast (shared) (cast() endTime - cast() startTime); // Here I can also cast the rvalue, which looks more natural
>      }
> 
>      private:
>      SysTime startTime;
>      SysTime endTime;
>      Duration timeEllapsed;
> }
> ```
> 
> Non-obvious lvalue-casts all over the place, and even `timeEllapsed = cast (shared) (cast() endTime - cast() startTime);`.
> 
> And that one is not even too complex... I know in this case you can reorganize things, but it was just an example of what happens when you have to use multiple shared variables in an expression.

You are better off separating the implementation of the shared and unshared parts. That is, you have synchronized methods, but once you are synchronized, you cast away shared and all the implementation is normal looking.

Compare:

class TimeCount {
    public:
    void startClock() {
        startTime = Clock.currTime;
    }
    synchronized void startClock() shared {
       (cast()this).startClock();
    }
    void endClock() {
        endTime = Clock.currTime;
    }
    synchronized void endClock() shared {
       (cast()this).endClock();
    }
    void calculateDuration() {
        timeEllapsed = endTime - startTime;
    }
    synchronized void calculateDuration() shared {
        (cast()this).calculateDuration();
    }

    private:
    SysTime startTime;
    SysTime endTime;
    Duration timeEllapsed;
}

I would imagine a mixin could accomplish a lot of this, but you have to be careful that the locking properly protects all the data.

A nice benefit of this approach is that no locking is needed when the instance is thread-local.

> Well, it's meant as a low level tool, similar to what @system does for memory safety. You can't blame the compiler if you end up doing something wrong with your pointer arithmetic or with your casts from and to void* in your @system code, can you?

I think we may have been battling a strawman here. I assumed you were asking for synchronized to be this mechanism, when it seems you actually were asking for *any* tool. I just don't want the locking to be conflated with "OK now I can safely access any data because something was locked!". It needs to be opt-in, because you understand the risks.

I think those tools are necessary for shared to have a good story, whether the compiler implements it, or a library does.

-Steve
July 13, 2020
On 13/7/20 14:18, Steven Schveighoffer wrote:
> 
> cast() will remove as little as possible, but for most cases, including classes and structs, this means the entire tree referenced is now unshared.
> 

Yeah, but the whole lvalue cast looks just non-obvious and ugly to me:

```
cast() foo = bar;
```

It looks like an ad-hoc hack, and I haven't seen it used anywhere else. I don't even think it's well-documented (it's probably somewhere in the grammar, without much explanation of what it does or what it would be useful for). I know I had to ask in the forums because I couldn't even assign to a shared SysTime!

> An AA does something really useless, which I didn't realize.
> 
> If you have a shared int[int], and use cast() on it, it becomes shared(int)[int]. Which I don't really understand the point of.
> 
> But in any case, casting away shared is doable, even if you need to type a bit more.
> 

Sure, it's doable, but the readability suffers a lot, and also it's just too error-prone.

> The intent is to cast away shared on the ENTIRE aggregate, and then use everything in the aggregate as unshared.
> 
> I can imagine something like this:
> 
> ref T unshared(T)(return ref shared(T) item) { return *(cast(T*)&item); }
> 
> with(unshared(this)) {
>      // implementation using unshared things
> }
> 
> I wasn't suggesting that for each time you access anything in a shared object, you need to do casting. In essence, it's what you are looking for, but just opt-in instead of automatic.
> 

Yes, that would be nice as a workaround, although ideally I'd like a more comprehensive and general solution.

Sometimes you don't need to strip shared only from `this`; sometimes it's only from some parts, and sometimes also from some external objects.

To be clear, I'm so far assuming it's explicitly opt-in by the user.

I wouldn't mind seeing something done with `synchronized` classes, but that's probably a much more complex issue.

> 
> Yeah, this looks suspiciously like the with statement above. We seem to be on the same page, even if having different visions of who should implement it.
> 

I think we're in "violent agreement" territory here :-)

I honestly would be happy if there were a reliable library solution that worked even now, because so far for any non-trivial situation I have to spend more time casting from and to shared than doing the actual work, and the code becomes a mess to follow afterwards.

> 
> You are better off separating the implementation of the shared and unshared parts. That is, you have synchronized methods, but once you are synchronized, you cast away shared and all the implementation is normal looking.
> 
> Compare:
> 
> class TimeCount {
>      public:
>      void startClock() {
>          startTime = Clock.currTime;
>      }
>      synchronized void startClock() shared {
>         (cast()this).startClock();
>      }
>      void endClock() {
>          endTime = Clock.currTime;
>      }
>      synchronized void endClock() shared {
>         (cast()this).endClock();
>      }
>      void calculateDuration() {
>          timeEllapsed = endTime - startTime;
>      }
>      synchronized void calculateDuration() shared {
>          (cast()this).calculateDuration();
>      }
> 
>      private:
>      SysTime startTime;
>      SysTime endTime;
>      Duration timeEllapsed;
> }
> 
> I would imagine a mixin could accomplish a lot of this, but you have to be careful that the locking properly protects all the data.
> 
> A nice benefit of this approach is that no locking is needed when the instance is thread-local.
> 

Just thinking of the amount of boilerplate makes my head spin. Even if a mixin could somehow automate it, I still think there should be a "proper" way to do it, without that much hacking around.

Furthermore, in my case I'm trying to do fine-grained locking, and I might have to get different locks within the same function. Of course I could split the function, but it would be constantly interrupting the "natural flow" of what I'm trying to do, and it would become so much harder to understand and to reason about.

And these functions wouldn't make sense by themselves, would probably need access to locals from the parent function, and would only be called from one place... so I see them as a kind of anti-pattern.

Also, `shared` and `synchronized` would then become pretty much useless when applied to a class / structure.

> 
> I think we may have been battling a strawman here. I assumed you were asking for synchronized to be this mechanism, when it seems you actually were asking for *any* tool. I just don't want the locking to be conflated with "OK now I can safely access any data because something was locked!". It needs to be opt-in, because you understand the risks.
> 
> I think those tools are necessary for shared to have a good story, whether the compiler implements it, or a library does.
> 
> -Steve

I totally agree with this. As I mentioned, I wouldn't mind `synchronized` classes becoming apt for the trivial cases (i.e. you just have a shared counter, or something equally simple). Of course they would have to be much more restricted in what they can do than they are now, so it's probably not going to happen (code breakage and everything).

In fact, I'm working on a POD-Proxy that would automatically guard access to members with a per-instance lock, and I think it's the kind of situation `synchronized` classes could be useful for.

But that's orthogonal to the issue here of being able to have something like @system or @trusted for `shared`.

A.
July 14, 2020
On Monday, 13 July 2020 at 07:26:06 UTC, Arafel wrote:
> That's exactly why what I propose is a way to *explicitly* tell the compiler about it, like @system does for safety.

With __gshared you can opt out of sharing safety; then you're back to good old C-style multithreading.
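
As a rough sketch of what that looks like (the names `counter`, `counterLock` and `bump` are made up for illustration), a `__gshared` variable is visible to all threads but is typed as a plain `int`, so the compiler checks nothing and the locking is entirely manual:

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

__gshared int counter;          // visible to all threads, typed as plain int
__gshared Mutex counterLock;    // the programmer must remember to use it

shared static this()
{
    counterLock = new Mutex;
}

void bump()
{
    counterLock.lock();
    scope (exit) counterLock.unlock();
    ++counter;                  // no cast needed, but no compiler help either
}

void main()
{
    Thread[4] workers;
    foreach (ref w; workers)
    {
        w = new Thread({ foreach (i; 0 .. 1000) bump(); });
        w.start();
    }
    foreach (w; workers)
        w.join();

    assert(counter == 4000);
}
```

Forgetting the lock in `bump` would still compile, which is the trade-off being described.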
July 14, 2020
---
import std;

shared class TimeCount {
	void startClock() {
		auto me = cast()this;
		me.startTime = Clock.currTime;
	}
	void endClock() {
		auto me = cast()this;
		me.endTime = Clock.currTime;
	}
	void calculateDuration() {
		auto me = cast()this;
		me.elapsed = me.endTime - me.startTime;
	}

	private:
	SysTime startTime;
	SysTime endTime;
	Duration elapsed;
}
---
And this is shorter than your unshared member specification.