July 08, 2015
On 8 July 2015 at 11:20, wobbles via Digitalmars-d < digitalmars-d@puremagic.com> wrote:

> After reading the recent "Lessons Learned" article [1], and reading a few comments on the thread, there was a mention of using __gshared over shared.
>
> What exactly is the difference here?
> Are they 2 keywords to do the same thing, or are there specific use cases
> to both?
> Are there plans to 'converge' them at some point?
>
> [1] https://www.reddit.com/r/programming/comments/3cg1r0/lessons_learned_writing_a_filesystem_in_d/
>
>
>
http://forum.dlang.org/post/mailman.739.1431034764.4581.digitalmars-d@puremagic.com

Iain


July 08, 2015
On Wed, 08 Jul 2015 10:10:55 +0000, Jonathan M Davis wrote:

> Regardless, while I would very much like to see shared properly ironed out, I'm _very_ grateful that thread-local is the default in D. It's just so much saner.

+1

July 09, 2015
On Wednesday, 8 July 2015 at 21:15:19 UTC, deadalnix wrote:
> On Wednesday, 8 July 2015 at 12:08:37 UTC, Jonathan M Davis wrote:

>> I know that there are a number of people who get frustrated with shared and use __gshared instead, but unless you fully understand what you're doing and how the language works, and you're _really_ careful, you're going to shoot yourself in the foot in subtle ways if you do that.
>>
>> - Jonathan M Davis
>
> Amen

What sort of subtle ways? Can you give examples that are not effectively the same subtle ways you would encounter with pthreads in C/C++? I have been running with the assumption that __gshared effectively bypasses TLS, which again, feels sort of dirty to use a __ prefixed keyword for that, but, yeah...

I'm not sure why I don't see the magic with synchronized classes. To me, they have a fundamental flaw in the fact they are classes. While I don't mind much their existence, I would very much dislike if that is the only convenient way to use the shared/synchronized mechanisms.

Regarding your point about multiple pieces of data being guarded by multiple mutexes using the proposed design, we could perhaps do it this way:

@lock(mutex) shared {
  int moo;
  float foo;
}


We could implement synchronized classes/structs like this:

struct Foo {
  void cowsay() synchronized(moootex) {
     // your synchronized method implementation
  }

  void cowsay() {
    synchronized(moootex) {
      // no sugar
    }
  }

  @lock(moootex) private shared {
    int m_moo;
    float m_foo;
  }
}

I think we should avoid Java's nonsense of having to declare a class to do anything, and instead find generic ways to do things that are useful for multiple paradigms.
July 09, 2015
On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
> On Wednesday, 8 July 2015 at 21:15:19 UTC, deadalnix wrote:
>> On Wednesday, 8 July 2015 at 12:08:37 UTC, Jonathan M Davis wrote:
>
>>> I know that there are a number of people who get frustrated with shared and use __gshared instead, but unless you fully understand what you're doing and how the language works, and you're _really_ careful, you're going to shoot yourself in the foot in subtle ways if you do that.
>>>
>>> - Jonathan M Davis
>>
>> Amen
>
> What sort of subtle ways? Can you give examples that are not effectively the same subtle ways you would encounter with pthreads in C/C++? I have been running with the assumption that __gshared effectively bypasses TLS, which again, feels sort of dirty to use a __ prefixed keyword for that, but, yeah...

Well, the compiler is free to assume that a variable that is not marked as shared is thread-local. So, it's free to make optimizations based on that. So, for instance, it can know for a fact that

auto foo = getFoo();
auto result1 = foo.constPureFunction(); // This function _cannot_ mutate foo
auto result2 = foo.constPureFunction(); // This function _cannot_ mutate foo
auto bar = foo;

So, it knows that the value of bar is identical to the value of foo and that result1 and result2 are guaranteed to be the same, because it knows that no other thread can possibly have mutated foo within this code, and there's no way that this code mutated foo even through another reference to the same data on the same thread. And it can know that thanks to how const, pure, and TLS all work. The compiler is free to optimize the code or make other alterations to it based on that knowledge, so if it makes an optimization based on that, and foo is actually shared across threads (either because the object it refers to was originally __gshared or because shared was cast away incorrectly), then you're going to have incorrect machine code. And what optimizations the compiler does with code like this could change over time. And unless you're an expert in the language and in the compiler, you're not going to know when the compiler is going to make optimizations where the fact that the variable is in TLS factors in. So, you're not going to know when the compiler might optimize your code in ways that won't work with __gshared, and what optimizations it does or doesn't do right now won't necessarily be the same ones that it does or doesn't do later.
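
A minimal sketch of the thread-local default that this argument relies on. The names tlsCounter, gCounter and bumpFromOtherThread are made up for illustration. It only shows which storage each declaration gets; it does not reproduce the optimization hazard itself, which depends on what a given compiler version decides to do.

```d
import core.thread : Thread;

int tlsCounter;          // thread-local by default: every thread gets its own copy
__gshared int gCounter;  // one copy for the whole process, typed as a plain int

// Increments both counters from a new thread and waits for it to finish.
void bumpFromOtherThread()
{
    auto t = new Thread({
        ++tlsCounter;  // modifies the new thread's own copy
        ++gCounter;    // modifies the single process-wide copy
    });
    t.start();
    t.join();
}

void main()
{
    bumpFromOtherThread();
    assert(tlsCounter == 0);  // the main thread's copy was never touched
    assert(gCounter == 1);    // the other thread's write is visible here
}
```

Because gCounter has the type int rather than shared(int), nothing in the type system records that another thread can write to it.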

You have the same problem with shared, but in that case, the compiler makes it so that you have to cast away shared to get into this mess. It protects against doing stuff like accidentally passing a shared object around into code that will treat it as thread-local. Heck, with a __gshared object, if you change its type, it could go from being a value type where passing it to other code works just fine because it's truly copied to being a reference type (or partial reference type) where it's not copied (or only partially copied), and the compiler won't be able to help you catch the points where you were doing a full copy before but aren't now. And if it's your coworker that changed the definition of the type of the variable that you marked as __gshared, you could be screwed without knowing it.
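
The difference in compiler help can be sketched with a pair of compile-time checks. The function useThreadLocal and both variables are hypothetical, chosen only to show what the type system accepts and rejects.

```d
import core.atomic : atomicLoad;

// Hypothetical helper that expects ordinary, thread-local data.
void useThreadLocal(int* p) { *p += 1; }

shared int sharedValue;
__gshared int gsharedValue;

// shared(int)* does not implicitly convert to int*, so this call is rejected.
static assert(!__traits(compiles, useThreadLocal(&sharedValue)));

// &gsharedValue is typed as a plain int*, so the same call is accepted silently.
static assert(__traits(compiles, useThreadLocal(&gsharedValue)));

void main()
{
    useThreadLocal(&gsharedValue);
    assert(gsharedValue == 1);

    // With shared, the only way in is an explicit cast that is easy to search for.
    useThreadLocal(cast(int*) &sharedValue);
    assert(atomicLoad(sharedValue) == 1);
}
```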

Really, we can't tell what subtle behavioral problems you're risking with __gshared, because that depends on what the compiler is currently able to do with the assumption that a variable is in TLS. You run into all of the problems that you risk with sharing variables in threads in C++ only worse, because the D compiler is free to assume that an object is thread-local unless it's marked as shared and thus can make optimizations based on that, whereas the C++ compiler can't. And you've thrown away all of the compiler's help by using __gshared. __gshared is intended specifically for interacting with C code where we don't really have a choice, and you have to be careful with it. For everything else, if you need to share data across threads, that's what shared is for, and the compiler then knows that it's shared, so it will optimize differently, and it'll yell at you when you misuse it. Ultimately, you still have the risk of screwing it up when you cast away shared when the object is protected by a lock, but then at least, even if you end up with a subtle bug, because a thread-local reference to the shared data escaped the lock, it's a lot easier to figure out where you've misused shared in D than figuring out where you might have screwed it up in C++, because all of your shared objects are explicitly marked as such, and it's the points where you cast away shared that risk problems, so you have a lot less code to look at.
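
A rough sketch of the "cast away shared while holding a lock" pattern described here, assuming a druntime recent enough that core.sync.mutex.Mutex can be constructed and locked as shared. The names total, totalMutex, addToTotal and readTotal are invented for the example.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

shared int total;         // hypothetical data shared between threads
shared Mutex totalMutex;  // hypothetical lock that guards `total`

shared static this()
{
    totalMutex = new shared Mutex();
}

// Adds `amount` to `total` while holding the lock.
void addToTotal(int amount)
{
    totalMutex.lock_nothrow();
    scope (exit) totalMutex.unlock_nothrow();

    // The one place where shared is cast away. The pointer must not outlive the lock.
    int* local = cast(int*) &total;
    *local += amount;
}

// Reads `total` under the same lock.
int readTotal()
{
    totalMutex.lock_nothrow();
    scope (exit) totalMutex.unlock_nothrow();
    return *cast(int*) &total;
}

void main()
{
    Thread[] workers;
    foreach (i; 0 .. 4)
    {
        workers ~= new Thread({
            foreach (n; 0 .. 1000)
                addToTotal(1);
        });
    }
    foreach (t; workers) t.start();
    foreach (t; workers) t.join();
    assert(readTotal() == 4000);
}
```

If a bug shows up, the two casts are the only places that need auditing, which is the point being made about having less code to look at.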

As annoying as it can be, shared is your friend. __gshared is not.

- Jonathan M Davis
July 09, 2015
On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis wrote:
> Well, the compiler is free to assume that a variable that is not marked as shared is thread-local. So, it's free to make optimizations based on that. So, for instance, it can know for a fact that
>
> auto foo = getFoo();
> auto result1 = foo.constPureFunction(); // This function _cannot_ mutate foo
> auto result2 = foo.constPureFunction(); // This function _cannot_ mutate foo
> auto bar = foo;
>
> So, it knows that the value of bar is identical to the value of foo and that result1 and result2 are guaranteed to be the same,

Pretty sure that's the same as in C++. Unless there was an acquire operation/barrier in there, the compiler is free to assume that two sequential reads to a memory location, without an intervening write (only considering the same thread), will return the same result. The optimisations that are forbidden in C++ are more subtle.

> Really, we can't tell what subtle behavioral problems you're risking with __gshared, because that depends on what the compiler is currently able to do with the assumption that a variable is in TLS. You run into all of the problems that you risk with sharing variables in threads in C++ only worse, because the D compiler is free to assume that an object is thread-local unless it's marked as shared and thus can make optimizations based on that, whereas the C++ compiler can't. And you've thrown away all of the compiler's help by using __gshared. __gshared is intended specifically for use with interacting with C code where we don't really have a choice, and you have to be careful with it.

Basically, __gshared pretends to be compatible with C(++) globals, but in actual fact it doesn't share the same memory model so who knows what might happen... It's not just dangerous-so-be-very-careful, it's fundamentally broken and we're currently just getting away with it by relying on C(++) optimisers.
July 09, 2015
On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis wrote:
> On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
>>[...]
> Well, the compiler is free to assume that a variable that is not marked as shared is thread-local. So, it's free to make optimizations based on that. So, for instance, it can know for a fact that
>
> [...]

But this is what a C/C++ compiler would do, unless your data is qualified as volatile. I believe __gshared also implies the volatile behavior, right? It wouldn't make sense any other way.

So basically, __gshared is like saying "I want the C/C++ behavior, and I accept I am all on my own as the compiler will not help me".
July 09, 2015
On Thursday, 9 July 2015 at 14:57:56 UTC, Márcio Martins wrote:
> On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis wrote:
>> On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
>>>[...]
>> Well, the compiler is free to assume that a variable that is not marked as shared is thread-local. So, it's free to make optimizations based on that. So, for instance, it can know for a fact that
>>
>> [...]
>
> But this is what a C/C++ compiler would do, unless your data is qualified as volatile. I believe __gshared also implies the volatile behavior, right? It wouldn't make sense any other way.
>
> So basically, __gshared is like saying "I want the C/C++ behavior, and I accept I am all on my own as the compiler will not help me".

Sort of, but the assumptions that the D compiler is allowed to make aren't the same. Regardless of shared/__gshared itself, D's const is very different, and C++ doesn't have pure or immutable. And the D compiler devs can add whatever optimizations they want based on what those features guarantee so long as they can prove that they're correct, which changes what a D compiler is allowed to optimize in comparison to a C++ compiler. So, if you make assumptions on what's valid based purely on C++, you risk shooting yourself in the foot.
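
One concrete way in which D's const differs from C++'s is that it is transitive, which can be sketched as follows. Node and peek are hypothetical names used only for this illustration.

```d
// Hypothetical type holding a pointer to mutable data.
struct Node { int* payload; }

// Reads through a const Node. Because const is transitive in D, the pointee
// is const as well, which is not the case for a const struct in C++.
int peek(const Node n)
{
    static assert(is(typeof(*n.payload) == const(int)));
    static assert(!__traits(compiles, *n.payload = 1));  // writing through n is rejected
    return *n.payload;
}

void main()
{
    int value = 7;
    assert(peek(Node(&value)) == 7);
}
```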

__gshared is really only meant for interacting with C APIs, and if you're using it for other stuff, you're just begging for trouble. You might get away with it at least some of the time, but it really isn't a good idea to try.

- Jonathan M Davis
July 09, 2015
On Thursday, 9 July 2015 at 14:40:17 UTC, John Colvin wrote:
> Basically, __gshared pretends to be compatible with C(++) globals, but in actual fact it doesn't share the same memory model so who knows what might happen... It's not just dangerous-so-be-very-careful, it's fundamentally broken and we're currently just getting away with it by relying on C(++) optimisers.

__gshared is required for interacting with C/C++ APIs, but really, even there, what you're mainly dealing with is primitive types like int, and access to it should normally be pretty minimal/restricted. That being said, C/C++ bindings in general are arguably a giant hole, because they're marked as non-shared when they're arguably shared. It usually works fine, because the C/C++ functions generally aren't doing anything with what you pass to them which would cause them to be used across multiple threads, and you're usually not doing a lot of passing around of data that you get from C/C++ functions, but it _is_ an area that is a bit of a minefield if you're not careful. You're basically dealing with the __gshared problem. Unfortunately, I'm not sure what we can do about it. Simply marking it all as shared would be problematic, since most of it really isn't, but having it all be treated as thread-local is also problematic. So, unfortunately, you just have to be very careful when dealing with C/C++ bindings and understand what the C/C++ code is doing. It is a problem though.

Still, using __gshared more than you have to is just going to make the problem bigger. It's bad enough that we have to deal with it at the C/C++ binding layer without having to worry about it in straight up D code.

- Jonathan M Davis
July 10, 2015
On Thursday, 9 July 2015 at 14:57:56 UTC, Márcio Martins wrote:
> On Thursday, 9 July 2015 at 14:03:18 UTC, Jonathan M Davis wrote:
>> On Thursday, 9 July 2015 at 12:39:00 UTC, Márcio Martins wrote:
>>>[...]
>> Well, the compiler is free to assume that a variable that is not marked as shared is thread-local. So, it's free to make optimizations based on that. So, for instance, it can know for a fact that
>>
>> [...]
>
> But this is what a C/C++ compiler would do, unless your data is qualified as volatile. I believe __gshared also implies the volatile behavior, right? It wouldn't make sense any other way.
>
> So basically, __gshared is like saying "I want the C/C++ behavior, and I accept I am all on my own as the compiler will not help me".

If you think volatile is going to help you with concurrency, you're gonna have a bad time.

The only thing volatile does is prevent register promotion of the variable. It is useful for MMIO, but it doesn't provide any guarantees for multithreading.
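
For cross-thread updates, the tool in D is core.atomic on shared data rather than anything volatile-like. A small sketch, with hits and countAtomically as made-up names:

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;

shared int hits;  // hypothetical counter updated from several threads

// Starts `threads` threads that each increment `hits` `perThread` times,
// then returns the final count.
int countAtomically(int threads, int perThread)
{
    atomicStore(hits, 0);

    Thread[] workers;
    foreach (i; 0 .. threads)
    {
        workers ~= new Thread({
            foreach (n; 0 .. perThread)
                atomicOp!"+="(hits, 1);  // atomic read-modify-write, no lost updates
        });
    }
    foreach (t; workers) t.start();
    foreach (t; workers) t.join();

    return atomicLoad(hits);
}

void main()
{
    assert(countAtomically(4, 1000) == 4000);
}
```

A plain increment on a non-atomic variable, volatile or not, is a separate load and store, so concurrent increments can overwrite each other.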