December 17, 2009
On Thu, 17 Dec 2009 00:51:48 +0000, dsimcha wrote:

> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail@erdani.org)'s
> article
>> But let's not forget we have concurrency ahead of us. I encourage you
>> all to chime in with your thoughts and ideas regarding all aspects of
>> concurrency. The recent multicore performance bug is a great starting
>> point. If you try e.g. shared and it's broken, let us know. If you try
>> it and it works, push it til it breaks. If you have ideas on how to
>> make semantic checking better, pipe up.
>> Andrei
> 
> I think for this to happen, there needs to be a tutorial somewhere explaining what shared is supposed to do (there were so many ideas thrown around that I don't remember them all and don't know which ones got adopted), and how much of it is already implemented.

I agree. The whole thing is very confused right now, and certainly nothing I try in my code works out. The only way I can use threads right now is to avoid use of shared altogether. If anyone knows how it is supposed to work now, please write out a description.

December 18, 2009
On Thu, 17 Dec 2009 03:04:59 +0000, Graham St Jack wrote:

> On Thu, 17 Dec 2009 01:13:36 +0100, grauzone wrote:
> 
>> Andrei Alexandrescu wrote:
>>> But let's not forget we have concurrency ahead of us. I encourage you all to chime in with your thoughts and ideas regarding all aspects of concurrency. The recent multicore performance bug is a great starting point. If you try e.g. shared and it's broken, let us know. If you try it and it works, push it til it breaks. If you have ideas on how to make semantic checking better, pipe up.
>> 
>> There's a guy on the NG who posts once in a while how broken shared is and how he has to use __gshared instead. His threads are being pretty much ignored.
>> 
>> And now?
>> 
>> 
>>> Andrei
> 
> I'm certainly in that category. I will be trying (soon) to put a post together that sets out my issues with 'shared'.

Here is my attempt to set out my problems with the "shared" keyword as it now stands.


First the good part: writing multi-threaded programs is not easy, and anything the language and compiler can do to make it easier is good.

Now for the way I approach writing multi-threaded programs:

Keep threads apart from each other wherever possible. Specifically, don't let them access or modify the same data except in well-understood and carefully controlled ways.

My preferred way of doing this is for threads to share only a very small amount of data, all of which is mutex protected, and carefully designed to be safe to access by multiple threads. My favourite kind of these is a templated queue (like a "go" channel). Any data passed out from such a "shareable object" has to be either immutable or cloned, so that each thread can rely on its data not being trampled on by other threads.
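
A minimal sketch of the kind of mutex-protected, channel-like queue described above. The name `Channel` and its `put`/`take` methods are hypothetical, not from the post. It deliberately does not use `shared`, so its safety rests on convention only: the element type is expected to be immutable or passed by value.

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical channel-like queue; the name and API are illustrative only.
final class Channel(T)
{
    private T[] items;
    private Mutex mutex;
    private Condition nonEmpty;

    this()
    {
        mutex = new Mutex;
        nonEmpty = new Condition(mutex);
    }

    // Append a value and wake one waiting consumer.
    void put(T value)
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        items ~= value;
        nonEmpty.notify();
    }

    // Block until a value is available, then remove and return it.
    T take()
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        while (items.length == 0)
            nonEmpty.wait();
        T value = items[0];
        items = items[1 .. $];
        return value;
    }
}

unittest
{
    // string is immutable(char)[], so the consumer cannot be trampled on.
    auto ch = new Channel!string;
    auto producer = new Thread({ ch.put("hello"); });
    producer.start();
    assert(ch.take() == "hello");
    producer.join();
}
```

Only the queue object itself is touched by both threads; everything passed through it is immutable.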

And my wish-list:

What I would like is for D to provide a clean way for me to be able to say "this object and all its methods are nice and safe to access from multiple threads", and to also say "all instances of this type are immutable".

And I also want the compiler to tell me if it thinks I have multiple threads accessing data in an unsafe way.


Finally my issues with D as it stands:

Immutable or const objects are a real pain because I can't have a mutable reference to them. Rebindable doesn't seem to work, and I haven't been able to make a version that works well enough.

Immutable types don't add any value because you have to keep stating that objects of them are immutable everywhere. Having first-class immutable types would make it MUCH easier to reap the benefits of immutable data.

I can't currently see a use for the shared keyword as it stands. It seems to me that what is needed is a keyword more like "shareable", meaning that this object (or data or function?) can be safely accessed by multiple threads. It should be an error to access a non-shareable object with multiple threads.

It should also be an error to claim that something is shareable unless it meets some well thought out criteria that the compiler can check. I haven't thought these out yet, but some candidates I like the look of are:
* All outputs from the object must be immutable or passed by value (cloned).
* All externally accessible methods must be synchronized on the object's monitor.
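
As an illustration of the two candidate criteria, here is a small sketch of a class that follows them by hand. The type `NameList` and its methods are hypothetical. Nothing here is checked by the compiler, which is exactly the gap the post describes.

```d
// Hypothetical "shareable" type that follows the two candidate rules manually.
final class NameList
{
    private string[] names;

    // Every externally accessible method locks the object's own monitor.
    void add(string name)
    {
        synchronized (this)
        {
            names ~= name;
        }
    }

    // Output is a copy (clone) of the array, and its elements are immutable strings.
    string[] snapshot()
    {
        synchronized (this)
        {
            return names.dup;
        }
    }
}

unittest
{
    auto list = new NameList;
    list.add("a");
    auto copy = list.snapshot();
    copy[0] = "changed";              // modifies only the caller's copy
    assert(list.snapshot() == ["a"]); // internal state is untouched
}
```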

I'm happy that some sort of back-door override like __gshared has to be there for those cases that are actually ok, but only because of some grubby detail that the compiler can't figure out.
December 18, 2009
On 2009-12-17 23:38:58 -0500, Graham St Jack <Graham.StJack@internode.on.net> said:

> Immutable or const objects are a real pain because I can't have a mutable
> reference to them. Rebindable doesn't seem to work, and I haven't been
> able to make a version that works well enough.

In my observation, just like the most useful string type is defined as immutable(char)[] and thus is rebindable, most of the immutable objects I need also need to be rebindable. Rebindable being the most needed case, it means that I almost always need to write Rebindable!(immutable Object) for my immutable or const objects, which is a pain.

I was wondering yesterday whether being rebindable shouldn't just be the default for objects. The rule would be that reference types are rebindable unless marked with final. So you would have:

	const(Object)           o1; // rebindable
	const(final Object)     o2; // not rebindable
	immutable(Object)       o3; // rebindable
	immutable(final Object) o4; // not rebindable

Final would have no effect on non-reference types (anything but classes and interfaces). I can see how it isn't necessarily coherent with the rest of the const system for value types, but reference types are different from value types too.

In essence, I just want the most frequently used type (a rebindable object) to be easier to write than Rebindable!(immutable Object), so I'm open to other ideas too.

As for a Rebindable that works, I've made one. Unfortunately, use of alias this is broken because of this bug <http://d.puremagic.com/issues/show_bug.cgi?id=3626>, so for now you always need to use the "get" member for comparing to null and other things that don't start with a dot.


private template Rebindable(T) if (is(T == class) || isArray!(T))
{
    static if (!is(T X == const(U), U) && !is(T X == immutable(U), U))
    {
        alias T Rebindable;
    }
    else static if (isArray!(T))
    {
        alias const(ElementType!(T))[] Rebindable;
    }
    else
    {
        struct Rebindable
        {
            private U stripped;

            void opAssign(T another)
            {
                stripped = cast(U) another;
            }

            this(T value)
            {
                this = value;
            }

            T get() const
            {
                return cast(T) stripped;
            }

            T opDot() const
            {
                return get;
            }
        }
    }
}
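
For comparison, a short usage sketch of the rebinding behaviour discussed here. It assumes the `Rebindable` that ships in present-day `std.typecons`, not the template above. The class `Widget` is hypothetical.

```d
import std.typecons : Rebindable;

// Hypothetical class used only for this illustration.
class Widget
{
    int id;
    this(int id) immutable { this.id = id; }
}

unittest
{
    immutable Widget a = new immutable Widget(1);
    immutable Widget b = new immutable Widget(2);

    // A plain immutable class reference cannot be reassigned:
    // a = b; // rejected by the compiler

    Rebindable!(immutable Widget) r = a;
    assert(r.id == 1);
    r = b;                // rebinding the reference is allowed
    assert(r.id == 2);
    assert(r.get is b);   // the referenced object itself stays immutable
}
```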

-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

December 24, 2009
Andrei Alexandrescu wrote:

> Jason House wrote:
>> Andrei Alexandrescu Wrote:
>> 
>>> But let's not forget we have concurrency ahead of us. I encourage you all to chime in with your thoughts and ideas regarding all aspects of concurrency. The recent multicore performance bug is a great starting point. If you try e.g. shared and it's broken, let us know. If you try it and it works, push it til it breaks. If you have ideas on how to make semantic checking better, pipe up.
>> 
>> I posted several shared issues to the NG a few days ago and got no replies. Most of it was about poor error messages (both misleading text and missing file/line numbers). I should do proper bugzilla entries but haven't tinkered with D much since. As always, this post comes from my phone instead of a proper computer, or else I'd provide a link or do the bugzilla entries while I'm thinking of it.
>> 
>> IIRC, the most troublesome error happened when I created "shared this(){...}" constructor and no information on where it caused errors.
> 
> Thanks, Jason. I did see those messages but I didn't want to dilute focus back when you posted them.
> 
> If you could put together bugzilla entries as you try things, that would be great. The shared constructor I'll experiment with first.

I have reproduced and reported most of the issues I had hit with shared. If it's any consolation, I was able to convert my message passing queues to use shared, but hit issues when expanding to cover other uses of shared data within my code base.  There may also be an issue with what a shared delegate is supposed to be and what can be inside of one, but I haven't really worried about that too much.  It's a hairy issue and partly my fault for using delegates as messages between threads.  When sending a message, I cast the delegate to a shared delegate even though it accesses immutable data and has no side effects except on thread-local data in the receiving thread.

bugzilla 3640
     shared this() constructor does not work and reports strange errors
     without line numbers

bugzilla 3641 keywords: rejects-valid
    alias shared T U does not work

bugzilla 3642 keywords: diagnostic
    Poor error message when using shared:
    function ___ not callable with argument types ___

December 24, 2009
Jason House wrote:

> bugzilla 3640
>      shared this() constructor does not work and reports strange errors
>      without line numbers
> 
> bugzilla 3641 keywords: rejects-valid
>     alias shared T U does not work
> 
> bugzilla 3642 keywords: diagnostic
>     Poor error message when using shared:
>     function ___ not callable with argument types ___

I'll add to that list:
bugzilla 3091 keywords: rejects-valid
     "auto x = new shared foo" does not compile
December 24, 2009
Jason House wrote:
> Andrei Alexandrescu wrote:
> 
>> Jason House wrote:
>>> Andrei Alexandrescu Wrote:
>>>
>>>> But let's not forget we have concurrency ahead of us. I encourage you
>>>> all to chime in with your thoughts and ideas regarding all aspects of
>>>> concurrency. The recent multicore performance bug is a great starting
>>>> point. If you try e.g. shared and it's broken, let us know. If you try
>>>> it and it works, push it til it breaks. If you have ideas on how to make
>>>> semantic checking better, pipe up.
>>> I posted several shared issues to the NG a few days ago and got no
>>> replies. Most of it was about poor error messages (both misleading text
>>> and missing file/line numbers). I should do proper bugzilla entries but
>>> haven't tinkered with D much since. As always, this post comes from my
>>> phone instead of a proper computer, or else I'd provide a link or do the
>>> bugzilla entries while I'm thinking of it.
>>>
>>> IIRC, the most troublesome error happened when I created "shared
>>> this(){...}" constructor and no information on where it caused errors.
>> Thanks, Jason. I did see those messages but I didn't want to dilute
>> focus back when you posted them.
>>
>> If you could put together bugzilla entries as you try things, that would
>> be great. The shared constructor I'll experiment with first.
> 
> I have reproduced and reported most of the issues I had hit with shared. If it's any consolation, I was able to convert my message passing queues to use shared, but hit issues when expanding to cover other uses of shared data within my code base.  There may also be an issue with what a shared delegate is supposed to be and what can be inside of one, but I haven't really worried about that too much.  It's a hairy issue and partly my fault for using delegates as messages between threads.  When sending a message, I cast the delegate to a shared delegate even though it accesses immutable data and has no side effects except on thread-local data in the receiving thread.
> 
> bugzilla 3640
>      shared this() constructor does not work and reports strange errors
>      without line numbers
> 
> bugzilla 3641 keywords: rejects-valid
>     alias shared T U does not work
> 
> bugzilla 3642 keywords: diagnostic
>     Poor error message when using shared:     function ___ not callable with argument types ___
> 

Great. I just got word from Walter that he fixed a lot of shared bugs (most reported and some probably not yet reported), so I expect the situation to get considerably better with the next minor release.

Andrei
December 24, 2009
On Thu, 24 Dec 2009 08:13:53 +0300, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Jason House wrote:
>> Andrei Alexandrescu wrote:
>>
>>> Jason House wrote:
>>>> Andrei Alexandrescu Wrote:
>>>>
>>>>> But let's not forget we have concurrency ahead of us. I encourage you
>>>>> all to chime in with your thoughts and ideas regarding all aspects of
>>>>> concurrency. The recent multicore performance bug is a great starting
>>>>> point. If you try e.g. shared and it's broken, let us know. If you try
>>>>> it and it works, push it til it breaks. If you have ideas on how to make
>>>>> semantic checking better, pipe up.
>>>> I posted several shared issues to the NG a few days ago and got no
>>>> replies. Most of it was about poor error messages (both misleading text
>>>> and missing file/line numbers). I should do proper bugzilla entries but
>>>> haven't tinkered with D much since. As always, this post comes from my
>>>> phone instead of a proper computer, or else I'd provide a link or do the
>>>> bugzilla entries while I'm thinking of it.
>>>>
>>>> IIRC, the most troublesome error happened when I created "shared
>>>> this(){...}" constructor and no information on where it caused errors.
>>> Thanks, Jason. I did see those messages but I didn't want to dilute
>>> focus back when you posted them.
>>>
>>> If you could put together bugzilla entries as you try things, that would
>>> be great. The shared constructor I'll experiment with first.
>>  I have reproduced and reported most of the issues I had hit with shared. If it's any consolation, I was able to convert my message passing queues to use shared, but hit issues when expanding to cover other uses of shared data within my code base.  There may also be an issue with what a shared delegate is supposed to be and what can be inside of one, but I haven't really worried about that too much.  It's a hairy issue and partly my fault for using delegates as messages between threads.  When sending a message, I cast the delegate to a shared delegate even though it accesses immutable data and has no side effects except on thread-local data in the receiving thread.
>>  bugzilla 3640
>>      shared this() constructor does not work and reports strange errors
>>      without line numbers
>>  bugzilla 3641 keywords: rejects-valid
>>     alias shared T U does not work
>>  bugzilla 3642 keywords: diagnostic
>>     Poor error message when using shared:     function ___ not callable with argument types ___
>>
>
> Great. I just got word from Walter that he fixed a lot of shared bugs (most reported and some probably not yet reported), so I expect the situation to get considerably better with the next minor release.
>
> Andrei

I'll add my experience with shared/local separation.

Shared/local as it is now relies solely on convention. For example, I have written a large, heavily multithreaded application in D2 without any use of shared (i.e. everything is "local", yet pointers are freely passed among threads).

This is because the stdlib is not shared-aware at all, starting with druntime: thread creation is "broken" in the sense that it allows a local this pointer to be passed to another thread through a delegate:

import core.thread : Thread;

class Foo {
    void bar() { /* do stuff */ }
    void createNewThread() {
        auto thread = new Thread(&bar); // local this pointer is bound to delegate
        thread.start(); // "this" is now accessible from 2 different threads (current one, and a new one)
    }
}

A simple fix would be to disallow starting a new thread with a delegate, allowing free function pointers only.
Another option would be to introduce "shared delegates": delegates that have a shared context pointer.
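
A sketch of the "free function pointers only" idea, using `spawn` from `std.concurrency`. That module is an assumption here: it belongs to later Phobos releases and was not available to the posters. The function `worker` is hypothetical. To the best of my knowledge, `spawn` refuses arguments that carry unshared mutable aliasing, which is why the data is passed as an immutable slice.

```d
import std.concurrency : Tid, receiveOnly, send, spawn, thisTid;

// Free function used as the thread entry point: there is no hidden "this" pointer.
void worker(Tid owner, immutable(int)[] data)
{
    int sum = 0;
    foreach (x; data)
        sum += x;
    owner.send(sum); // the result travels back by value
}

unittest
{
    immutable(int)[] data = [1, 2, 3];
    spawn(&worker, thisTid, data);
    assert(receiveOnly!int() == 6);
}
```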

I'd suggest starting with fixing signatures of those OS API functions that deal with creating new thread. For example, there are _beginthreadex (Windows) and pthread_create (POSIX) functions that are declared in druntime roughly like this:

extern (C) uintptr_t _beginthreadex(void* /+security,+/, uint /+stack_size+/, uint function(void*) /+start_routine+/, void* /+arg+/, uint /+initflag+/, uint* /+thrdaddr+/); // added variable names to make it more clear

int pthread_create(pthread_t* /+thread+/, in pthread_attr_t* /+attr+/, void* function(void*) /+start_routine+/, void* /+arg+/);

(there are a lot more functions like this, of course)

The arg parameter is passed from one thread to another, and thus *must* be marked as shared (or unique). The same applies to start_routine (it should be "void function(shared void*) start_routine" instead).
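
A small sketch of what the proposed signature would buy. `StartRoutine`, `startThread`, `bump` and `counter` are all hypothetical names, and the wrapper calls the routine inline instead of creating a thread, purely to keep the example self-contained. The point is only that a thread-local pointer no longer converts to the argument type.

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical entry-point type using the proposed shared-qualified argument.
alias StartRoutine = void function(shared(void)* arg);

// Hypothetical wrapper; a real one would forward to pthread_create or _beginthreadex.
void startThread(StartRoutine routine, shared(void)* arg)
{
    routine(arg); // called inline here; no thread is actually created
}

shared int counter;

void bump(shared(void)* arg)
{
    atomicOp!"+="(*cast(shared(int)*) arg, 1);
}

unittest
{
    startThread(&bump, &counter); // shared(int)* converts to shared(void)*
    assert(atomicLoad(counter) == 1);

    int local;
    // A pointer to thread-local data is rejected at compile time.
    static assert(!__traits(compiles, startThread(&bump, &local)));
}
```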

This will break a lot of code but will at least try to enforce program correctness.
December 24, 2009
Denis Koroskin wrote:

> I'll add my experience with shared/local separation.
> 
> Shared/local as it is now relies solely on convention. For example, I have written a large, heavily multithreaded application in D2 without any use of shared (i.e. everything is "local", yet pointers are freely passed among threads).

That was my first experience with shared as well.  I submitted a ticket [1] to druntime for it over 7 months ago.  I'm not sure what the current status of the threading rewrite is.  I know Bartosz worked on it until he hit a nasty bug which Don later fixed.

[1] http://www.dsource.org/projects/druntime/ticket/23

December 24, 2009
Andrei Alexandrescu wrote:

> Jason House wrote:
>> bugzilla 3640
>>      shared this() constructor does not work and reports strange errors
>>      without line numbers
>> 
>> bugzilla 3641 keywords: rejects-valid
>>     alias shared T U does not work
>> 
>> bugzilla 3642 keywords: diagnostic
>>     Poor error message when using shared:
>>     function ___ not callable with argument types ___
>> 
> 
> Great. I just got word from Walter that he fixed a lot of shared bugs (most reported and some probably not yet reported), so I expect the situation to get considerably better with the next minor release.
> 
> Andrei

The commit logs show that bugzilla 3641 was fixed.  I'd appreciate it if someone could translate what other shared coding constructs were fixed by Walter's changes.

Even a partial fix to 3640 to include line numbers would probably help people.  My grep results below show where the message is being generated.  I see there's a variant of error that accepts a location parameter, but this error doesn't use it.  What surprises me is that it looks like nearly every call to error lacks a location parameter.  I'm confused why this would be. I don't normally notice a lack of line numbers in most error messages I see from dmd.

/usr/local/src/dmd/cast.c:104:    error("cannot implicitly convert expression (%s) of type %s to %s",

Maybe the biggest help for those converting to shared would be inclusion of why a particular variable is shared and causing the error.  Shared can be viral, and exactly how a piece of code is getting called in shared context can occasionally be a bit unclear.  I think there was a related patch for inclusion of stack traces with templated code?
December 26, 2009
Jason House wrote:
> Maybe the biggest help for those converting to shared would be inclusion of why a particular variable is shared and causing the error.  Shared can be viral, and exactly how a piece of code is getting called in shared context can occasionally be a bit unclear.  I think there was a related patch for inclusion of stack traces with templated code?
The stack trace patch is in.