April 24, 2008
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Sean Kelly wrote:
> > == Quote from Walter Bright (newshound1@digitalmars.com)'s article
> >> It wasn't safe anyway, because it wasn't implemented. For now, just use synchronized instead.
> >
> > Um, I thought that the volatile statement effectively turned off optimization in the function containing the volatile block?  This wasn't ideal, but it should have done the trick.
> Just turning off optimization isn't good enough. The processor can reorder things!

Of course it can.  But there are assembly instructions for that bit.  The unaccounted-for problem is/was the compiler.
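To illustrate the two layers being discussed here (reordering by the compiler and reordering by the processor), the following is a minimal sketch in C++11, which postdates this thread and is used only for illustration. All names (`payload`, `ready`, `producer`, `consumer`, `run_fence_demo`) are hypothetical. `std::atomic_thread_fence` constrains both the compiler and the hardware; `std::atomic_signal_fence` would constrain the compiler only.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain data, published through the flag below
std::atomic<bool> ready{false};  // flag, deliberately accessed with relaxed ordering

void producer() {
    payload = 42;
    // Release fence: neither the compiler nor the processor may move the
    // write to payload below the flag store that follows.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();  // spin until the flag is observed
    }
    // Acquire fence: pairs with the release fence in producer(), so the
    // read of payload below is guaranteed to see 42.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Hypothetical driver: runs one producer and one consumer thread.
int run_fence_demo() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    assert(ready.load());
    return seen;
}
```

Without the two fences, the relaxed flag accesses alone would give no ordering guarantee for `payload`, regardless of whether the compiler's optimizer was switched off.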


Sean
April 24, 2008
Sean Kelly wrote:
> Every tool can be mis-used with insufficient understanding.

Of course. But successfully writing multithreaded code that uses shared memory requires a level of expertise that is rare, and the need to write safe multithreaded code is far greater than the expertise available.

Even for those capable of doing it, writing correct multithreaded code is hard, time-consuming, resistant to testing, and essentially impossible to prove correct. It's like writing assembler code with a hex editor.


> Look at shared-
> memory multiprogramming for instance.  It's quite easy and understandable
> to share a few data structures between threads

It is until one of those threads tries to change the data.

> (which I'd assert is the original
> intent anyway), but common practice among non-experts is to use mutexes
> to protect code rather than data, and to call across threads willy-nilly.  It's
> no wonder the commonly held belief is that multiprogramming is hard.

The "multiprogramming is hard" is not based on a misunderstanding. It really is hard.


> Regarding lock-free programming in particular, I think it's worth pointing
> out that leaving out support for lock-free programming in general excludes
> an entire realm of code being written--not only library code to be ultimately
> used by everyday programmers, but kernel code and such as well.  Look at
> the Linux source code, for example.

I agree that lock free programming is important, but volatile doesn't get you there.


>  As for the C++0x discussions, I feel
> that some of the participants of the memory model discussion are experts
> in the field and understand quite well the issues involved.

Yes, there are a handful who do really understand it (Hans Boehm and Herb Sutter come to mind). If only the rest of us were half as smart <g>.
April 24, 2008
Jarrett Billingsley wrote:
> But what about things like accessing memory-mapped registers?  That is, as a hint to the compiler to say "don't inline this; don't cache results in registers"? 

I've written code that uses memory mapped registers. Even in programs that manipulate hardware directly, the percentage of code that does this is vanishingly small. It is a very poor cost/benefit ratio to support such a capability directly.

It's more appropriate to support it via peek/poke methods (which can be builtin compiler intrinsics), or inline assembler.
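As a rough sketch of what peek/poke helpers might look like, here is a C++ version. The function names, the `fake_status_register` variable, and the bit layout are assumptions made for illustration; a real device address would come from the hardware documentation, and a compiler could provide such helpers as intrinsics instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: performs exactly one load from the given register.
// The volatile-qualified access keeps the compiler from caching or eliding it.
inline std::uint32_t peek(const volatile std::uint32_t* reg) {
    assert(reg != nullptr);
    return *reg;
}

// Hypothetical helper: performs exactly one store to the given register.
inline void poke(volatile std::uint32_t* reg, std::uint32_t value) {
    assert(reg != nullptr);
    *reg = value;
}

// Stand-in for a device register so that the sketch runs on a hosted system.
std::uint32_t fake_status_register = 0;

// Sets bit 0 of the stand-in register and returns the value read back.
std::uint32_t set_enable_bit() {
    volatile std::uint32_t* reg = &fake_status_register;
    poke(reg, peek(reg) | 0x1u);  // read-modify-write of bit 0
    return peek(reg);
}
```

Confining the special access semantics to two small functions keeps them out of the type system, which is the trade-off argued for above.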
April 24, 2008
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Sean Kelly wrote:
> > Every tool can be mis-used with insufficient understanding.
> Of course. But successfully writing multithreaded code that uses shared memory requires a level of expertise that is rare, and the need to write safe multithreaded code is far greater than the expertise available. Even for those capable of doing it, writing correct multithreaded code is hard, time-consuming, resistant to testing, and essentially impossible to prove correct. It's like writing assembler code with a hex editor.

I disagree... see below.

> > Look at shared-
> > memory multiprogramming for instance.  It's quite easy and understandable
> > to share a few data structures between threads
> It is until one of those threads tries to change the data.

I suppose I should have been more clear.  An underlying assumption of mine is that no thread maintains references into shared data unless they hold the lock that protects that data.

> > (which I'd assert is the original
> > intent anyway), but common practice among non-experts is to use mutexes
> > to protect code rather than data, and to call across threads willy-nilly.  It's
> > no wonder the commonly held belief is that multiprogramming is hard.
> The "multiprogramming is hard" is not based on a misunderstanding. It really is hard.

My claim is that multiprogramming is hard because the ability to share memory has been mis-used.  It's not hard in general, in my opinion.

> > Regarding lock-free programming in particular, I think it's worth pointing
> > out that leaving out support for lock-free programming in general excludes
> > an entire realm of code being written--not only library code to be ultimately
> > used by everyday programmers, but kernel code and such as well.  Look at
> > the Linux source code, for example.
> I agree that lock free programming is important, but volatile doesn't get you there.

How is it lacking?  I grant that it's very low-level, but it does address the key concern for lock-free programming.

>  >  As for the C++0x discussions, I feel
>  > that some of the participants of the memory model discussion are experts
>  > in the field and understand quite well the issues involved.
> Yes, there are a handful who do really understand it (Hans Boehm and
> Herb Sutter come to mind). If only the rest of us were half as smart <g>.

My personal belief is that the issue is really more a lack of plain old explanation
of the concepts than anything else.  The topic is rarely discussed outside of
research papers, and most other documentation is either confusing or just plain
wrong (the IA-86 memory model spec comes to mind, for example).  Not to
belittle the knowledge or experience of the C++ folks in any respect--this is
simply my experience with the information surrounding the topic :-)


Sean
April 24, 2008
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Jarrett Billingsley wrote:
> > But what about things like accessing memory-mapped registers?  That is, as a hint to the compiler to say "don't inline this; don't cache results in registers"?
> I've written code that uses memory mapped registers. Even in programs
> that manipulate hardware directly, the percentage of code that does this
> is vanishingly small. It is a very poor cost/benefit ratio to support
> such a capability directly.
> It's more appropriate to support it via peek/poke methods (which can be
> builtin compiler intrinsics), or inline assembler.

From my understanding, the problem with doing this via inline assembler is that some compilers can actually optimize inline assembler, leaving no truly portable way to do this in language.  This issue has come up on comp.programming.threads in the past, but I don't remember whether there was any resolution insofar as C++ is concerned.


Sean
April 24, 2008
Sean Kelly wrote:
> From my understanding, the problem with doing this via inline assembler is
> that some compilers can actually optimize inline assembler, leaving no truly
> portable way to do this in language.  This issue has come up on
> comp.programming.threads in the past, but I don't remember whether there
> was any resolution insofar as C++ is concerned.

There's always a way to do it, even if you have to write an external function and link it in. I still don't believe memory mapped register access justifies adding complex language features.
April 24, 2008
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Sean Kelly wrote:
> > From my understanding, the problem with doing this via inline assembler is that some compilers can actually optimize inline assembler, leaving no truly portable way to do this in language.  This issue has come up on comp.programming.threads in the past, but I don't remember whether there was any resolution insofar as C++ is concerned.
> There's always a way to do it, even if you have to write an external function and link it in. I still don't believe memory mapped register access justifies adding complex language features.

Well... I'm looking forward to seeing what you all have planned for multiprogramming in D!


Sean
April 24, 2008
Sean Kelly wrote:
>>> Look at shared-
>>> memory multiprogramming for instance.  It's quite easy and understandable
>>> to share a few data structures between threads
>> It is until one of those threads tries to change the data.
> 
> I suppose I should have been more clear.  An underlying assumption of mine is
> that no thread maintains references into shared data unless they hold the lock
> that protects that data.

The problems with locks are:

1) they are expensive, so people try to optimize them away (grep for "double checked locking")
2) people forget to use the locks
3) deadlocks
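For readers who do grep for "double checked locking": the sketch below shows one commonly cited way to write the pattern correctly, using C++11 atomics (which postdate this thread). The names `Config`, `instance`, `init_mutex`, `constructions` and `get_config` are hypothetical. The broken form of the pattern reads the pointer without any ordering guarantee, which is the optimization trap referred to in point 1.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Config {
    int value = 7;
};

std::atomic<Config*> instance{nullptr};
std::mutex init_mutex;
int constructions = 0;  // only modified while init_mutex is held

// Returns the lazily created Config; the object is intentionally never freed.
Config* get_config() {
    // First check, without the lock. The acquire load pairs with the release
    // store below, so a non-null pointer implies a fully constructed object.
    Config* p = instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(init_mutex);
        // Second check, with the lock held: another thread may have won.
        p = instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config();
            ++constructions;
            instance.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);
    return p;
}
```

The lock is only taken on the slow path, but correctness depends entirely on the acquire/release pair, which is easy to get wrong by hand.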

>>> (which I'd assert is the original
>>> intent anyway), but common practice among non-experts is to use mutexes
>>> to protect code rather than data, and to call across threads willy-nilly.  It's
>>> no wonder the commonly held belief is that multiprogramming is hard.
>> The "multiprogramming is hard" is not based on a misunderstanding. It
>> really is hard.
> 
> My claim is that multiprogramming is hard because the ability to share memory
> has been mis-used.  It's not hard in general, in my opinion.

When people as smart and savvy as Scott Meyers find it confusing, it's confusing. (Scott Meyers wrote the definitive paper on double-checked locking, and what's wrong with it.) Heck, I have a hard enough time explaining the difference between const and invariant; how is memory coherency going to go down? <g>


>>> Regarding lock-free programming in particular, I think it's worth pointing
>>> out that leaving out support for lock-free programming in general excludes
>>> an entire realm of code being written--not only library code to be ultimately
>>> used by everyday programmers, but kernel code and such as well.  Look at
>>> the Linux source code, for example.
>> I agree that lock free programming is important, but volatile doesn't
>> get you there.
> 
> How is it lacking?  I grant that it's very low-level, but it does address the key
> concern for lock-free programming.

volatile actually puts locks around accesses (at least in the Java memory model it does). So, it doesn't get you lock-free programming. Just avoiding caching of reloads is not the key to lock-free programming. There's the ordering problem.
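To make the distinction concrete, here is a minimal C++11 sketch of a lock-free update, which needs an atomic read-modify-write operation with defined ordering rather than merely uncached loads. C++11 postdates this thread, and the names `hits`, `record_hit` and `run_cas_demo` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> hits{0};

// Lock-free increment built from a compare-and-swap loop.
void record_hit() {
    int seen = hits.load(std::memory_order_relaxed);
    while (!hits.compare_exchange_weak(seen, seen + 1,
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
        // On failure, seen has been refreshed with the current value; retry.
    }
}

// Hypothetical driver: four threads each record 1000 hits.
int run_cas_demo() {
    const int kThreads = 4;
    const int kPerThread = 1000;
    hits.store(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < kThreads; ++t) {
        workers.emplace_back([kPerThread] {
            for (int i = 0; i < kPerThread; ++i) {
                record_hit();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    const int total = hits.load();
    assert(total == kThreads * kPerThread);
    return total;
}
```

A plain `int` that is merely reloaded on every access would lose updates here, because the load, add and store of two threads can interleave.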


>>  >  As for the C++0x discussions, I feel
>>  > that some of the participants of the memory model discussion are experts
>>  > in the field and understand quite well the issues involved.
>> Yes, there are a handful who do really understand it (Hans Boehm and
>> Herb Sutter come to mind). If only the rest of us were half as smart <g>.
> 
> My personal belief is that the issue is really more a lack of plain old explanation
> of the concepts than anything else.  The topic is rarely discussed outside of
> research papers, and most other documentation is either confusing or just plain
> wrong (the IA-86 memory model spec comes to mind, for example).  Not to
> belittle the knowledge or experience of the C++ folks in any respect--this is
> simply my experience with the information surrounding the topic :-)

I've seen many attempts at explaining it, including presentations by Herb Sutter himself. Sorry, but most of the audience doesn't get it.

I attended a conference a couple years back on what to do about adding multithreading support to C++. There were about 30 attendees, pretty much the top guys in C++ programming, including Herb Sutter and Hans Boehm. Herb and Hans did most of the talking, and the rest of us sat there wondering "what's a cubit". Things have improved a bit since then, but it's pretty clear that the bulk of programmers are never going to get it, and getting mp programs to work will have the status of a black art.

What's needed is something like what garbage collection did for memory management. The language has to take care of synchronization *automatically*. Being D, of course there will be a way for the sorcerers to practice the black art, but for the rest of us there needs to be a reliable and reasonable alternative.
April 24, 2008
== Quote from Walter Bright (newshound1@digitalmars.com)'s article
> Sean Kelly wrote:
> >>> Look at shared-
> >>> memory multiprogramming for instance.  It's quite easy and understandable
> >>> to share a few data structures between threads
> >> It is until one of those threads tries to change the data.
> >
> > I suppose I should have been more clear.  An underlying assumption of mine is that no thread maintains references into shared data unless they hold the lock that protects that data.
> The problems with locks are:
> 1) they are expensive, so people try to optimize them away (grep for
> "double checked locking")
> 2) people forget to use the locks
> 3) deadlocks

1) The cost of acquiring or committing a lock is generally roughly equivalent to
     a memory synchronization, and sometimes less than that (futexes, etc).  So
     it's not insignificant, but also not as bad as people seem to think.  I suspect
     that locked operations are often subject to premature optimization.
2) If locking is built into the API then they can't forget.
3) Deadlocks aren't typically an issue with the approach I described above because
     it largely eliminates the chance that the programmer will call into unknown code
     while holding a lock.
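One way to read "locking is built into the API" is a wrapper that only hands out the data while its lock is held. The following C++11 sketch is an assumption about what such an API could look like, not a description of any existing library; `Guarded`, `with_lock` and `run_guarded_demo` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical wrapper: the data is private, so callers cannot touch it
// without going through with_lock().
template <typename T>
class Guarded {
public:
    explicit Guarded(T initial) : value_(std::move(initial)) {}

    // Runs f on the protected value while holding the lock.
    template <typename F>
    auto with_lock(F f) -> decltype(f(std::declval<T&>())) {
        std::lock_guard<std::mutex> lock(mutex_);
        return f(value_);
    }

private:
    std::mutex mutex_;
    T value_;
};

// Hypothetical driver: four threads each increment the counter 1000 times.
int run_guarded_demo() {
    const int kThreads = 4;
    const int kPerThread = 1000;
    Guarded<int> counter(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < kThreads; ++t) {
        workers.emplace_back([&counter, kPerThread] {
            for (int i = 0; i < kPerThread; ++i) {
                counter.with_lock([](int& n) { ++n; });
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    const int total = counter.with_lock([](int& n) { return n; });
    assert(total == kThreads * kPerThread);
    return total;
}
```

The sketch still relies on the stated assumption that no thread keeps a reference into the shared data after the lock is released: a callback could copy out a pointer to the value, and nothing in the language stops it.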

I do think that locks stink as a general multiprogramming tool, but they can be
quite useful in implementing more complex multiprogramming tools, if nothing
else.  Also, they can be about the fastest option in some cases, and this can be
important.  For example, locks are much faster than transactional memory--they
just introduce problems like priority inversion and deadlock (fun fun).  That said,
transactional memory can result in livelock, so neither is a clear win.

> >>> (which I'd assert is the original
> >>> intent anyway), but common practice among non-experts is to use mutexes
> >>> to protect code rather than data, and to call across threads willy-nilly.  It's
> >>> no wonder the commonly held belief is that multiprogramming is hard.
> >> The "multiprogramming is hard" is not based on a misunderstanding. It really is hard.
> >
> > My claim is that multiprogramming is hard because the ability to share memory has been mis-used.  It's not hard in general, in my opinion.
> When people as smart and savvy as Scott Meyers find it confusing, it's confusing. (Scott Meyers wrote the definitive paper on double-checked locking, and what's wrong with it.) Heck, I have a hard enough time explaining the difference between const and invariant; how is memory coherency going to go down? <g>

Fair enough :-)

> >>> Regarding lock-free programming in particular, I think it's worth pointing
> >>> out that leaving out support for lock-free programming in general excludes
> >>> an entire realm of code being written--not only library code to be ultimately
> >>> used by everyday programmers, but kernel code and such as well.  Look at
> >>> the Linux source code, for example.
> >> I agree that lock free programming is important, but volatile doesn't get you there.
> >
> > How is it lacking?  I grant that it's very low-level, but it does address the key concern for lock-free programming.
> volatile actually puts locks around accesses (at least in the Java memory model it does). So, it doesn't get you lock-free programming. Just avoiding caching of reloads is not the key to lock-free programming. There's the ordering problem.

I must be missing something... I thought 'volatile' addressed compiler reordering as well?  That aside, I do think that the implementation of 'volatile' in D 1.0 is too complicated for the average programmer to use correctly and thus may not be the perfect solution for D, but I also think that it solves the language/compiler part of the problem.

> >>  >  As for the C++0x discussions, I feel
> >>  > that some of the participants of the memory model discussion are experts
> >>  > in the field and understand quite well the issues involved.
> >> Yes, there are a handful who do really understand it (Hans Boehm and
> >> Herb Sutter come to mind). If only the rest of us were half as smart <g>.
> >
> > My personal belief is that the issue is really more a lack of plain old explanation
> > of the concepts than anything else.  The topic is rarely discussed outside of
> > research papers, and most other documentation is either confusing or just plain
> > wrong (the IA-86 memory model spec comes to mind, for example).  Not to
> > belittle the knowledge or experience of the C++ folks in any respect--this is
> > simply my experience with the information surrounding the topic :-)
> I've seen many attempts at explaining it, including presentations by
> Herb Sutter himself. Sorry, but most of the audience doesn't get it.
> I attended a conference a couple years back on what to do about adding
> multithreading support to C++. There were about 30 attendees, pretty
> much the top guys in C++ programming, including Herb Sutter and Hans
> Boehm. Herb and Hans did most of the talking, and the rest of us sat
> there wondering "what's a cubit". Things have improved a bit since then,
> but it's pretty clear that the bulk of programmers are never going to
> get it, and getting mp programs to work will have the status of a black art.
> What's needed is something like what garbage collection did for memory
> management. The language has to take care of synchronization
> *automatically*. Being D, of course there will be a way for the
> sorcerers to practice the black art, but for the rest of us there needs
> to be a reliable and reasonable alternative.

I very much agree.  My real interest in preserving the black arts in D is so that
library developers can produce code which solves these problems in a more
elegant manner, whatever that may be.  I don't have any expectation that the
average programmer would ever want or need to use something like 'volatile'
or even ordered atomics.  It's far too low-level of a solution to the problem at
hand.  However, if this can be accomplished without any language facilities at
all then I'm all for it.  I simply don't want to have to rely on compiler-specific
knowledge when writing code, be it at a high level or a low level.


Sean
April 24, 2008
Sean Kelly wrote:
> 2) If locking is built into the API then they can't forget.

Sure they can forget. All the memory in the process can be accessed by any thread, so it's easy to share globals (for example) without locking of any sort.