October 30, 2013
On 10/30/2013 3:30 AM, Russel Winder wrote:
> On Tue, 2013-10-29 at 14:38 -0700, Walter Bright wrote:
> […]
>> I wrote one for DDJ a few years back, "Safe Systems from Unreliable Parts". It's
>> probably scrolled off their system.
>
> Update it and republish somewhere. Remember the cool hipsters think if
> it is over a year old it doesn't exist. And the rest of us could always
> do with a good reminder of quality principles.
>

Good idea. Maybe I should do a followup DDJ article based on the Toyota report.
October 30, 2013
On 10/30/2013 3:01 AM, Chris wrote:
> On Wednesday, 30 October 2013 at 03:24:54 UTC, Walter Bright wrote:
>> Take a look at the reddit thread on this:
>>
>> http://www.reddit.com/r/programming/comments/1pgyaa/toyotas_killer_firmware_bad_design_and_its/
>>
>>
>> Do a search for "failsafe". Sigh.
>
> One of the comments under the original article you posted says
>
> "Poorly designed firmware caused unintended operation, lack of driver training
> made it fatal."
>
> So it's the driver's fault, who couldn't possibly know what was going on in that
> car-gone-mad? To put the blame on the driver is cynicism of the worst kind.

Much effort in cockpit design goes into trying to figure out what the pilot would do "intuitively" and ensuring that that is the right thing to do.

Of course, we try to do that with programming language design, too, with varying degrees of success.

> Unfortunately, that's a common (and dangerous) attitude I've come across among
> programmers and engineers. The user has to adapt to anything they fail to
> implement or didn't think of. However, machines have to adapt to humans not the
> other way around (realizing this was part of Apple's success in UI design,
> Ubuntu is very good now too).
>
> I warmly recommend the book "Architect or Bee":
>
> http://www.amazon.com/Architect-Bee-Human-Technology-Relationship/dp/0896081311/ref=sr_1_1?ie=UTF8&qid=1383127030&sr=8-1&keywords=architect+or+bee
>

October 30, 2013
On Wed, 2013-10-30 at 11:12 -0700, Walter Bright wrote: […]
> Much effort in cockpit design goes into trying to figure out what the pilot would do "intuitively" and ensuring that that is the right thing to do.

I've no experience with cockpit design, but I am aware of all the HCI work that went into air traffic control in the 1980s and 1990s, especially realizing that the safety protocols are socio-political systems as much as computer-realized ones. This sort of safety work is as much about the context and the human actors as about the computer and software.

> Of course, we try to do that with programming language design, too, with varying degrees of success.
[…]

Has any programming language ever had psychology of programming folk involved from the outset rather than after the fact as a "patch up" activity?

-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder@ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

October 30, 2013
On Wednesday, 30 October 2013 at 18:35:44 UTC, Russel Winder wrote:
> On Wed, 2013-10-30 at 11:12 -0700, Walter Bright wrote:
> […]
>> Much effort in cockpit design goes into trying to figure out what the pilot would do "intuitively" and ensuring that that is the right thing to do.
>
> I've no experience with cockpit design, but I am aware of all the HCI
> work that went into air traffic control in the 1980s and 1990s,
> especially realizing that the safety protocols are socio-political
> systems as much as computer-realized ones. This sort of safety work is
> as much about the context and the human actors as about the computer
> and software.
>
>> Of course, we try to do that with programming language design, too, with varying degrees of success.
> […]
>
> Has any programming language ever had psychology of programming folk
> involved from the outset rather than after the fact as a "patch up"
> activity?

Ruby, they say. Even if it's only one programmer they based it on. :-)
October 30, 2013
On Tue, Oct 29, 2013 at 07:14:50PM -0700, Walter Bright wrote: [...]
> The ideas are actually pretty simple. The hard parts are:
> 
> 1. Convincing engineers that this is the right way to do it.

Yeah, if you had said this to me many years ago, I'd have rejected it. Sadly, it's only with hard experience that one comes to acknowledge wisdom.


> 2. Convincing people that improving quality, better testing, hiring better engineers, government licensing for engineers, following MISRA standards, etc., are not the solution. (Note that all of the above were proposed in the HN thread.)

Ha. And yet what do we see companies pouring all that money into? Precisely into improving quality, improving test coverage, inventing better screening for hiring engineers, and in many places, requiring pieces of paper to certify that candidate X has successfully completed program Y sponsored by large corporation Z, which purportedly has a good reputation that therefore (by some inscrutable leap of logic) translates into proof that candidate X is capable of producing better code, which therefore equates to the product being made ... safer? Hmm. Something about that line of reasoning seems fishy. :P

(And don't even get me started on the corporate obsession with standards bearing acronymic buzzword names that purportedly will solve everything from software bugs to world hunger. As though the act of writing the acronym into the company recommended practices handbook [which we all know everybody loves to read and obey, to the letter] will actually change anything.)


> 3. Beating out of engineers the hubris that "this part I designed will never fail!" Jeepers, how often I've heard that.

"This piece of code is so trivial, and so obviously, blatantly correct, that it serves as its own proof of correctness." (Later...) "What do you *mean* the unit tests are failing?!"


> 4. Developing a mindset of "what happens when this part fails in the worst way."

I wish software companies would adopt this mentality. It would save so many headaches I get just from *using* software as an end-user (don't even mention what I have to do at work as a software developer).


> 5. Learning to recognize inadvertent coupling between the primary and backup systems.

If there even *is* a backup system... :P  I think a frighteningly high percentage of enterprise software fails this criterion.


> 6. Being familiar with the case histories of failure of related designs.

They really should put this into the CS curriculum.


> 7. Developing a system to track failures, the resolutions, and check that new designs don't suffer from the same problems. (Much like D's bugzilla, the test suite, and the auto-tester.)

I like how the test suite actually (mostly?) consists of failing cases from actual reported bugs, which the autotester then tests for, thus ensuring that the same bugs don't happen again.

Most software companies have bug trackers, I'm pretty sure, but it's pretty scary how few of them actually have an *automated* system in place to ensure that previously-fixed bugs don't recur. Some places rely on the QA department doing manual testing over some standard checklist that may have no resemblance whatsoever to previously-fixed bugs, as though it's "good enough" that the next patch release (which is inevitably not just a "patch" but a full-on new version packed with new, poorly-tested features) doesn't outright crash on the most basic functionality. Use it for anything more complex than trivial, everyday tasks? With any luck, you'll crash within the first 5 minutes of using the new version just from hitting previously-fixed bugs that have regressed. Which then leads to today's mentality of "let's *not* upgrade until everybody else has crashed the system to bits and the developers have been shamed into fixing the breakage, then maybe things won't break as badly when we do upgrade".
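
As a minimal sketch of that pattern (the bug number and the function below are made up for illustration, not taken from the actual D test suite), a regression test can be as simple as a unittest block pinned to the report it guards against:

    // Hypothetical example: a regression test pinned to a made-up bug report.
    // Once this lands in the test suite, the autotester re-runs it on every
    // change, so the same bug can't silently come back.

    /// Returns the arithmetic mean of the given values; an empty input yields 0.
    double average(const double[] values) pure nothrow @safe
    {
        if (values.length == 0)
            return 0;               // "bug 1234": used to divide by zero here
        double sum = 0;
        foreach (v; values)
            sum += v;
        return sum / values.length;
    }

    unittest
    {
        // Reduced test case from the (hypothetical) bug 1234 report:
        // an empty input used to crash.
        assert(average([]) == 0);
        assert(average([2.0, 4.0]) == 3.0);
    }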

For automated testing to be practical, of course, requires that the system be designed to be tested in that way in the first place -- which unfortunately very few programmers have been trained to do. "Whaddya mean, make my code modular and independently testable? I've a deadline by 12am tonight, and I don't have time for that! Just hardcode the data into the global variables and get the product out the door before the midnight bell strikes; who cares if this thing is testable, as long as the customer thinks it looks like it works!"

Sigh.
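
For what it's worth, here's a contrived sketch of the difference (every name in it is invented for illustration): pull the logic out of the I/O and the globals, and it becomes trivial to test without any product setup at all:

    import std.stdio : File;

    // Testable version: pure logic with its inputs passed in explicitly,
    // instead of reaching into a global config and a log file...
    double applyDiscount(double total, double discountRate) pure nothrow @safe
    {
        return total * (1.0 - discountRate);
    }

    // ...plus a thin wrapper that does the I/O, kept too small to hide bugs.
    void writeReceipt(File output, double total, double discountRate)
    {
        output.writeln("Amount due: ", applyDiscount(total, discountRate));
    }

    unittest
    {
        // No files, no globals, no fixtures -- just the logic under test.
        assert(applyDiscount(100.0, 0.25) == 75.0);
        assert(applyDiscount(0.0, 0.5) == 0.0);
    }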


T

-- 
Take care of your clothes while they are new, and of your health while you are young.
October 30, 2013
On Wednesday, 30 October 2013 at 18:35:44 UTC, Russel Winder wrote:
> Has any programming language ever had psychology of programming folk
> involved from the outset rather than after the fact as a "patch up"
> activity?

Yes, sadly I can't remember the name. Very insightful project. It probably won't be successful in itself, but there is a lot to be learned from the experiment.
October 30, 2013
On 10/30/2013 12:24 PM, H. S. Teoh wrote:
> On Tue, Oct 29, 2013 at 07:14:50PM -0700, Walter Bright wrote:
> Ha. And yet what do we see companies pouring all that money into?
> Precisely into improving quality, improving test coverage, inventing
> better screening for hiring engineers, and in many places, requiring
> pieces of paper to certify that candidate X has successfully completed
> program Y sponsored by large corporation Z, which purportedly has a good
> reputation that therefore (by some inscrutable leap of logic) translates
> into proof that candidate X is capable of producing better code, which
> therefore equates to the product being made ... safer? Hmm. Something
> about that line of reasoning seems fishy. :P

There's still plenty of reason to improve software quality.

I just want to emphasize that failsafe system design is not about improving quality.
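
To make the distinction concrete, here's a deliberately toy sketch (nothing to do with any real throttle controller; all names and thresholds are invented). The failsafe doesn't try to make the primary computation more correct; it assumes the primary *will* misbehave and independently forces a safe state when its output is implausible.

    // The primary controller: however carefully written, assume it can fail.
    double computeThrottle(double pedalPosition)
    {
        // stand-in for arbitrarily complex (and possibly buggy) logic
        return pedalPosition;
    }

    // The failsafe: a separate, much simpler check that shares no logic with
    // the primary and errs on the side of cutting power.
    double failsafeLimit(double commandedThrottle, bool brakePressed) pure nothrow @safe
    {
        if (brakePressed)
            return 0.0;     // the brake always wins
        if (!(commandedThrottle >= 0.0 && commandedThrottle <= 1.0))
            return 0.0;     // out of range or NaN: fail to idle
        return commandedThrottle;
    }

    unittest
    {
        // Whatever garbage the primary produces, the commanded output stays safe.
        assert(failsafeLimit(computeThrottle(5.0), false) == 0.0);
        assert(failsafeLimit(0.3, true) == 0.0);
        assert(failsafeLimit(0.3, false) == 0.3);
    }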

October 30, 2013
On 10/30/2013 11:35 AM, Russel Winder wrote:
> Has any programming language ever had psychology of programming folk
> involved from the outset rather than after the fact as a "patch up"
> activity?

I think they all have. The "patch up" activity comes from discovering that they were wrong :-)

One of my favorite anecdotes comes from the standardized jargon used in aviation. When you are ready to take off, you throttle up to max power first. Hence, the standard jargon for firewalling the throttles was "takeoff power".

This lasted until an incident where the pilot, coming in for a landing, realized he had to abort the landing and go around. He yelled "takeoff power", and the copilot promptly powered down the engines, causing the plane to stall and crash.

"take off power", get it?

The standard phrase was then changed to "full power" or "maximum power", I forgot which.

This all seems so, so obvious in hindsight, doesn't it? But the best minds didn't see it until after there was an accident. This is all too common.
October 30, 2013
And the slashdot version:

http://tech.slashdot.org/story/13/10/29/208205/toyotas-killer-firmware
October 30, 2013
On 10/30/2013 11:01 AM, Chris wrote:
> "Poorly designed firmware caused unintended operation, lack of driver
>  training made it fatal."
> So it's the driver's fault, who couldn't possibly know what was going on
> in that car-gone-mad? To put the blame on the driver is cynicism of the worst kind.
> Unfortunately, that's a common (and dangerous) attitude I've come across
> among programmers and engineers.

There are also misguided end users who believe there cannot be any other way (and sometimes even believe that the big players in the industry are infallible, and hence the user is to blame for any failure).

> The user has to adapt to anything they
> fail to implement or didn't think of. However, machines have to adapt to
> humans not the other way around (realizing this was part of Apple's
> success in UI design,

AFAIK Apple designs are not meant to be adapted. It seems to be mostly marketing.

> Ubuntu is very good now too).

The distribution is not really indicative of the UI/window manager you'll end up using, so what do you mean?