[SystemSafety] Critical Design Checklist

Les Chambers les at chambers.com.au
Tue Aug 27 22:37:46 CEST 2013


On the subject of human fallibility: humans seem to be at their most vulnerable and fallible in emergency situations. In chemical processing automation the most important device on the control panel was the "panic button". A panic button use case follows: you're enjoying a cup of coffee, watching a perfectly lined-out chemical process do its thing under computer control, when you glance out the window and see a chlorine cloud headed your way. When chlorine hits the atmosphere it combines with moisture to produce hydrochloric acid, which dissolves lungs. Provided the cloud is not between you and the car park, the best action is to run like hell. But before you do, you push the panic button.

Behind that panic button is six to twelve months' worth of software development, in which one or more highly experienced process engineers sat down and figured out, in quiet contemplation, how to bring a chemical processing plant to a safe shutdown state from most predictable operational states. This is one good reason to replace human interaction with automation: for the moments when humans are incapable of interacting logically or, in fact, are not even present because they are in their cars gunning it down the highway or attempting to scale the back fence.

Projecting this idea onto avionics, I was impressed with the Airbus A380 system's ability to deal with the minor disasters in the QF 32 engine explosion scenario. Only the major stuff made it to the cockpit, where a human had to make a decision. Luckily Qantas had five very calm and very competent pilots capable of making good decisions in an emergency.
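For what it's worth, here is a minimal sketch of the idea behind such a panic button. The state names, shutdown steps and actuation interface are invented for illustration; the point is that the months of engineering go into enumerating the predictable operational states and pre-planning a safe shutdown from each one, so that the button itself demands no thought at all.

```python
# Hypothetical sketch only: state names, shutdown steps and the actuation
# interface are invented for illustration, not taken from any real plant.
# The engineering effort lives in enumerating the predictable operational
# states and agreeing, in advance, what a safe shutdown looks like from each.

SHUTDOWN_SEQUENCES = {
    "NORMAL_RUN":     ["close_chlorine_feed", "stop_agitators", "vent_to_scrubber"],
    "STARTUP":        ["close_chlorine_feed", "purge_lines_with_nitrogen"],
    "BATCH_TRANSFER": ["isolate_transfer_pumps", "close_chlorine_feed", "vent_to_scrubber"],
}

# If the plant is in a state nobody predicted, fall back to the most
# conservative sequence rather than doing nothing.
FALLBACK_SEQUENCE = ["close_all_feeds", "vent_to_scrubber", "sound_site_alarm"]


def panic_button_pressed(current_state: str, actuate) -> None:
    """Drive the plant to a safe shutdown state with no further operator input."""
    sequence = SHUTDOWN_SEQUENCES.get(current_state, FALLBACK_SEQUENCE)
    for step in sequence:
        actuate(step)  # each step is a pre-engineered, reviewed action


# Example: the operator hits the button and heads for the car park.
panic_button_pressed("NORMAL_RUN", actuate=print)
```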

-----Original Message-----
From: systemsafety-bounces at lists.techfak.uni-bielefeld.de [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of Driscoll, Kevin R
Sent: Wednesday, August 28, 2013 2:18 AM
To: Peter Bishop; systemsafety at lists.techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Critical Design Checklist

> need to accommodate human fallibility and limitations in knowledge
This touches on a parallel thread here and some internal discussions we had while developing these questions.  How much automation should be put into the avionics to protect against human fallibility?  At what point does adding automation complexity lead to counterproductive additional failure modes?  So, a question here may be:
"How have you determined the balance between accommodating human fallibility and the probability of failure due to additional system complexity?"

> (defence in depth principle)
And, more generally, we have to ask questions about robustness.  How will the system behave when it's outside its design envelope?
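As a purely illustrative sketch (the sensor names, limits and mode names below are made up, not any real avionics interface), the useful property is that "outside the design envelope" becomes an explicit, detectable condition with a defined degraded behaviour:

```python
# Hypothetical sketch only: sensor names, limits and mode names are invented
# for illustration. The point is that "outside the design envelope" should be
# an explicit, detectable condition with a defined degraded behaviour, not an
# implicit assumption baked into the control laws.

DESIGN_ENVELOPE = {
    "airspeed_kts": (130.0, 350.0),
    "altitude_ft":  (0.0, 43_000.0),
}


def within_envelope(readings: dict) -> bool:
    for name, (lo, hi) in DESIGN_ENVELOPE.items():
        value = readings.get(name)
        if value is None or not (lo <= value <= hi):
            return False  # missing or out-of-range data counts as outside the envelope
    return True


def control_mode(readings: dict) -> str:
    # Outside the envelope the normal control laws have no validated basis,
    # so degrade to a simpler, more conservative mode and alert the crew.
    return "NORMAL_MODE" if within_envelope(readings) else "DEGRADED_MODE_ALERT_CREW"


print(control_mode({"airspeed_kts": 210.0, "altitude_ft": 35_000.0}))  # NORMAL_MODE
print(control_mode({"airspeed_kts": 95.0}))                            # DEGRADED_MODE_ALERT_CREW
```

And the interesting trade-off is the one above: whether the extra monitoring and mode logic removes more failure modes than it adds.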

>> 5. Can you name both the individual who will be personally
>> accountable if the design later proves not to meet its safety ...
Hmm, that's interesting.  A more probing question might be:
"What is this person's involvement in design reviews?"
We want to rule out the case where this person is a VP of Engineering who happens to be a PE but has no input into this particular design.

>> What will you do then?
Put it in the risk register.  <half joking>
This is where the organization has to make decisions about risk.  I see our job here as forcing designers to confront these decisions rather than ignoring them or being ignorant of their existence.

> -----Original Message-----
> From: systemsafety-bounces at lists.techfak.uni-bielefeld.de
> [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf
> Of Peter Bishop
> Sent: Tuesday, August 27, 2013 04:42
> To: systemsafety at lists.techfak.uni-bielefeld.de
> Subject: Re: [SystemSafety] Critical Design Checklist
> 
> This may be wandering into the realms of system safety, but I would
> extend 1, 2 because we need to accommodate human fallibility and
> limitations in knowledge by having some kind of fallback or recovery
> strategy.
> 
> A. If there are residual doubts about requirements or implementation, are
> there any alternative systems that can maintain safety? (defence in depth
> principle)
> B. What features exist for identifying malfunctions in operation, and for
> implementing design rectifications over the operating lifetime?
> 
> Peter Bishop
> Adelard LLP
> 
> Martyn Thomas wrote:
> >
> > On 26/08/2013 21:37, Driscoll, Kevin R wrote:
> >>
> >> For NASA, we are creating a Critical Design Checklist:
> >>
> >> • *Objective*
> >>
> >> - *A checklist for designers to help them determine if a
> >>   safety-critical design has met its safety requirements*
> >>
> >>
> > Kevin
> >
> > For this purpose, I interpret your phrase "safety requirements" for a
> > "safety-critical design" as meaning that any system that can be shown
> > to implement the design correctly will meet the safety requirements
> > for such a system in some required operating conditions.
> >
> > Here's my initial checklist:
> >
> > 1. Have you stated the "safety requirements" unambiguously and
> > completely? How do you know? Can you be certain? If not, what is your
> > confidence level and how was it derived?
> > 2. Have you specified unambiguously and completely the range of
> > operating conditions under which the safety requirements must be met?
> > How do you know? Can you be certain? If not, what is your confidence
> > level and how was it derived?
> > 3. Do you have scientifically sound evidence that the safety-critical
> > design meets the safety requirements?
> > 4. Has this evidence been examined by an independent expert and
> > certified to be scientifically sound for this purpose?
> > 5. Can you name both the individual who will be personally
> > accountable if the design later proves not to meet its safety
> > requirements and the organisation that will be liable for any
> > damages?
> > 6. Has the individual signed to accept accountability? Has a Director
> > of the organisation signed to accept liability?
> >
> > Of course, there is a lot of detail concealed within these top-level
> > questions. For example, the specification of operating conditions is
> > likely to contain detail of required training for operators, which
> > will also need to be shown to be adequate.
> >
> > But there's probably no need to go into more detail as you will
> > probably get at least one answer "no" to the top six questions.
> >
> > What will you do then?
> >
> > Regards
> >
> > Martyn
> >
> >
> >
> 
> --
> 
> Peter Bishop
> Chief Scientist
> Adelard LLP
> Exmouth House, 3-11 Pine Street, London, EC1R 0JH
> http://www.adelard.com
> Recep:  +44-(0)20-7832 5850
> Direct: +44-(0)20-7832 5855
_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE


