[SystemSafety] Critical Design Checklist

Tue Aug 27 13:55:59 CEST 2013

How convenient!

The German national committee concerned with E/E/PE Functional Safety is currently assembling comments about any need for revision of IEC 61508 (the "maintenance" cycle is expected to start in spring 2014). I'll pass these comments along to colleagues, if it's all right with everyone. (Comments don't have to be formal - we will formalise them as necessary.)

PBL

Prof. Peter Bernard Ladkin, University of Bielefeld and Causalis Limited

On 27 Aug 2013, at 13:49, "Les Chambers" <les at chambers.com.au> wrote:

> 1. Describe your process approach to hazard analysis and requirements definition.
> 2. Highlight all hazards based on operational experience and past history of accidents.
> 3. Can you provide some background on the people involved in the process? Can you present evidence that they were capable of recognising credible hazards? (Right answer: yes, we had 20 people with a sum total of 300 years' experience in the application domain. Wrong answer: we had this groovy consultant with a checklist. He sounded like he knew what he was doing.)
> 4. What is your strategy for demonstrating that all the safety requirements have been satisfied in the design?
> 5. How have you modelled the application domain? In other words, how well do you understand how it works?
> 6. What is your strategy for demonstrating your system's response to unsafe conditions in the application domain?
> 7. How much of your design depends on human intervention to mitigate safety hazards in high-stress emergency environments?
> 8. What design measures have you taken to prevent unsafe system maintenance from destroying the safety integrity of your system?
> 9. What measures have you taken to establish the safety integrity of third-party and legacy software?
> 10. To what degree does your system depend on the accuracy of configuration data? What measures does your design take to secure its integrity?
> 11. How does your design deal with integrating third-party interfaces?
> 12. How does your design support observability and testability, with a particular focus on regression testing? What elements of your design specifically support ongoing maintenance by an organisation other than the development team?
>  
> Failure scenarios (puerile stuff from bitter experience)
> 1. Disconnect and reconnect network cabling from a random selection of points in your system.
> 2. Simulate catastrophic system failure and evaluate your operations staff's response.
> 3. Simulate power failure and brownouts.
> 4. Are your backup systems the same height above sea level?
> 5. Evaluate the attitudes and experience of the people responsible for ongoing operations and maintenance. What is their attitude to safety? How many hours of safety training have they had? How many years of safety-related system experience have they had?
> 6. Turn off the air conditioner.
> 7. Activate the halon/deluge system.
> 8. How are your heatsinks kept in contact with your integrated circuit chips? They're not glued, I hope.
>  
> Questions I don't want to be asked:
> 1. Did you fully regression test this system after the last modification?
> 2. With reference to that safety-critical third-party software you integrated, and in respect of your claim that it is proven in use with 10 million failure-free operational hours: please provide evidence that the exact same code executed on the exact same hardware configuration (an environment identical to your target environment) for each and every one of those 10 million hours.
> 3. Provide evidence that you stress tested that system at 150% of its nameplate capacity.
>  
> I guess that will do; I've frightened myself.
> Cheers
> Les
>  
>  
> From: systemsafety-bounces at lists.techfak.uni-bielefeld.de [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of Driscoll, Kevin R
> Sent: Tuesday, August 27, 2013 6:38 AM
> To: systemsafety at techfak.uni-bielefeld.de
> Subject: [SystemSafety] Critical Design Checklist
>  
> For NASA, we are creating a Critical Design Checklist:
> • Objective
>   - A checklist for designers to help them determine if a safety-critical design has met its safety requirements
>   - Not a “Have you done ...” checklist
>     * Too easy to just check “yes” without doing sufficient work
>     * Instead, “What have you done ...”
>     * Prove what you have done is sufficient
> • We are looking for inputs to include in this checklist
> • Do you have any inputs that should be included?
>   - Meta-question: “If you were asked to participate in a design review of a safety-critical design, what questions would you ask?” (Particularly, general questions you would have before seeing the details of a design.)
>   - Inverse meta-question: “If you were presenting a design, what questions would you dread being asked?” :-}
>     * Where are the bodies buried?
>  
> We are finishing the Checklist by next week and would like to include any good questions you may have that we have overlooked.   Realizing this is an imposition on your time, I am hoping some of you would be so kind as to spend just a few minutes to send questions or even question fragments.
>  
> --
> P.S.
> I am also looking for unusual failure scenarios to add to my collection, like those I’ve described in my series of “Murphy was an Optimist” presentations (e.g. http://www.rvs.uni-bielefeld.de/publications/DriscollMurphyv19.pdf).
>  
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE