[SystemSafety] Critical Design Checklist

Thu Aug 29 03:36:11 CEST 2013

Hi Kevin,

I've run into similar definitional issues with what constitutes a
functional failure in FHAs and the taxonomy (usually fairly simple) of
failure that is used. My conclusion was that a lot (perhaps a majority?) of
FHAs suffer from incompleteness because their definition of what is a
function, and therefore its specification, is incomplete. Which may, or may
not, matter given the level of abstraction you're working at. Wrote a post
here, http://wp.me/px0Kp-16s, if you're interested.

On Thu, Aug 29, 2013 at 3:44 AM, Driscoll, Kevin R <
kevin.driscoll at honeywell.com> wrote:

>  > Is "it" (the revised taxonomy) just re-inventing the wheel, or is
> there something else going on?
>
> As one who abhors re-inventing the wheel (particularly when the result may
> have some corners on it), we don’t do this unless we need to.
>
> There are a number of problems with trying to make such a taxonomy.
>
> One is the trade between making the fault classes as broad as possible (to
> make sure we have covered as many faults as possible) versus making the
> fault class definitions concise and having useful properties (e.g., being
> able to map appropriate fault avoidance or fault tolerance techniques to a
> particular fault class).
>
> Another problem is trying to simplify the high dimensionality of this
> space.  When reducing the dimensionality by aggregating fault classes into
> supersets, different hierarchies can result.  For example, should faults
> first be divided into Value faults and Timing faults with each having a
> subset that is Byzantine?  Or, should Byzantine be the superior set with
> Value and Timing subsets?  Whichever way this is done, it seems there is
> always some lack of orthogonality.
>
> There is no consensus on how a fault taxonomy should be constructed.  When
> a group of people is assembled for some purpose, in which individuals
> disagree on the taxonomy, some compromise taxonomy usually is created
> (often specific to the task at hand).  There is also a lack of consensus on
> a lot of the terminology.  For example, I disagree with the use of
> "arbitrary" as a synonym for or a description of “Byzantine” (need to edit
> the Wikipedia "Byzantine fault tolerance" page someday).  I don't think
> "arbitrary" should be used for a fault set that doesn't include power
> source overvoltage, shrapnel from exploding capacitors, common mode
> failures due to compiler/linker or synthesizer bugs, ...
>
> Even the basic definitions of fault, failure, and error are not completely
> agreed.  I think the definitions created by IFIP WG10.4 are the best
> published and should be the ones generally used.  However, I think the term
> "error" should apply only to the difference in state for those elements of
> a device that are intended to hold state.  I vehemently disagree with those
> (including other members of WG10.4) who use "error" as the difference in
> any state of the device, including a structural state.  That is, I would
> not classify a broken wire as an "error".
>
> *From:* systemsafety-bounces at lists.techfak.uni-bielefeld.de [mailto:
> systemsafety-bounces at lists.techfak.uni-bielefeld.de] *On Behalf Of *Robert
> Schaefer at 300
> *Sent:* Tuesday, August 27, 2013 14:06
> *To:* Peter Bernard Ladkin
> *Cc:* systemsafety at lists.techfak.uni-bielefeld.de
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> "It never seems to be exactly what we want."
>
> Is "it" (the revised taxonomy) just re-inventing the wheel, or is there
> something else going on?
>
>  ------------------------------
>
> *From:* Peter Bernard Ladkin <ladkin at rvs.uni-bielefeld.de>
> *Sent:* Tuesday, August 27, 2013 1:01 PM
> *To:* Robert Schaefer at 300
> *Cc:* Driscoll, Kevin R; systemsafety at lists.techfak.uni-bielefeld.de
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> On 27 Aug 2013, at 18:07, Robert Schaefer at 300 <schaefer_robert at dwc.edu>
> wrote:
>
>  Would a complete taxonomy even be possible?
>
> As the possibility of fault-contexts-in-the-world appears to be infinite
> or near infinite, wouldn't the number of fault types be near infinite as
> well?
>
> Since "fault type" is a human classification, it is guaranteed not to be
> anywhere near infinite, but indeed quite finite. Perrow has a
> classification he called "DEPOSE". That has just six categories, one for
> each letter.
>
> Whether it does what one wants it to do is another question, as Kevin
> points out.
>
> I would also propose that fault is also a human classification (since you
> talk about a fault in language, no matter how precise, your words may have
> another instance which fulfil them, and it is the words/concepts which
> define what you are talking about) whereas failure has at least a
> time/space stamp. Ideally. Unfortunately, in the current state of the (lack
> of) art, I think failure might often be lacking objectivity too, if a
> specification exists and is ambiguous.
>
> PBL
>
> Prof. Peter Bernard Ladkin, University of Bielefeld and Causalis Limited
>
>    ------------------------------
>
> *From:* systemsafety-bounces at lists.techfak.uni-bielefeld.de <
> systemsafety-bounces at lists.techfak.uni-bielefeld.de> on behalf of
> Driscoll, Kevin R <kevin.driscoll at honeywell.com>
> *Sent:* Tuesday, August 27, 2013 11:28 AM
> *To:* Matthew Squair
> *Cc:* systemsafety at lists.techfak.uni-bielefeld.de
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> > such a list should possess orthogonality, decidability, atomicity,
> criticality and a rationale.
>
> Addressing orthogonality (and completeness), the list should have a proper
> taxonomy.  But, that’s hard to do.
>
> Internally, we keep revisiting the creation of a taxonomy for fault types,
> even though much has been published on the subject.   It never seems to be
> exactly what we want.
>
> *From:* systemsafety-bounces at lists.techfak.uni-bielefeld.de [mailto:
> systemsafety-bounces at lists.techfak.uni-bielefeld.de]
> *On Behalf Of *Matthew Squair
> *Sent:* Tuesday, August 27, 2013 04:12
> *To:* martyn at thomas-associates.co.uk
> *Cc:* systemsafety at lists.techfak.uni-bielefeld.de
> *Subject:* Re: [SystemSafety] Critical Design Checklist
>
> Not so much a list but a comment that the items in such a list should
> possess orthogonality, decidability, atomicity, criticality and a
> rationale.
>
> The criticality should address Martyn's 'and what then' comment.
>
> On Tuesday, 27 August 2013, Martyn Thomas wrote:
>
> On 26/08/2013 21:37, Driscoll, Kevin R wrote:
>
>  For NASA, we are creating a Critical Design Checklist:
>
> •       *Objective*
>
> -     *A checklist for designers to help them determine if a
> safety-critical design has met its safety requirements*
>
> Kevin
>
> For this purpose, I interpret your phrase "safety requirements" for a
> "safety-critical design" as meaning that any system that can be shown to
> implement the design correctly will meet the safety requirements for such a
> system in some required operating conditions.
>
> Here's my initial checklist:
>
> 1. Have you stated the "safety requirements" unambiguously and completely?
> How do you know? Can you be certain? If not, what is your confidence level
> and how was it derived?
> 2. Have you specified unambiguously and completely the range of operating
> conditions under which the safety requirements must be met? How do you
> know? Can you be certain? If not, what is your confidence level and how was
> it derived?
> 3. Do you have scientifically sound evidence that the safety-critical
> design meets the safety requirements?
> 4. Has this evidence been examined by an independent expert and certified
> to be scientifically sound for this purpose?
> 5. Can you name both the individual who will be personally accountable
> if the design later proves not to meet its safety requirements and the
> organisation that will be liable for any damages?
> 6. Has the individual signed to accept accountability? Has a Director of
> the organisation signed to accept liability?
>
> Of course, there is a lot of detail concealed within these top-level
> questions. For example, the specification of operating conditions is likely
> to contain detail of required training for operators, which will also need
> to be shown to be adequate.
>
> But there's probably no need to go into more detail as you will probably
> get at least one answer "no" to the top six questions.
>
> What will you do then?
>
> Regards
>
> Martyn
>
>
>  _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE
>
>

-- 
*Matthew Squair*
Mob: +61 488770655
Email: MattSquair at gmail.com