[SystemSafety] Safety Culture redux (David Green)

Thu Feb 22 10:22:28 CET 2018

I’d recommend Avizienis, Laprie and Randell’s 2001 paper ‘Fundamental
Concepts of Dependability’.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.6074

On 22 February 2018 at 8:14:23 pm, David Crocker (dcrocker at eschertech.com)
wrote:

> On a related subject, does anyone have a good definition of what
> constitutes a software error/fault/bug etc. that is widely applicable,
> not just to critical systems? The definition I use is "failure to meet
> the reasonable expectations of the user", where what is reasonable is
> influenced by the documentation (including requirements specification if
> available, user manual etc.) or lack of it. But perhaps one of you has a
> better definition.
>
> David Crocker, Escher Technologies Ltd.
> http://www.eschertech.com
> Tel. +44 (0)20 8144 3265 or +44 (0)7977 211486
>
> On 22/02/2018 08:50, Chris Hills wrote:
>
> Hi,
>
> I don't mind what words we eventually use (error, failure, fault, defect) as
> long as everyone gets on board with "errors not bugs" and writes about it
> during 2018. Get the discussion going and a move away from "bug" and the
> cosy expectancy of them.
>
> Once we use error/defect etc it shifts the emphasis from the expected to
> something that needs sorting. It also puts the ownership back with the
> programmers. Once they have to fix "errors" and are therefore more careful
> it will work back up the process.
>
> As it is now programmers are generally happy to work with incomplete and
> ambiguous requirements and designs. They fill in the blanks. If we get a
> sea change in the terminology from bugs to error/defect etc. they hopefully
> will want to stop being associated with "error" and will wherever
> possible start demanding that incomplete designs or requirements are fixed. They
> won't want to carry the can for someone else's errors.
>
> It's not going to happen overnight but let's get the discussion going in
> 2018 and start the change. Start writing articles, blogs, papers etc. on "errors
> not bugs" and take it from there. Even if you end up with defect or failure,
> get the conversation going and stamp out bugs. If more of you do it along
> with those of us who have started the ball rolling it might actually work.
>
> If not now, when? After you or your family are killed by a friendly
> software "bug"?
> Do it for the children. :-)
>
>
> -----Original Message-----
> From: systemsafety [mailto:
> systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of Peter
> Bernard Ladkin
> Sent: Thursday, February 22, 2018 6:52 AM
> To: systemsafety at lists.techfak.uni-bielefeld.de
> Subject: Re: [SystemSafety] Safety Culture redux (David Green)
>
>
>
> On 2018-02-22 00:54 , Steve Tockey wrote:
>
> IEEE already has a recommended vocabulary:
>
> Incident = any difference between the observed result and the expected
> result
>
> Failure = it has been determined that the observed result is incorrect
>
> Fault or Defect = the aspect of the code that caused the incorrect result
>
>
> If adequate vocabulary already exists, why try to invent new terms?
>
> Because there are things wrong with this series of definitions.
>
> First, an incident in most people's usage is an event. With nothing
> counterfactual about it. It just is (or was). A "difference" is not an
> event, but a contrastive feature of two things, one of which is
> counterfactual. So the definition of "incident" here confuses an event
> (what did happen) with its features (that one of the aspects contrasts with
> what was expected).
>
> Contrastive description is common and useful, but it is better not to
> conflate an event with its description, for the following reasons amongst
> others. Obviously, in order to individuate an event you do so with a
> description, because that is in part how language works. A description (if
> it fits) picks out an aspect of an event. But that aspect may be
> superficial, and not key. If someone proffers a superficial description,
> you want a second person to be able to say "that is not all of what went
> wrong here, that is just a part of it". Whereas, with this definition, the
> second person is not refining what the first said by identifying a more
> significant aspect, they are literally describing a different incident. You
> have as many different incidents as you do aspects, and the set of aspects
> is not usually very well bounded. William of Ockham had something to say
> about that.
>
> It is usual to designate a complex-system incident or accident as an
> event, one event. But according to the IEEE definition, it becomes a
> plethora of difference specifications.
>
> Second, the definition of "failure" requires a "determination", which is a
> human act. If the system is not sociotechnical, then failure is an
> objective matter without a social component. Further, I think we can bet
> that the IEEE does not say what a "determination" consists in. Continuing,
> the definition makes essential use of the notion of "correct". Is that
> defined somewhere? "Correct" and "incorrect" are both notions which involve
> a comparison between a result and a norm. What norm would that be? "What
> the system should have done"? The problem there is the word "should", which
> has a moral connotation. "What person X thinks would have been a more
> appropriate outcome"? How do you pick person X? "What most people dealing
> with the system agree would have been a more appropriate outcome"? How do
> you select that crowd? We might like to say "What the system specification
> says happens in that case". But that supposes the system has a
> specification, and that specification is adequate to determine how the
> system behaves in this case. One suspects the definition was formulated to
> finesse that need.
>
> Third, the common idea of "fault" is "<certain system aspects> which
> caused the failure" (with "<certain system aspects>" to be determined). It
> was likely causally contributory to the failure that the system received
> certain inputs - the definition of "fault" here entails that the presence
> of those inputs is part of the fault. Should that be so? Intuitively, we
> would say no: (a) if the inputs were inappropriate, they should have been
> filtered and the lack of filtering was part of the fault, not the inputs
> themselves; (b) if the inputs were appropriate, then it is the way the
> system processed them that is usually taken to be the fault, not the inputs
> themselves.
>
> Can we fix these issues easily? Sure. I recommend, as usual, the
> definitions in
> https://causalis.com/90-publications/99-downloads/DefinitionsForSafetyEngineering.pdf
>
> PBL
>
> Prof. Peter Bernard Ladkin, Bielefeld, Germany MoreInCommon Je suis
> Charlie
> Tel+msg +49 (0)521 880 7319 www.rvs-bi.de
>
>
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE