[SystemSafety] Statistical Assessment of SW ......

Fri Jan 23 07:42:40 CET 2015

On 2015-01-21 14:15 , jean-louis Boulanger wrote:
> For software it's not possible to have statistical evidence.
> the failure is 1 (yes the software have fault and failure appear)

This argument came up again yesterday in a standards-committee meeting. It is usually attributed to
third party "engineers with whom I work", because nobody quite seems to claim they hold the view
themselves when I'm in the room :-) ....

So it might be worthwhile to adduce the proof - again. It's real short.

Suppose you have a piece of SW S which is deterministic. And S is also not perfect, so it outputs
right answers on some inputs and wrong answers on others. And S reverts to an initial state with no
memory of its previous behavior each time it produces its output.

Suppose the distribution of inputs to S has a stochastic character. That is, the input I is a random
variable. Then the output outS(I), which is a function of the input I, also has stochastic
character. A deterministic transformation of a random variable is itself a random variable.

Let us transform outS(I) further, deterministically. Define
CorrS(I) = 1 if outS(I) is correct
CorrS(I) = 0 if outS(I) is incorrect

Then, again, CorrS(I) also has a stochastic nature and is a random variable.

Thus, if the input to a piece of SW has stochastic nature, then so does the correctness behavior of
the SW.

QED.
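
For anyone who prefers to see it run, here is a minimal sketch of the argument in Python. Everything in it is a made-up illustration, not a real system: out_s is a hypothetical deterministic S that is supposed to square its input but carries a fault on multiples of 7, and the uniform input distribution over 0..999 is an assumption of the sketch.

```python
import random


def out_s(i: int) -> int:
    """Hypothetical deterministic SW S: meant to return i*i, faulty on multiples of 7."""
    if i % 7 == 0:
        return i * i + 1  # the (illustrative) fault
    return i * i


def corr_s(i: int) -> int:
    """CorrS(I): 1 if outS(I) is correct, 0 if it is incorrect."""
    return 1 if out_s(i) == i * i else 0


def estimate_p_correct(n_samples: int, rng: random.Random) -> float:
    """Draw inputs at random and return the observed fraction of correct outputs."""
    draws = [rng.randrange(1000) for _ in range(n_samples)]  # assumed input distribution
    return sum(corr_s(i) for i in draws) / n_samples


rng = random.Random(2015)  # fixed seed so the run is repeatable
p_hat = estimate_p_correct(100_000, rng)
print(f"estimated probability of correct output: {p_hat:.3f}")
```

S is deterministic throughout: corr_s(7) is always 0 and corr_s(8) is always 1. The observed sequence of successes and failures is nevertheless a sequence of Bernoulli trials, because the inputs are drawn at random. Under the assumed uniform distribution the exact probability of a correct output is 857/1000, and the estimate should land close to that.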

The only reasonable objection to this argument which I have heard is to dispute whether inputs have
a stochastic nature.

So, say you build a railway locomotive control system. The piece of track the locomotive runs on has
a fixed architecture, so the argument would run that the behavior of the locomotive is more or less
determined within certain parameters (whether signal X is red or green) and does not have a
stochastic nature. But various parameters such as the condition of the track, the nature of the load
on the locomotive, and other environmental conditions such as wind speed and weather (icy track, or
dry track, and when icy where the ice is) make it practically all but impossible to predict the
inputs to the control system. Besides, at design time the design does not involve designing to the
specific route the locomotive will run on. The designer is ignorant of the application. So the
inputs to the control system as known at design time have a stochastic nature if you are a Bayesian.

I would like to remark here, again, on a couple of incoherences in IEC 61508 and "derivative"
standards.

Something which executes a safety function must consist of both HW and SW, because SW alone cannot
take action. A HW-SW element which executes a safety function is assigned a reliability goal, which
is mostly encapsulated in the SIL. These reliability goals are the safety requirements. A
reliability goal is expressed in terms of probability of function failure per demand, or per unit
time. Suppose that the correct functioning of the HW-SW element E is functionally dependent on the
correct functioning of its SW S (which for most actuators it is). The standard requires one to
demonstrate that the reliability is attained (that the safety requirement is fulfilled).

How this is actually done must be something like the following.

We assume as above that the element E deterministically transforms its inputs. We define the
function CorrE as above. Given a distribution of inputs Distr(I), then the probability that E
functions correctly is given by
(Integral over Distr(I) of the function CorrE(I)) divided by (Integral over Distr(I) of the constant
1).

Notice that the probability of correct functioning, the safety requirement as laid down by IEC
61508, is dependent on Distr(I). Change Distr(I) and one can usually expect the probability to
change. (For example, let Distr(I) be the Dirac delta concentrated on a single input on which E
functions incorrectly. Then the probability that E functions correctly is 0.)

Yet in IEC 61508, and everywhere else, Distr(I) is not mentioned. Not once.

This is incoherent.
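
The dependence on Distr(I) can be made concrete with a small sketch. It assumes a discrete input space, so the integrals become weighted sums, and a hypothetical CorrE that fails on exactly two of ten inputs. The three distributions are invented for illustration.

```python
from fractions import Fraction


def corr_e(i: int) -> int:
    """Hypothetical CorrE: element E functions incorrectly on inputs 3 and 4 only."""
    return 0 if i in (3, 4) else 1


def p_correct(distr: dict) -> Fraction:
    """(Sum over Distr of CorrE) divided by (sum over Distr of the constant 1)."""
    total = sum(distr.values())
    return sum(w * corr_e(i) for i, w in distr.items()) / total


# Three assumed input distributions, given as unnormalised weights.
uniform = {i: Fraction(1) for i in range(10)}
skewed = dict(uniform)
skewed[3] = Fraction(91)          # input 3 now dominates the operational profile
point_mass = {3: Fraction(1)}     # discrete analogue of the Dirac delta on a bad input

print(p_correct(uniform))     # 4/5
print(p_correct(skewed))      # 2/25
print(p_correct(point_mass))  # 0
```

The element E is the same in all three cases. Only the distribution of inputs differs, and the probability of correct functioning goes from 4/5 to 2/25 to 0.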

One could fix it, maybe, by just assuming the uniform distribution on all inputs, by default. Or the
normal distribution. There may be reasons for this, but it is worth pointing out that Distr(I) in
real applications is almost never uniform or normal. If there is a distribution D for which it can
be argued that the real-world input distribution "almost always approximates D" then one could
choose D as the default instead.

The second incoherence is as follows. If the SW does not attain the safety requirement, then E does
not attain the safety requirement, under a certain plausible assumption, namely that if CorrS(I) =
0, then CorrE(I) is almost always 0. (That is, the HW may sometimes fortuitously compensate for
incorrect SW behavior, but mostly not.) Then in order for E to fulfil the safety requirement, it
must be the case that

(Integral over Distr(I) of the function CorrS(I)) divided by (Integral over Distr(I) of the constant
1) >= (Integral over Distr(I) of the function CorrE(I)) divided by (Integral over Distr(I) of the
constant 1) - epsilon

(epsilon is there to instantiate the "almost" part of the assumption).

So, since the safety requirement on E has a probabilistic calculation as a component, so must the
inherited safety requirement on S.
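
A small numerical sketch of that inequality, under invented assumptions: a discrete input space of twenty equally weighted inputs, a hypothetical S that fails on three of them, HW that happens to mask one of those SW failures, and HW that fails by itself on one further input. Here epsilon is taken to be the probability mass of the inputs on which S is incorrect but E is nevertheless correct.

```python
from fractions import Fraction

# All of these sets are hypothetical, chosen only to illustrate the inequality.
INPUTS = range(20)
SW_FAILS = {2, 5, 11}      # inputs on which CorrS(I) = 0
HW_COMPENSATES = {11}      # SW failures the HW fortuitously masks
HW_FAILS = {17}            # inputs on which the HW fails although the SW is correct


def corr_s(i: int) -> int:
    return 0 if i in SW_FAILS else 1


def corr_e(i: int) -> int:
    if i in HW_FAILS:
        return 0
    if i in SW_FAILS and i not in HW_COMPENSATES:
        return 0
    return 1


def prob(pred, distr: dict) -> Fraction:
    """Probability mass of the inputs satisfying pred, under weights distr."""
    total = sum(distr.values())
    return sum(w for i, w in distr.items() if pred(i)) / total


distr = {i: Fraction(1) for i in INPUTS}  # assumed uniform Distr(I)

p_s = prob(lambda i: corr_s(i) == 1, distr)
p_e = prob(lambda i: corr_e(i) == 1, distr)
epsilon = prob(lambda i: corr_s(i) == 0 and corr_e(i) == 1, distr)

print(p_s, p_e, epsilon, p_s >= p_e - epsilon)
```

With epsilon defined this way the inequality holds for any choice of sets and weights, since every input on which E is correct is either one on which S is correct or one counted in epsilon. The "almost always" assumption is what keeps epsilon small.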

Yet there is no requirement in IEC 61508 to substantiate that inherited safety requirement on S. The
only conditions placed on software safety requirements concern the techniques which are recommended
for use during development of S.

In particular, if, like Jean-Louis, you don't think that the execution of SW can have a stochastic
nature, you are thereby committed to the view that IEC 61508 and its derivatives are inherently
incoherent. It must be a difficult world to live in ......

PBL

Prof. Peter Bernard Ladkin, Faculty of Technology, University of Bielefeld, 33594 Bielefeld, Germany
Je suis Charlie
Tel+msg +49 (0)521 880 7319  www.rvs.uni-bielefeld.de