[SystemSafety] What do we know about software reliability?

Olwen Morgan olwen at phaedsys.com
Fri Sep 18 12:26:44 CEST 2020


On 15/09/2020 15:07, Martyn Thomas wrote:
>
> Software in its operating environment does degrade over time.
>
>   * What was fit for purpose one year no longer is the year following.
>   * as vulnerabilities are discovered, shared and exploited, failure
>     rates increase
>   * as software is maintained to fix known errors, the fault density
>     may steadily increase because the maintenance degrades the
>     architecture and more defects are introduced. (I have seen this
>     happen gradually to major software systems in my career).
>
> The failure rates can be determined statistically within 
> scientifically sound confidence levels. To me, "reliability" carries 
> the right message. It may be an imperfect analogy but many words are.
>
> Martyn
>
>
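Martyn's point about statistically sound confidence levels can be made concrete. If a system survives n independent demands with zero observed failures, the exact one-sided 95% upper confidence bound on its per-demand failure probability is 1 - 0.05^(1/n), which is approximately 3/n (the "rule of three"). A minimal sketch - the function name is mine, not from any library:

```python
def upper_bound_95(n_demands, n_failures=0):
    """Exact one-sided 95% upper confidence bound on per-demand failure
    probability in the zero-failure case: p_upper = 1 - 0.05**(1/n).
    This is approximately 3/n, the so-called 'rule of three'."""
    assert n_failures == 0, "this sketch covers the zero-failure case only"
    return 1.0 - 0.05 ** (1.0 / n_demands)

# 1000 failure-free demands justify, at best, ~3e-3 per demand at 95%
print(upper_bound_95(1000))
```

So even a thousand failure-free demands only support a claim of roughly three failures per thousand demands at the 95% level, which is why the modelling question raised below matters so much.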
If we wish to quantify s/w reliability in probabilistic terms, then we 
are talking about the mean time between instances of presentation to the 
system of data that it should handle correctly but does not. To 
calculate this systematically (let alone accurately) one needs a 
probabilistic model of the input data sources. Regardless of the rigour 
with which the software has been constructed, without such modelling of 
input sources there is no *soundly-based* underpinning for any claim of 
s/w reliability in terms of quantitative MTBF. Also, even if you have 
modelled the input sources, you still need a decision procedure to 
determine whether any feasible input will actually result in a run-time 
error or incorrect behaviour or outputs. The only soundly-based way to 
do this is formal verification.
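To make the two ingredients above concrete: an MTBF estimate in this sense needs (a) a probabilistic model of the input sources and (b) a decision procedure for whether each input is mishandled. A minimal Monte Carlo sketch - `process`, `spec`, the injected bug and the Gaussian input model are all hypothetical examples, not anyone's real system:

```python
import random

def spec(x):
    """Oracle: the behaviour the system *should* exhibit (hypothetical)."""
    return abs(x)

def process(x):
    """System under test (hypothetical); mishandles a rare input region."""
    return x if x < -9000 else abs(x)   # bug fires only when x < -9000

def estimate_mtbf(n_inputs, seed=42):
    """Monte Carlo estimate of mean inputs between failures, under an
    ASSUMED Gaussian input-source model; model and oracle are made up."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_inputs):
        x = rng.gauss(0.0, 4000.0)      # (a) the probabilistic input model
        if process(x) != spec(x):       # (b) the decision procedure
            failures += 1
    return n_inputs / failures if failures else float("inf")

print(estimate_mtbf(100_000))
```

Note that sampling like this only estimates the rate under the assumed input model; as argued above, a trustworthy figure needs the oracle comparison discharged by formal verification over the whole feasible input space, not by testing.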

I think MTBF is a valid measure of reliability for s/w, but one needs to 
recognise that it can be soundly based only by use of formal modelling 
and proof, and that the techniques for doing this for s/w, though 
conceptually the same as for physical engineering, differ radically in 
the kinds of models used. Sadly, even where I have worked on projects 
using CbyC, I have never been aware that probabilistic input source 
models were also in use - except in the odd few cases where, pissing 
against the north wind, I was trying to do it myself.

As is so often the case in S/E, the conceptual basis for getting things 
right exists but is typically unrecognised, denied or simply neglected. 
We don't need more argument about what software reliability is. MTBF is 
the right concept, but we need to use the self-evidently sound methods 
that are available to us to estimate it rigorously.

All this is quite apart from issues of how to model factors that could 
be described as s/w degradation. To deal with vulnerabilities, one needs 
quantitative threat and risk modelling. Regression verification provides 
some mitigation against maintenance-induced faults but also needs to be 
accompanied by some means of assessing knock-on effects on other 
dependability attributes. As regards becoming unfit for purpose, 
performance often falls off as system usage ramps up. Techniques are 
available to model all of these things but in my experience they are 
rarely used with the rigour that would enable formal verification to 
produce trustworthy numerical estimates of the effects on s/w reliability.
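Short of a full regression proof, the regression-verification idea above can at least be approximated by differential testing: checking that the maintained version agrees with its predecessor on inputs drawn from the (again, assumed) input model. A minimal sketch - `v1` and `v2` are hypothetical stand-ins for an old and a refactored release:

```python
import random

def v1(x):
    """Previous release of some function (hypothetical)."""
    return x * x

def v2(x):
    """Maintained release (hypothetical); refactored, intended equivalent."""
    return x ** 2

def regression_check(n=10_000, seed=1):
    """Inputs on which the two versions disagree, drawn from an ASSUMED
    uniform integer input model; a regression *proof* would cover them all."""
    rng = random.Random(seed)
    return [x for x in (rng.randrange(-10**6, 10**6) for _ in range(n))
            if v1(x) != v2(x)]

print(regression_check())   # [] means no divergence found in this sample
```

As with the MTBF estimate, an empty list here is only absence of evidence under one sampled input model; formal regression verification is what would turn it into evidence of absence.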

Many years ago, I allowed someone to persuade me that programming is 
engineering and I've referred to myself as a software engineer ever 
since. Latterly, I'm coming back to the idea that software engineering 
should be regarded as a branch of applied mathematics, if only to draw a 
red line to mark the trespasses of the mathematically illiterate 
lumpenengineeriat.

Olwen

