[SystemSafety] Software reliability (or whatever you would prefer to call it)

Fri Mar 6 11:55:34 CET 2015

Martyn suggests that we put the language to one side.

My take on the core problem.

IEC 61508-7 [2010] Annex D "provides initial guidelines on the use of a
probabilistic approach to determining software safety integrity for
pre-developed software based on operational experience. This approach is
considered particularly appropriate as part of the qualification of
operating systems, library modules, compilers and other system software."

In effect, I select an appropriate set of test data, run my system for a
long time (or run lots of systems for a short time) and conclude - if no
failures are detected - that the system is safe.  

The longer I test, the higher the SIL that can be assigned to
the component being evaluated.
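
To put rough numbers on "the longer I test", here is a minimal sketch of the
usual zero-failure argument. It is my own illustration, not text from the
annex, and it rests on assumptions I should state loudly: a constant failure
rate (exponential model), test conditions that match the real operational
profile, and perfect failure detection. The function name and the table of
SIL limits are mine; the limits are my reading of the upper bounds of the
high-demand/continuous-mode bands (dangerous failures per hour) in
IEC 61508-1.

```python
import math


def required_failure_free_hours(target_rate_per_hour, confidence):
    """Failure-free operating hours needed to claim, at the given one-sided
    confidence, that the failure rate is below target_rate_per_hour.

    Assumes a constant failure rate: with zero failures in T hours,
    P(no failure) = exp(-rate * T), so T = -ln(1 - confidence) / rate.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if target_rate_per_hour <= 0.0:
        raise ValueError("target rate must be positive")
    return -math.log(1.0 - confidence) / target_rate_per_hour


# Assumed upper limits of the continuous-mode SIL bands, per hour.
SIL_UPPER_LIMIT_PER_HOUR = {1: 1e-5, 2: 1e-6, 3: 1e-7, 4: 1e-8}

HOURS_PER_YEAR = 8760.0

for sil, rate in sorted(SIL_UPPER_LIMIT_PER_HOUR.items()):
    hours = required_failure_free_hours(rate, confidence=0.95)
    years = hours / HOURS_PER_YEAR
    print(f"SIL {sil}: {hours:.3g} failure-free hours (~{years:.3g} unit-years)")
```

At 95% confidence the requirement is about 3 divided by the target rate, so
each step up in SIL multiplies the failure-free time by ten. Nothing in this
arithmetic looks inside the software.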

In my book, this is Black-Box testing.

If we revise this annex as Peter proposes, then we may be able to help
people to select more appropriate test data (and this may be an improvement)
- but this will still be Black-Box testing.

If we can't avoid this annex altogether (and I'm sure that Bertrand is
right about this), then we should - surely - be able to require some
additional "White-Box" assessments, such as code reviews, design reviews,
etc. (in line with the rest of the standard).

If we can achieve this, I would sleep more easily.

Michael.

-----Original Message-----
From: systemsafety-bounces at lists.techfak.uni-bielefeld.de
[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of
Martyn Thomas
Sent: 06 March 2015 09:56
To: systemsafety at lists.techfak.uni-bielefeld.de
Subject: [SystemSafety] Software reliability (or whatever you would prefer
to call it)

I'm puzzled by much of this discussion. Consider this common example:

A company creates a software package and submits it for beta testing by a
group of users. Assume that the package reports how often it is used and for
how long, and the users report all errors they encounter. Assume there is a
single instance of the software on a server that all the users use.

The company corrects some of the errors that are reported.

The company calculates some measure of the amount of usage before failure.
Call it MTBF.

The MTBF is observed to increase.
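
As a rough sketch of that calculation: the figures below are invented for
illustration and come from no real beta programme, and the function and
variable names are hypothetical. MTBF is taken here simply as reported usage
hours divided by reported failures in each period, which assumes every
failure is reported and usage is logged accurately.

```python
def mtbf(usage_hours, failures):
    """Mean usage time between failures for one observation period.

    Returns None when no failures were seen: the data then give only a
    lower bound on MTBF, not an estimate.
    """
    if usage_hours < 0 or failures < 0:
        raise ValueError("usage hours and failure count must be non-negative")
    if failures == 0:
        return None
    return usage_hours / failures


# Hypothetical beta periods: (label, logged usage hours, reported failures).
periods = [
    ("beta-1", 1200.0, 24),
    ("beta-2", 1500.0, 15),
    ("beta-3", 2000.0, 8),
]

estimates = [(label, mtbf(hours, failures)) for label, hours, failures in periods]

for label, value in estimates:
    print(f"{label}: MTBF = {value} hours")

# True if each period's estimate exceeds the previous one.
values = [value for _, value in estimates]
increasing = all(a < b for a, b in zip(values, values[1:]))
print("MTBF increasing across periods:", increasing)
```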

What word shall we use to describe the property of the software that is
increasing?

I'd call it "reliability". If you would, too, then how can software
reliability not exist?

I don't mind if you want to use a different word to describe the property.
Let's just agree one, do a global replace in the offending standards and
move on ...

... to discussing a practical upper bound on the "reliability" that can be
assessed in this way - and on the assumptions that should be made explicit
before using any such assessment as a prediction of future performance.

Martyn

_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE