[SystemSafety] Qualifying SW as "proven in use"

Steve Tockey Steve.Tockey at construx.com
Mon Jun 17 14:06:32 CEST 2013


Can I suggest that even the term "proven in use" is itself hazardously misleading? I have a simple example, a trivial half page of code, that can't be fully tested (exhaustive input coverage) in the age of the known universe. In fact, even if one were able to execute 1 million test cases per second, and one had started the testing 14 billion years ago (the estimated "big bang"), one would still be about 10 to the 74th power MILLION YEARS short of completely testing this mere half page of code.
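The arithmetic behind this claim is easy to reproduce. As a purely hypothetical stand-in for the half page of code (the original example isn't given here), assume a routine taking five independent 64-bit inputs, i.e. 2^320 distinct test cases — the exact input space doesn't matter, only its order of magnitude:

```python
# Back-of-the-envelope check of the exhaustive-testing argument.
# The input space (2**320) is an illustrative assumption, not the
# actual example referred to in the message above.
SECONDS_PER_YEAR = 3.156e7
AGE_OF_UNIVERSE_YEARS = 14e9     # time since the estimated "big bang"
TESTS_PER_SECOND = 1e6           # the rate assumed in the message

cases = 2 ** 320                 # hypothetical: five 64-bit inputs
years_needed = cases / TESTS_PER_SECOND / SECONDS_PER_YEAR
shortfall_years = years_needed - AGE_OF_UNIVERSE_YEARS

print(f"years needed:  {years_needed:.2e}")
print(f"shortfall (millions of years): {shortfall_years / 1e6:.2e}")
```

With these assumptions the shortfall comes out on the order of 10^76 million years — the same flavour of number as the 10^74 in the message, though the exact exponent depends entirely on the assumed input space.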

As Boris Beizer has said about testing, "Our objective must shift from an absolute proof to a suitably convincing demonstration".

Calling something "proven in use" is patently absurd. Calling it "suitably demonstrated by adequate real-world use that the probability of serious defects is sufficiently low" would be much more appropriate (however one would need to seriously word-smith the second statement--smile).


-- steve


From: Martyn Thomas <martyn at thomas-associates.co.uk>
Reply-To: martyn at thomas-associates.co.uk
Date: Monday, June 17, 2013 4:13 AM
To: systemsafety at techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Qualifying SW as "proven in use"

I suggest that before any software is permitted as part of a safety-related system, certain mandatory processes should be put in place:


  1.  The processes employed by the company to record failures in operation and changes to the software should be independently audited and certified to have enough integrity to justify the claim that the software has been "proven in use".
  2.  These processes, once audited and agreed to be adequate, must remain in place and be subject to annual independent audit.
  3.  Every failure, change to the software, (or change to the operating environment that makes it differ from the operating environment forecast in the safety case) must be reported to an independent safety assessor who must certify whether or not the safety case remains valid in the light of the failure or change.
  4.  If the safety case is deemed no longer valid, the safety of the system must be assured by other means or the system must be withdrawn from service until it can be shown once again to meet the safety criteria.

These steps are proposed on the basis that they (a) provide assurance that the evidence for "proven in use" is robust, (b) provide assurance that if in-service use or subsequent changes invalidate the safety case then this will not be concealed, and (c) put the risk on the system owner that the "proven in use" claim turns out to be false.

Martyn


On 17/06/2013 11:32, Peter Bernard Ladkin wrote:
Folks,

there is a significant question of how SW can be qualified as "proven in use" according to IEC 61508:2010. There is a judgement in some quarters (notably the German national committee) that the criteria in IEC 61508:2010 are inappropriate. I think it wouldn't be out of place to say that many in the IEC 61508 Maintenance Teams find the current criteria unsatisfactory in one way or another.

We in Germany have been discussing the issue and possible solutions for a couple of years, and recently the discussion has gone international. There seems to be a general feeling that qualifying SW statistically via the approach given by the exponential failure model is not practical, because the data requirements are overwhelming - it is regarded by most as implausible that companies will have the requisite data, to the requisite quality, even for SIL 2. But even if you qualify your SW for SIL 2 or higher without such data, at some point some data will exist, and people will use that data as evidence that the original assessment was accurate. But what sort of evidence does it actually offer? Probably a lot less than you might be inclined to believe.
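The scale of those data requirements can be sketched with the standard zero-failure calculation: under an exponential (constant-rate) failure model, demonstrating at confidence 1 - a that the dangerous failure rate is below some target rate requires roughly -ln(a) divided by that rate in failure-free operating hours. The rate bands below are the continuous-mode targets from IEC 61508; the calculation itself is an illustrative sketch, not anything taken from the standard:

```python
import math

def hours_required(target_rate_per_hour, confidence=0.95):
    """Failure-free operating hours needed to demonstrate, at the given
    confidence, that the failure rate does not exceed the target --
    assuming an exponential (constant-rate) failure model and zero
    observed failures."""
    return -math.log(1.0 - confidence) / target_rate_per_hour

# IEC 61508 continuous-mode dangerous-failure-rate targets (upper bounds,
# per hour): to claim a SIL, the rate must be shown below the bound.
sil_targets = {1: 1e-5, 2: 1e-6, 3: 1e-7, 4: 1e-8}

for sil, rate in sil_targets.items():
    h = hours_required(rate)
    print(f"SIL {sil}: ~{h:.1e} failure-free hours (~{h / 8766:.0f} years)")
```

At 95% confidence this is roughly 3/rate hours of failure-free operation: about three million hours (several hundred operating years) just for SIL 2, which is why "companies won't have the requisite data" is the prevailing view.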

There seems to me to be a lack of examples where things can go wrong - at least a lack of examples specifically adapted to assessments according to IEC 61508:2010. So I wrote one up - fictitious but I hope still persuasive - to illustrate what (some of) the assurance issues are. I hope it can aid the debate.

http://www.rvs.uni-bielefeld.de/publications/WhitePapers/LadkinPiUessay20130614.pdf

PBL


