[SystemSafety] What do we know about software reliability?

Nick Tudor njt at tudorassoc.com
Tue Sep 15 11:00:36 CEST 2020


Peter wrote:

"And statistical testing is used in the UK nuclear industry for safety
critical systems,"

This is true, but note that the statement says safety-critical 'systems',
not 'software'.  While the UK standards do talk about 'on demand' and quote
some numbers, they are focused at the system level, not the software.
There is (at last) a distinct but still tacit acceptance that it is
nonsense to claim that software has a 'reliability' and hence can be treated
as if it fails in the same way that hardware does.  It is much better to
give a reason why the software should be trusted than to rely (!) on some
outdated research whose foundations are highly questionable.  This
argument-based view, rather than a statistical approach, is now gaining
acceptance.

As I have said before on this list, software has no wear-out mechanism, so
'software reliability' is a somewhat meaningless notion.  I was widely
abused (some even said bullied) for suggesting that software reliability
was not the right way of thinking about software assurance.  It is
therefore with some trepidation that I dive into this thread.



Nick Tudor
Tudor Associates Ltd
Mobile: +44(0)7412 074654
www.tudorassoc.com

77 Barnards Green Road
Malvern
Worcestershire
WR14 3LR
Company No. 07642673
VAT No: 116495996

www.aeronautique-associates.com


On Tue, 15 Sep 2020 at 09:46, Peter Bishop <pgb at adelard.com> wrote:

> On 14/09/2020 15:04, Martyn Thomas wrote:
>
> Why are you completely dismissing software reliability?
>
> Is it not the case that if you can tolerate a failure rate of once in 1000
> hours, 99% confidence through testing would take about 200 days to
> demonstrate (so long as the test environment is "sufficiently" like the
> future operating environment and you are able to detect every failure
> correctly)?
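
The 200-day figure is straightforward to reproduce. For a zero-failure
demonstration with a constant failure rate, the chance of seeing no failures
in t hours is exp(-rate * t); requiring that chance to fall below 1 - C gives
the needed test duration. A minimal sketch (my own illustration, not from the
thread):

```python
import math

def zero_failure_test_hours(rate_bound, confidence):
    """Hours of failure-free testing needed to claim, at the given
    confidence, that the failure rate is below rate_bound (per hour).
    Solves exp(-rate_bound * t) <= 1 - confidence for t."""
    return -math.log(1.0 - confidence) / rate_bound

hours = zero_failure_test_hours(1e-3, 0.99)   # once per 1000 hours, 99%
print(round(hours), round(hours / 24))        # ~4605 hours, ~192 days
```

Roughly 192 days of continuous failure-free testing, i.e. "about 200 days"
as stated above.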
>
> And statistical testing is used in the UK nuclear industry for safety
> critical systems, so it is not just abstract theory.
>
> Re your characterisation of confidence-based statistical testing on p. 153
> (with no reference), I do not think it is fair to dismiss it because "p
> can vary by orders of magnitude". The testing theory presumes a fixed
> operational profile and hence a constant probability of failure.
>
> There has also been some work on the impact of profile change on the bound
> that can be claimed.
>
>
> https://www.researchgate.net/publication/307555914_Deriving_a_frequentist_conservative_confidence_bound_for_probability_of_failure_per_demand_for_systems_with_different_operational_and_test_profiles
>
> BTW, re your summary of my paper on the same page, I think you missed the
> main point. This is a *predictive* theory to derive a worst-case bound
> for some time in the future, i.e.
>
> Given N faults, what is the worst possible reliability at some future time
> T?
> - it assumes fault fixing will occur during that time.
>
> You also only presented the theory for N=1, and you seem to assume that
> time T has already elapsed with zero failures (not a requirement for this
> model).
>
> It might have been better to reference the original worst-case bound
> version (which makes it clear that it is a long-term forward prediction):
>
>
> https://www.researchgate.net/publication/3152200_A_conservative_theory_for_long-term_reliability-growth_prediction
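
For readers without access to the paper: as I understand the headline result
of that worst-case theory, with N faults initially present and fault fixing
during operation, the failure rate after time T is bounded by N/(e*T),
regardless of the individual fault sizes. A rough sketch of that bound (my
paraphrase; check the paper for the exact statement and its assumptions):

```python
import math

def worst_case_rate_bound(n_faults, t_hours):
    """Conservative worst-case failure-rate bound after t_hours of
    operation with fault fixing, per my reading of the Bishop/Bloomfield
    long-term reliability-growth result: lambda(T) <= N / (e * T)."""
    return n_faults / (math.e * t_hours)

# e.g. 10 residual faults after 10,000 hours of fixing-as-you-go:
print(worst_case_rate_bound(10, 10_000))  # about 3.7e-4 per hour
```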
>
> Of course, the testing would have to be repeated following a change to the
> software, unless you have enough formality to show that the change cannot
> affect reliability.
>
> In specific circumstances, you can do better than this. Bev Littlewood's
> published papers provide strong evidence and a rich bibliography. Bev's
> paper on "How reliable is a program that has never failed?" offers a useful
> rule of thumb: after n hours of fault-free operation, there is about a
> 50% chance of a failure in the following n hours (subject to some obvious
> constraints).
>
> The difficulties rapidly escalate when you need 10^-4 or better at >90%
> confidence.
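
To put numbers on that escalation: under a zero-failure exponential testing
model, demonstrating 10^-4 per hour at 90% confidence needs
-ln(0.1)/10^-4 failure-free hours, roughly 2.6 years of continuous
operation (my arithmetic, not from the thread):

```python
import math

def demo_hours(rate_bound, confidence):
    # Failure-free hours needed so that exp(-rate_bound * t) <= 1 - confidence.
    return -math.log(1.0 - confidence) / rate_bound

h = demo_hours(1e-4, 0.90)
print(round(h), round(h / (24 * 365), 1))  # ~23026 hours, ~2.6 years
```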
>
> Martyn
> On 14/09/2020 14:14, SPRIGGS, John J wrote:
>
> In my experience, if Software Reliability is mentioned at a conference, at
> least one member of the audience will laugh, and if it is mentioned in a
> work discussion, at least one member of the group will get angry.
>
> Interestingly, some of the same people who say it is impossible to
> quantify software failure rates will set numerical requirements for
> Software Availability – if you get one of those, ask the Customer how (s)he
> wants you to demonstrate satisfaction of the requirement.
>
>
>
> John
>
> *From:* systemsafety <systemsafety-bounces at lists.techfak.uni-bielefeld.de>
> <systemsafety-bounces at lists.techfak.uni-bielefeld.de> *On Behalf Of *Derek
> M Jones
> *Sent:* 14 September 2020 12:54
> *To:* systemsafety at lists.techfak.uni-bielefeld.de
> *Subject:* [SystemSafety] What do we know about software reliability?
>
>
>
> All,
>
> What do we know about software reliability?
>
> The answer appears to be, not a lot:
>
> http://shape-of-code.coding-guidelines.com/2020/09/13/learning-useful-stuff-from-the-reliability-chapter-of-my-book/
>
> --
> Derek M. Jones Evidence-based software engineering
> tel: +44 (0)1252 520667 blog:shape-of-code.coding-guidelines.com
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE
> Manage your subscription:
> https://lists.techfak.uni-bielefeld.de/mailman/listinfo/systemsafety
>
>
>
>
> --
>
> Peter Bishop
> Chief Scientist
> Adelard LLP
> 24 Waterside, 44-48 Wharf Road, London N1 7UX
>
> Email: pgb at adelard.com
> Tel:  +44-(0)20-7832 5850
>
> Registered office: 5th Floor, Ashford Commercial Quarter, 1 Dover Place, Ashford, Kent TN23 1FB
> Registered in England & Wales no. OC 304551. VAT no. 454 489808
>
>