[SystemSafety] Degraded software performance [diverged from Fault, Failure and Reliability Again]

Matthew Squair mattsquair at gmail.com
Fri Mar 6 08:18:21 CET 2015


So after all that discussion did we resolve anything?

Matthew Squair

MIEAust, CPEng
Mob: +61 488770655
Email: Mattsquair at gmail.com
Web: http://criticaluncertainties.com

On 6 Mar 2015, at 12:38 pm, Les Chambers <les at chambers.com.au> wrote:

  Nick

In support of your ideas I offer the following:

Peter's paper on software reliability presents a theoretical paradigm (or
pattern of thinking). If it included experimental evidence that it
faithfully explained the observed behaviour of software in the real world
it could be classed as a practical paradigm and as such become a tool of
engineering.  It clearly has not reached that stage. In fact I would shrink
from rating it with the geocentric theories of Aristotle and Ptolemy, that
had the sun revolving around the earth. That paradigm did have some
utility, in that it did explain some aspects of the motions of the
planets.  There were anomalies however. Observations that weren't explained
by the accepted paradigm.  Even though the heliocentric model existed as
early as 300 BC, (refer, Aristarchus of Samos) astronomers practiced wilful
denial of geocentric anomalies, encouraged by religious faith and the
threat of indictment as a heretic.  It took roughly 1800 years for
Copernicus and fellow travellers, Johannes Kepler and Galileo Galilei to
announce that the King had no clothes and the heliocentric paradigm
provided a better explanation. On the order of Pope Paul V, Galileo seemed
to suffer most for his honesty, being dragged before an inquisition and
forced to recant. But more on this later.

Back on subject, I assert that:

1. Standards are tools of engineering (they are definitely not a
publishing vehicle for scientific theories).

2. Given that there does not exist a paradigm that adequately explains the
observed behaviour of software in the reliability context, in a broad
enough set of computing environments, all references to theoretical
candidates should be deleted from 61508.



As I have said before, anyone developing or maintaining a standard should
be solidly focused on its audience and end use. This standard is attached
to contracts. When this happens development organisations must factor in
the cost of compliance. This becomes difficult to estimate as practitioners
spend endless meetings arguing the point with clients over immature
theoretical paradigms, not to mention the expensive and unproductive games
that are played around claiming compliance with clauses that offer more of a
slippery idea than a hard validatable fact/rule/process. This adds cost to
complex systems development and reduces industry productivity while adding
zero value to the end product. The 61508 team should reflect on this.



Further, in many of the posts on this subject, I have noted a degree of
cringing before Peter's mathematical elegance.  Sentences such as "I am not
a mathematician ... I'm not qualified to judge ... ". I encourage these
respondents to think like engineers.  It is not necessary to understand the
details of Peter's theory (I lost the ability to prove Maxwell's equations
decades ago). All we need from him is a set of experimental results that
prove PBL Paradigm X (described in terms of implementable rules, equations
and processes) adequately explains the behaviour we observe in the real
world (frankly I don't care how many papers he's published in the past).
This is the standard engineering attitude - it's harsh but fair.



RE: The IEC's bureaucratic inability to delete bad elements of existing
standards:

One solution is to open 61508 to broader review and comment. At the SCSC
symposium in Bristol I asked Peter if there was a mechanism for
practitioners (such as I) to comment on drafts of 61508 - as I have in the
past with IEEE standards. I received two responses, one verbal and the
other nonverbal. Both of them gave me grave concern.

1. The verbal: "The IEC believes it has a process that works ... Comments
on the standard are confined to committee members." (or words to that
effect)

2. The nonverbal: he looked at me as though I had just broken wind.  The
sort of look you get from the functionaries in the Paris Ritz when you
approach the Hemingway Bar in a pair of jeans and dusty shoes - unwashed
and unworthy.



My point is this: the 61508 maintenance effort must not be the domain of an
exclusive club of wise men.  It must be opened for comment from its end
users - engineers. Further, the makeup of the committee should be biased in
favour of engineers as opposed to scientists  (is this true now? How many
people on that committee have ever had direct responsibility in any
capacity for any element of a safety critical system?  Especially
responsibility for making the case to the client that a development
organisation has complied).

If it's broken, engineers should have the power to fix it, simply because we
are its end users.



RE: Reducing 61508 to practice

In reading the posts on this subject I have the sense of wilful denial not
unlike that of the astronomers of old. It's been stated that IEC 61508-7
Annex D is anomalous (just like the geocentric model of astronomy), but we
are powerless to fix it for political (religious?) reasons.

Who runs the IEC: the Pope? What do you fear: excommunication? I say to the
people on the committee, "Try harder. Do what is right and damn the
torpedoes."

For normative guidance: If cull we must, let's apply a test. There are two
simple questions you could ask about any element of the standard.

1. Is this paradigm supported by empirical evidence?

2. Is this paradigm supported by the engineering community?



And as for Peter; press on mate.  We could be witnessing the birth of a
novel theory. Have courage, ignore the critics. Remember Galileo as they
dragged him out of the inquisition toward internal exile, mumbling "Eppur
si muove" (and yet it moves). But please reconsider including theoretical
paradigms in a standard that cries out for practical implementability.
Instead, get back to us when you have some compelling experimental evidence
(engineers are a patient lot, we're capable of waiting 1800 years).

... And, in the meantime, consider pushing for:

1. Deletion of Annex D and any of its theoretical ilk.

2. Broader industry review of 61508 drafts so the people who suffer from
its shortcomings can at last have a say.

3. A reduction in price to the point where an average Joe can afford to
read it.



Cheers

Les



*From:* systemsafety-bounces at lists.techfak.uni-bielefeld.de *On Behalf
Of* Nick Tudor
*Sent:* Thursday, March 5, 2015 7:42 PM
*To:* Peter Bernard Ladkin
*Cc:* The System Safety List
*Subject:* Re: [SystemSafety] Degraded software performance [diverged from
Fault, Failure and Reliability Again]



Peter et al



I am going to close my input to this thread with the following:



This thread was started based upon the request for feedback on the idea of
software reliability; I think that request has been fulfilled in spades and
is a good use of this forum. There is a plan to update a standard IEC61508
with material about how one might use software reliability in safety
systems.  Standards are supposed to represent the consensus of the
community and it has been reported by others on this list that many
standards do not recognise this approach.  Some of these standards claim to
be based upon the template of IEC61508, EN50128 being a good example and
ISO26262 being a bad one.  Neither of these recognise 'software
reliability' and, as I and others have pointed out, the aerospace standard
DO-178C and its predecessors don't either.  These bodies of work represent
quite a consensus in the community that there is no recognised basis for
the use of software reliability.  While I know that, for example, the UK
H&SE have been briefed by consultants on the notion that such a phenomenon
exists, that notion has, in my view and that of many others in the UK, held
back the sensible use of software in systems.  It continues to hamper efforts to
update aged and ageing analogue nuclear power systems (for which there are
no like-for-like analogue parts manufactured any more) with digital
systems.  This is costing the industry and, much more importantly, the
taxpayer, and is not necessarily helping make the systems safer, only more
costly.



I was fortunately able to support the working groups for DO-178C and
continue to do so.  Like many in industry, I can afford neither the time nor
the money to support more than one.  It is therefore incumbent upon those who
can support such fora to take note of the wider consensus.



Standards are there to help industry to do many things, one of which is to
control costs.  Basing development on a phenomenon that does not have
consensus across the discipline of computer science/software engineering
would be a disservice to the wider community.  I therefore request that the
proposed update to 61508 removes any reference to software reliability.


  Nick Tudor

Tudor Associates Ltd

Mobile: +44(0)7412 074654

www.tudorassoc.com



*77 Barnards Green Road*

*Malvern*

*Worcestershire*


*WR14 3LR Company No. 07642673*

*VAT No:116495996*



*www.aeronautique-associates.com <http://www.aeronautique-associates.com>*



On 5 March 2015 at 06:24, Peter Bernard Ladkin <ladkin at rvs.uni-bielefeld.de>
wrote:

I think that Drew has pointed out a number of phenomena which result in
software changing inadvertently and/or unobserved over time. Attributing the
causes of those phenomena to "SW" or "HW" is I suggest of secondary interest.
What is of primary interest is that they do occur and as a result software
changes/can so change with the passage of time.

Suppose one wants to talk about that phenomenon. To me, "software
degradation" seems a reasonable term to use. Others may prefer to invent
another term.

It's clear to me that the phenomena (maybe not all of them, but at least
some) do occur/have occurred and anyone preserving critical software for any
length of time should take them into account; devise detection and
prophylaxis mechanisms and so on. Sort of like hardware, really.

PBL

Prof. Peter Bernard Ladkin, Faculty of Technology, University of Bielefeld,
33594 Bielefeld, Germany
Je suis Charlie
Tel+msg +49 (0)521 880 7319  www.rvs.uni-bielefeld.de





_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE


