[SystemSafety] FW: Software reliability (or whatever you would prefer to call it) [UNCLASSIFIED]

Smith, Brian E. (ARC-TH) brian.e.smith at nasa.gov
Fri Mar 6 18:39:28 CET 2015


This is a most interesting discussion.  I very much appreciate the
comments; especially those from my distinguished colleague, Michael
Holloway at NASA Langley.

Next week, NASA Ames is hosting a technical workshop entitled, "Transition
to Autonomy."  Every morning I harvest general ideas and comments from
this discussion thread to give my little grey cells something to
contribute at this event.

The topic of how to characterize, measure, and assure the 'performance' of
safety-critical software in new autonomous, automated systems is either
(1) a potential show-stopper or (2) an enabler for implementing such
advanced software in aviation - depending on one's perspective...

Another colleague of mine at Langley is what he terms an 'optimistic
skeptic' when it comes to automation.  He asserts that software-enabled
autonomy may be a great thing, but we are implementing it incorrectly,
because it simply shuts the human out of the loop and expects him/her to
'pay attention.'  But we know that humans can't sustain that over long
periods of time.  There are many who believe that we just need better
displays and better information driven by more 'reliable' (or whatever
term we can agree on) software.  That helps, but it's not the answer.
Better displays and information won't help in routine commercial flight.
We need to look at what we want the human to do and then provide a
function allocation between human and machine that allows and even
enhances his/her ability to do that.  Instead, we have put them in a
quiet, dark flight deck with a nice engine thrum (except on the A380) and
told them to pay attention to the outputs from the avionics software.  And
then we usually startle the heck out of them and demand that they respond
quickly.  Similar arguments could be made with respect to air traffic
controllers and how they interact with ground-based Performance Based
Navigation systems.

Let's not forget that today's remarkable record in flight safety has been
achieved by the triad of software, hardware, and 'liveware' (people) - to
borrow a term from the ICAO SHELL model.  Much of the discussion on this
forum has centered on the SW/HW dichotomy.  These two elements enable the
total-system capabilities we want; they are arguably not goals to be
pursued in and of themselves.  They need to be set in context.

DOT/FAA/AR-10/27 is a 2010 document entitled, "Flight Crew Intervention
Credit in System Safety Assessments: Evaluation by Manufacturers and
Definition of Application Areas."  The research effort described in that
reference is motivated by the following statement:


"According to current regulations for type certification of large
commercial aircraft, certification credit may be taken for correct and
appropriate action for both quantitative and qualitative assessments
provided that some general criteria are fulfilled. According to the same
regulations, quantitative assessments of the probabilities of flight crew
errors are not considered feasible. As a consequence, the system designer
is allowed to take 100% credit for correct flight crew action in response
to a failure. Previous research indicates that this leads to an
overestimation of flight crew performance."

So assessing human 'reliability' may be as difficult, and as critical to
improving system-wide safety, as assessing software 'reliability.'  Each
has its own thorny challenges in how we define terms and measure
performance.
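
To see why this matters quantitatively, here is a minimal sketch in Python
(the numbers are invented for illustration; they are not figures from the
report):

# Illustrative only: invented numbers, not taken from DOT/FAA/AR-10/27.
p_failure = 1e-5               # triggering failure condition, per flight hour
p_crew_error_credited = 0.0    # "100% credit": crew assumed always correct
p_crew_error_realistic = 1e-2  # a hedged, nonzero human error probability

# Probability that the failure occurs AND the crew fails to recover it.
print(f"hazard rate, full crew credit: {p_failure * p_crew_error_credited:.1e}")
print(f"hazard rate, realistic crew:   {p_failure * p_crew_error_realistic:.1e}")

With full credit the crew-error branch drops out of the quantitative
assessment entirely; any realistic nonzero error probability puts it back,
which is exactly the overestimation the report warns about.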


I very much look forward to following this thread in the days ahead...

Brian E. Smith

Special Assistant for Aeronautics
Human Systems Integration Division
Bldg N262, Room 120; Mail Stop 262-11
NASA Ames Research Center
P.O. Box 1000
Moffett Field, CA 94035

(v) 650.604.6669, (c) 650.279.1068, (f) 650.604.3323

Never let an airplane or a motorcycle take you somewhere your brain didn't
go five seconds earlier.

On 3/6/15, 6:17 AM, "RICQUE Bertrand (SAGEM DEFENSE SECURITE)"
<bertrand.ricque at sagem.com> wrote:

>Right, and this is the problem at least for process industries making
>huge use of this type of behaviour.
>
>Bertrand Ricque
>Program Manager
>Optronics and Defence Division
>Sights Program
>Mob : +33 6 87 47 84 64
>Tel : +33 1 58 11 96 82
>Bertrand.ricque at sagem.com
>
>
>-----Original Message-----
>From: systemsafety-bounces at lists.techfak.uni-bielefeld.de
>[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of
>King, Martin (NNPPI)
>Sent: Friday, March 06, 2015 2:31 PM
>To: systemsafety at techfak.uni-bielefeld.de
>Subject: Re: [SystemSafety] Software reliability (or whatever you would
>prefer to call it) [UNCLASSIFIED]
>
>This message has been marked as UNCLASSIFIED by King, Martin (NNPPI)
>
>
>Many safety shutdown systems will spend a considerable proportion of
>their time (90%+) in one of two plant states (operational and
>maintenance) with parameters that are quite limited in range.  The two
>dominant states usually have parameter values that are quite disparate.
>Most of the remainder of the time is spent transitioning between these
>two states.  In an ideal world the limited range of parameter values that
>will cause a shutdown will never occur - in practice they will normally
>occur extremely rarely over the life of the plant.  Is this really the
>input value distribution that we want to test our equipment with?
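>
>To make that concrete, here is a minimal sketch in Python (illustrative
>only: the profile, parameter ranges, and trip threshold are invented, not
>taken from any real plant).  Drawing test inputs from the operational
>profile essentially never exercises the trip region:
>
>import random
>
># Invented profile: fraction of plant time spent in each state.
>PROFILE = {"operational": 0.90, "maintenance": 0.09, "transition": 0.01}
>
># Hypothetical parameter ranges per state; the two dominant states are
># disparate, and the trip threshold lies outside all of them.
>RANGES = {
>    "operational": (80.0, 100.0),
>    "maintenance": (0.0, 10.0),
>    "transition": (10.0, 80.0),
>}
>TRIP_THRESHOLD = 105.0  # the demand region the shutdown system exists for
>
>def draw_input():
>    states = list(PROFILE)
>    state = random.choices(states, weights=[PROFILE[s] for s in states])[0]
>    lo, hi = RANGES[state]
>    return random.uniform(lo, hi)
>
>demands = sum(draw_input() > TRIP_THRESHOLD for _ in range(1_000_000))
>print(f"shutdown demands seen in 1e6 profile-drawn tests: {demands}")  # 0
>
>Testing to the operational profile demonstrates reliability in the states
>the plant lives in, and says almost nothing about the rare demands the
>system exists to handle.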
>
>Martin King
>(My opinions etc, not necessarily those of my employer or colleagues!)
>
>
>-----Original Message-----
>From: systemsafety-bounces at lists.techfak.uni-bielefeld.de
>[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of
>Martyn Thomas
>Sent: 06 March 2015 13:04
>To: systemsafety at lists.techfak.uni-bielefeld.de
>Subject: Re: [SystemSafety] Software reliability (or whatever you would
>prefer to call it)
>
>I agree. That's why I added the point about explicit assumptions before
>using such measurements to predict the future.
>
>There is usually a hidden assumption that the future input distribution
>will match that encountered during the measurement. But it's hard to
>justify having high confidence that such an assumption will prove correct.
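>
>As a minimal sketch of why that hidden assumption matters (Python; the
>fault region and both input profiles are invented for illustration):
>
>import random
>
># Invented fault region: the program fails only for inputs in (0.99, 1.0).
>def fails(x):
>    return 0.99 < x < 1.0
>
>def failure_rate(draw, n=100_000):
>    return sum(fails(draw()) for _ in range(n)) / n
>
># Profile in force while the MTBF was measured: uniform over [0, 10].
>measured = failure_rate(lambda: random.uniform(0.0, 10.0))
># Shifted profile encountered in service: uniform over [0.9, 1.1].
>deployed = failure_rate(lambda: random.uniform(0.9, 1.1))
>
>print(f"failure rate under measurement profile: {measured:.1e}")  # ~1.0e-03
>print(f"failure rate under in-service profile:  {deployed:.1e}")  # ~5.0e-02
>
>Identical software, identical faults: the measured MTBF overstates the
>in-service MTBF by a factor of about fifty, purely because the input
>distribution shifted.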
>
>Martyn
>
>On 06/03/2015 12:32, Derek M Jones wrote:
>> Martyn,
>>
>>> The company calculates some measure of the amount of usage before
>>> failure. Call it MTBF.
>>
>> Amount of usage for a given input distribution.
>>
>> A complete reliability model has to include information on the
>> software's input distribution.
>>
>> There is a growing body of empirical work that builds fault models
>> based on reported faults over time.  Nearly all of them suffer from
>> the flaw of ignoring the input distribution (they also tend to ignore
>> the fact that the software is changing over time, but that is another
>> story).
>>
>
>_______________________________________________
>The System Safety Mailing List
>systemsafety at TechFak.Uni-Bielefeld.DE


