[SystemSafety] Validating AI driven systems

Peter Bernard Ladkin ladkin at causalis.com
Tue Apr 18 09:02:44 CEST 2017


On 2017-04-18 01:21 , Les Chambers wrote:
> The advent of AI driven safety critical systems is about to render obsolete all engineering
> processes we currently apply to validating trusted systems.

People have been looking at this issue for quite a while, with varied success.

25 years ago I was having conversations with Leslie Smith at Stirling about V&V of
neural-network-based software algorithms, that is, serial programs which simulate NNs. Here is a
recent position paper on "Deep NNs" by Leslie:
http://www.cs.stir.ac.uk/~lss/recentpapers/AGI2016_lssintro.pdf

When I say "NN", I and others usually mean a simulated NN on a traditional von Neumann machine (or a
Valiant BSP machine, I guess).

I am fairly sure that Judy Crow, then at SRI International, did some work in the early 1990s on the
prospects for verifying NNs, but today I can't find the project report.

More recently,

John Rushby wrote a paper in 2009 on assurance of adaptive NNs:
http://www.csl.sri.com/users/rushby/abstracts/aiaa09

Here's a blog post by Lindsey Kuper on a 2010 CAV paper by Pulina and Tacchella on verifying NNs:
http://composition.al/blog/2016/09/30/thoughts-on-an-abstraction-refinement-approach-to-verification-of-artificial-neural-networks/
The paper itself from Springer costs almost €30 :-(
I looked at Luca Pulina's WWW page, and found a reference to an archived copy at semanticscholar:
https://pdfs.semanticscholar.org/72e5/5b90b5b791646266b0da8f6528d99aa96be5.pdf
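
Their method is abstraction-refinement over the network, but for anyone who hasn't seen this kind
of work, the basic flavour of NN verification can be illustrated with a simpler, well-known
technique, interval bound propagation: push an interval of inputs through a trained network and
bound its outputs. The little network and the property below are invented purely for illustration;
this is a sketch of the general idea, not their method.

    # Minimal sketch of interval bound propagation through a tiny trained
    # ReLU network, to bound its outputs over a whole region of inputs.
    # The weights and the property checked are invented for illustration;
    # Pulina and Tacchella's abstraction-refinement method is different.

    def interval_affine(lo, hi, weights, bias):
        """Propagate per-input intervals [lo, hi] through W.x + b."""
        out_lo, out_hi = [], []
        for w_row, b in zip(weights, bias):
            lo_sum, hi_sum = b, b
            for w, l, h in zip(w_row, lo, hi):
                if w >= 0:
                    lo_sum += w * l
                    hi_sum += w * h
                else:
                    lo_sum += w * h
                    hi_sum += w * l
            out_lo.append(lo_sum)
            out_hi.append(hi_sum)
        return out_lo, out_hi

    def interval_relu(lo, hi):
        return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]

    # Invented two-layer network: 2 inputs -> 2 hidden units (ReLU) -> 1 output
    W1, b1 = [[1.0, -0.5], [0.3, 0.8]], [0.1, -0.2]
    W2, b2 = [[0.7, -1.2]], [0.0]

    # Property to check: for all inputs in [0,1] x [0,1], the output stays below 1.0
    lo, hi = [0.0, 0.0], [1.0, 1.0]
    lo, hi = interval_relu(*interval_affine(lo, hi, W1, b1))
    lo, hi = interval_affine(lo, hi, W2, b2)
    print("output bounds:", lo, hi)   # here [-1.08, 0.77], so the property holds

The point is that a single analysis covers the whole input region, which is what distinguishes
verification from testing individual inputs.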

Boeing has flown an MD-11 (I think it was) with dynamically-adaptive control software which
affected the engines as well as the control surfaces, and which reconfigured the control to keep
the aircraft flying in the face of inoperable control surfaces. It was an attempt to see whether
adaptive algorithms could help with the kind of control destruction that happened in the Sioux
City DC-10 accident.

The answer was that indeed it could: the flights were successful. The MD-11 is not an FBW aircraft;
the system was an add-on. Here is something by Liu et al. on engine-thrust compensation:
https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20050238478.pdf
A fair amount was also done by Anthony Calise at Georgia Tech, but I can't seem to find many of
his papers.

NASA has also flown an F-15 with adaptive control.
https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20060023991.pdf There is a chapter on it in
Schumann/Liu (below).

Here is a paper by Krishnakumar et al. at NASA Ames on adaptive resilient aircraft control
https://ti.arc.nasa.gov/publications/1320/download/ It might be in Schumann/Liu, I forget.

The problems of V&V are discussed in J. Schumann and Y. Liu, Applications of Neural Networks in High
Assurance Systems, Springer Verlag, 2010, http://www.springer.com/us/book/9783642106897 . The intro
chapter is particularly interesting. In the rest of the book, there is a lot of discussion of
validating Liapunov conditions. There is a chapter on the F-15's adaptive control system.
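
For those who haven't met the term, a Lyapunov argument for an adaptive controller typically has
the following shape (the symbols below are generic textbook ones, not taken from the book): one
exhibits a function V of the tracking error e and the parameter-estimation error \tilde{\theta}
which is positive definite and non-increasing along closed-loop trajectories, from which
boundedness follows, and convergence of the tracking error via Barbalat's lemma.

    % Generic Lyapunov argument for adaptive control (illustrative symbols only).
    \begin{align*}
      V(e, \tilde{\theta}) &= e^{\top} P e + \tilde{\theta}^{\top} \Gamma^{-1} \tilde{\theta},
        & P = P^{\top} \succ 0,\ \Gamma = \Gamma^{\top} \succ 0, \\
      \dot{V} &\le - e^{\top} Q e \le 0,
        & Q = Q^{\top} \succ 0.
    \end{align*}

"Validating a Liapunov condition" in the V&V sense then means checking that such a V exists and
that \dot{V} really is non-positive for the adaptation law as actually implemented.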

Johann Schumann's NASA home page is https://ti.arc.nasa.gov/profile/schumann/

As far as I know, nobody in aerospace can see at the moment how to validate adaptive NNs for civil
aeronautical use, that is, what arguments will suffice to render them demonstrably airworthy. The
case for statically-trained NNs is rather easier, because in principle the math lets you get a
handle on it, even though the math is hard. The known math is in any case insufficient to let us
detect the areas of control-system instability even with traditional control; flight testing is
essential. I imagine there could be an argument that this is not essentially different with
statically-trained NN control.
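
To make the static/adaptive distinction concrete, here is a toy sketch in Python (the gains, the
update rule and the inputs are all invented, nothing from the references above): a
statically-trained controller is a fixed function, so whatever analysis and flight testing you did
still applies to the thing that is flying; an adaptive one modifies its own parameters in
operation, so the object you assessed on the ground is not the object in the air.

    # Toy contrast between a statically-trained and an adaptive controller,
    # both stand-ins for NN controllers. Gains, the update rule and the
    # inputs are invented for illustration only.

    class StaticController:
        """Parameters frozen after training: the function being flown is
        exactly the function that was analysed and tested."""
        def __init__(self, gain):
            self.gain = gain

        def command(self, error):
            return self.gain * error

    class AdaptiveController:
        """Parameters updated online from the observed error: the function
        being flown drifts away from the one that was assessed."""
        def __init__(self, gain, rate=0.05):
            self.gain = gain
            self.rate = rate

        def command(self, error):
            u = self.gain * error
            self.gain += self.rate * error * error   # illustrative adaptation step
            return u

    static, adaptive = StaticController(0.5), AdaptiveController(0.5)
    for err in (1.0, 0.8, 0.5):
        static.command(err)
        adaptive.command(err)
    print("static gain after use:  ", static.gain)   # still 0.5
    print("adaptive gain after use:", adaptive.gain) # has drifted upwards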

We have a robotics group in Bielefeld which produced control SW for (ground-based) robots using NN
design. Their networks are all statically trained. (Past tense, because the group leader moved to
the Technical University of Braunschweig in mid-2016.)

The aerospace situation is much more constrained than such use on the roads. The references above
all concern the operation of physical control systems, methods of achieving a specific dynamic
state, whereas the road situation also involves decision procedures, such as those we have
considered here in brief discussions of Trolleyology.

Knight's article from the MIT Tech Review concerns a third aspect, namely how to explain the
decisions/predictions a network makes when it has been trained, or has trained itself, on masses
of data ("deep learning").

If you are talking about illness prediction, a large component of which has been known for a long
time to have a statistical character, then you do need to conjure up explanations, for you can't put
people on prophylactic therapy, which might itself be life-altering, without having a good idea why.
"The machine said to do it" is not a good reason.

However, I am not sure I see providing explanations as a major problem for traditional
safety-critical applications. If the self-driving car does something funny, then that will be
logged as an incident and the questions are those of liability and rectification. The car
manufacturer may have the problem of trying to discover why its network algorithms provoked the
anomalous behaviour, but it is not clear that anyone else involved has that issue, certainly not
the licensing authorities, who can just withdraw a licence to operate.

If an adaptive control system does something funny, the question of how to rectify it is
non-trivial (you can't just reboot it, for then you lose the adaptive learning up to that point,
and that might land you in even deeper water). But, again, nobody except the manufacturer needs to
know why it did what it did. The rectification task might be to recognise the parameters which led
to the anomaly and to avoid those parameter values in the future.
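
One way to read "avoid those parameter values", sketched below in Python with entirely invented
names and numbers: checkpoint the adapted state, so that rectification does not mean rebooting and
losing everything learned so far, and keep an exclusion list of parameter regions flagged by
incident analysis which the adaptation is no longer allowed to enter.

    # Sketch only: checkpoint the adapted state and exclude flagged parameter
    # regions. All names, values and thresholds are invented for illustration.

    import copy

    class GuardedAdaptiveController:
        def __init__(self, params, excluded_regions=None):
            self.params = dict(params)
            self.checkpoint = copy.deepcopy(self.params)
            self.excluded = list(excluded_regions or [])   # [(name, lo, hi), ...]

        def save_checkpoint(self):
            self.checkpoint = copy.deepcopy(self.params)

        def adapt(self, name, delta):
            candidate = self.params[name] + delta
            for n, lo, hi in self.excluded:
                if n == name and lo <= candidate <= hi:
                    return False                 # refuse to enter a flagged region
            self.params[name] = candidate
            return True

        def rectify(self):
            """After an anomaly: fall back to the last good checkpoint
            rather than rebooting and losing all adaptation."""
            self.params = copy.deepcopy(self.checkpoint)

    ctrl = GuardedAdaptiveController({"pitch_gain": 0.5},
                                     excluded_regions=[("pitch_gain", 0.9, 1.2)])
    ctrl.save_checkpoint()
    ctrl.adapt("pitch_gain", 0.2)     # accepted: 0.7 is outside the flagged region
    ctrl.adapt("pitch_gain", 0.3)     # refused: 1.0 falls inside it
    ctrl.rectify()                    # back to the checkpointed 0.5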

If you completely automated your nuclear-power-plant emergency systems using these methods (NN in
NPP?), then after an incident you might well want to know just why your automated system did what it
did. But I don't see such automation happening in the foreseeable future, and in any case I wonder
whether there is enough data on NPP anomalies and their rectification to enable the kind of deep
learning being discussed in the article. Like commercial-airplane accidents, each incident is
different; also like commercial airplane accidents, there aren't very many of them. It's not like
"what human drivers do at STOP signs in Mountain View, California".

PBL

Prof. i.R. Peter Bernard Ladkin, Bielefeld, Germany
MoreInCommon
Je suis Charlie
Tel+msg +49 (0)521 880 7319  www.rvs-bi.de




