[SystemSafety] Fwd: Re: How Many Miles of Driving Would It Take to Demonstrate Autonomous Vehicle Reliability?

Mario Gleirscher mario.gleirscher at tum.de
Sat Apr 23 09:02:35 CEST 2016


Dear Les and Phil,

I have been following your discussion on autonomous vehicle
validation with particular interest.

In our opinion here in Munich, two challenges lie at the core of this
validation problem:
 1. carefully boiling down the complexity of the assumptions we can make
about the loop of driver + road infrastructure + vehicle + driving,
 2. coming up with a substantially extended, fast, reasonably conservative
but practical fail-operational strategy (as Phil also stated in his SAE
paper).

We are currently working on 2.:
Fail-operational in the autonomous driving context involves a number of
issues, among them:
 a. figuring out when a failure or, even more importantly, a combination
of a failure and a situation of low vigilance/operability occurs,
 b. designing a sophisticated implementation of the monitor/actuator
pattern (to speak in Phil's words) to realize fail-operational behaviour
at the right moments.

For a., we might think of a statistical observer, abstracting over the
many procedures and machine learning algorithms, that detects episodes
of low confidence in the sensor subsystems (including extra-vehicle
sensors), low vigilance/operability on the driver's side, or, last but
not least, the usual failure modes of the vehicle (control) subsystems.
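
To make this a bit more concrete, here is a minimal sketch of such an
observer in Python. It is purely illustrative: the channel names, window
length, and threshold are assumptions, not a worked-out design.

    import collections
    import statistics

    class ConfidenceObserver:
        """Sliding-window observer flagging sustained low confidence
        on any monitored channel (sensor, driver state, subsystem)."""

        def __init__(self, window=50, threshold=0.7):
            self.window = window        # samples per channel
            self.threshold = threshold  # minimum acceptable mean confidence
            self.history = collections.defaultdict(
                lambda: collections.deque(maxlen=window))

        def update(self, channel, confidence):
            """Record one confidence sample in [0, 1]; return True once
            the channel's recent mean drops below the threshold."""
            samples = self.history[channel]
            samples.append(confidence)
            if len(samples) < self.window:
                return False            # not enough evidence yet
            return statistics.fmean(samples) < self.threshold

    observer = ConfidenceObserver()
    # fed, e.g., from the perception stack or a driver-monitoring camera:
    if observer.update("lidar_object_classifier", 0.42):
        print("low-confidence episode detected -> notify the monitor")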

For b., we might think of an (extended) monitor responsible for a., and
an actuator that keeps the driving process in, or brings it back to, the
most immediately reachable safe state while remaining operational.
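
Again only as a sketch, and under the assumption of a simple ordering of
fallback levels (the states and predicates below are invented for the
example), the actuator side might select its target state like this:

    from enum import Enum, auto

    class SafeState(Enum):
        CONTINUE = auto()        # nominal autonomous driving
        DEGRADED = auto()        # reduced speed, enlarged safety margins
        SHOULDER_STOP = auto()   # controlled stop at the roadside
        EMERGENCY_STOP = auto()  # in-lane stop as a last resort

    def select_safe_state(failure_detected, driver_available,
                          shoulder_reachable):
        """Pick the most immediately reachable safe state, preferring
        the least disruptive option that is still safely reachable."""
        if not failure_detected:
            return SafeState.CONTINUE
        if driver_available:
            return SafeState.DEGRADED      # hand back under supervision
        if shoulder_reachable:
            return SafeState.SHOULDER_STOP
        return SafeState.EMERGENCY_STOP

    # the monitor (from a.) feeds the actuator side:
    target = select_safe_state(failure_detected=True,
                               driver_available=False,
                               shoulder_reachable=True)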

These enumerations are not exhaustive, of course. But I hope to have
provided some further insight, and I would be glad to get some feedback.

Thanks for the stimulating discussions here on this mailing list (as I
am quite new to the community).

Best regards,
Mario Gleirscher

On 23.04.2016 06:34, Les Chambers wrote:
> Hi Phil
> Your paper covers a fascinating subject. I'm glad you're looking into it.
> I agree that the core problem is validating non-deterministic algorithms.
> 
> The proactive measures you're suggesting such as fault injection should definitely be in the toolbox. An untested bit flip was responsible for one of the two near misses in my career. A more robust system that triggered a safety-related control action with a 32-bit word instead of one bit would have made the system impervious to this kind of fault. I sincerely hope that no one programming a stores management system drops a bomb or fires a missile from a warplane with anything less than a 64-bit unique command word - so too the control that applies the brakes to a vehicle.
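> 
> A minimal illustration of the idea, assuming a 64-bit arming pattern (the constant and the check below are invented for the example):
> 
>     # A safety-related command fires only on an exact 64-bit match, so no
>     # small number of flipped bits can trigger the action by accident.
>     APPLY_BRAKES_CMD = 0xA5C3_96F0_0F69_3C5A   # illustrative magic word
> 
>     def command_valid(received_word: int) -> bool:
>         """A single stuck or flipped bit can no longer fire the action;
>         a uniformly random corruption matches with probability 2**-64."""
>         return received_word == APPLY_BRAKES_CMD
> 
> Contrast this with a one-bit flag, where any single flip is enough.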
> 
> I can't help thinking though that this is all pretty low-level stuff, in the realm of best practice that we should be practising already. The Monitor/Actuator architecture is also a good idea. We were practising this in the 1970s on chemical reactor control systems. A process specialist, working independently of the control system development team, identified unsafe states of the plant (I think this is what you were referring to as the "deductively-generated safety envelope"). When an unsafe state was detected (we called them abort conditions) the monitor software took over control and unconditionally restored the plant to a safe state. Fortunately for us the failover mechanism was often simple: de-energising outputs to final control elements, which caused the return springs to close the control valves. In chemical processing, just shutting a valve or putting a reactor on full cooling is enough to preserve safety. This is clearly not the case with driverless cars, which are orders of magnitude more complex.
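> 
> As a toy illustration of that abort-condition pattern (the limits and
> field names are invented):
> 
>     ABORT_CONDITIONS = [
>         lambda plant: plant["reactor_temp_c"] > 180.0,  # illustrative limits
>         lambda plant: plant["pressure_bar"] > 12.0,
>     ]
> 
>     def monitor_step(plant, outputs):
>         """On any abort condition, unconditionally de-energise every
>         output; spring-return valves then close and the plant falls
>         back to its safe state by construction."""
>         if any(cond(plant) for cond in ABORT_CONDITIONS):
>             for name in outputs:
>                 outputs[name] = False   # de-energise final control elements
>             return "ABORTED"
>         return "NORMAL"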
> 
> Which leads me to my major point on your paper: you mentioned that "Vehicle-level testing won't be enough to ensure safety." I agree; it's necessary but not sufficient. But I'd point out that it is more necessary than ever before and needs to be put on steroids. I'm referring not to physical vehicle testing but to simulated vehicle testing. The behaviour of non-deterministic algorithms, in terms of pass/fail, becomes deterministic when they hit the real world, or at least the simulated real world. A sensor system mistakes a human being for a paper bag: yes/no. A sensor recognises a bus in heavy rain with lightning flashes: yes/no. As you pointed out, the infinite variety of possible scenarios can never be tested by driving a vehicle around the street. No matter how many vehicles you deploy, you will never see enough black swans to fully test your system. With simulators, however, you can whistle up a different simulated black swan every few milliseconds. I learnt this on a project where substantial effort was sunk into an automated test rig. The three things I learnt from this which are relevant to driverless cars are:
> 1. The time scaling that becomes possible with automated testing can expose a system to orders of magnitude more bad scenarios (black swans) than it will ever see in its operational life. With enough computing power you could expose a vehicle sensor system to a few hundred years' worth of human beings, buses, paper bags, plastic bags and so on - overnight (see the sketch after this list).
> 2. Without automated testing it would have been impossible for us to properly regression test all the software modifications that were coming through during the life of the project. This is particularly relevant to automobiles in the current environment, where Elon Musk routinely provides vehicle owners with upgrades and, like lemmings, his customers lap them up. It's accepted because it's part of the culture now: they think they are driving mobile phones.
> 3. Effective automated test rigs are expensive to build. You need a whole team working on them. They also require maintenance. This inevitably puts the V&V group in conflict with the project manager and any salesman or managing director who has a perverted need to make a profit out of a project (I just can't stand those guys).
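> 
> To give point 1 some shape, a skeleton of such a time-scaled scenario rig
> might look like the following (the scenario parameters and the interface
> to the sensor stack under test are, of course, invented):
> 
>     import random
> 
>     def random_scenario(rng):
>         """Draw one synthetic scene; this parameter space is a stand-in
>         for a real rendered-scene generator."""
>         return {
>             "object":    rng.choice(["pedestrian", "bus", "paper_bag",
>                                      "plastic_bag", "cyclist"]),
>             "weather":   rng.choice(["clear", "heavy_rain", "fog", "night"]),
>             "lightning": rng.random() < 0.05,
>         }
> 
>     def run_campaign(classify, trials=1_000_000, seed=42):
>         """Deterministic pass/fail over a huge batch of scenarios:
>         did the sensor stack call a human a human, yes or no?"""
>         rng = random.Random(seed)       # reproducible, hence regression-friendly
>         failures = []
>         for _ in range(trials):
>             scene = random_scenario(rng)
>             verdict = classify(scene)   # system under test (simulated sensors)
>             if scene["object"] == "pedestrian" and verdict != "pedestrian":
>                 failures.append(scene)  # log every missed human
>         return failures
> 
> A million scenarios overnight is exactly the sort of exposure point 1 is
> about, and the fixed seed supports the regression testing in point 2.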
> 
> The next obvious question is: how to build an effective test rig for a sensor system built on machine learning? My solution is to leverage the special effects technology that is now incredibly mature in the movie business. I watched the bonus material "making of" stuff on Game of Thrones season five last night and was blown away by how far they have progressed. Anyone who has watched the episode featuring "The Massacre at Hardhome" would have to agree. 
> Millions of dollars must have been poured into a 20-minute sequence, which demonstrates that anything can be done these days given the will and enough money. Surely this is justified if the world is going to embrace self-driving vehicle technology. Aviation couldn't function without simulators; why not automobiles?
> 
> Full disclosure: I am heavily biased towards simulators. After the above case study project I had the pleasure of flying a strike jet into the ground at Mach 2.6. I survived, apparently. I'm therefore in furious agreement with your pronouncement: "Thus, alternate methods of validation are required, potentially including approaches such as simulation ...". "Potentially"? Not potentially; it must be a major area of focus.
> 
> As a sidebar (and I hope I have your meaning correct), your comment: "One way to manage the complexity of requirements is to constrain operational concepts and engage in a phased expansion of requirements." Your pronouncement is sensible and eminently logical but, as we speak, it is being gleefully ignored by companies such as Tesla. Have you seen the horror videos of people sitting in the back seat of their cars having breakfast while the vehicle drives them to work? The Internet is replete with videos of near misses in hands-off-the-wheel scenarios. The "phased expansion" is happening on Internet time, and the driving public clearly does not have the maturity or the training to absorb it safely. Contrast this with the years of training required to qualify a person to drive an automated air vehicle. Even Tesla is advising its customers to "be careful", which is a bit like a casino advising problem gamblers to "gamble responsibly". I've seen this human factors phenomenon several times in my career and here it is happening again: technology creep. Technology creeps into a new application domain, is given unjustifiable trust by incumbent bunnies blinded by the light of "cool", and accidents happen. What can we do but endure?
> 
> Thanks for the paper.
> 
> Cheers
> Les
> 
> -----Original Message-----
> From: systemsafety [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of Philip Koopman
> Sent: Saturday, April 23, 2016 8:58 AM
> To: systemsafety at techfak.uni-bielefeld.de
> Subject: [SystemSafety] Fwd: Re: How Many Miles of Driving Would It Take to Demonstrate Autonomous Vehicle Reliability?
> 
> 
> I presented a paper on exactly this set of related problems at the SAE
> World Congress last week.  Validating machine learning is for sure a
> tough problem. So is deciding how ISO 26262 fits in.  And quality of the
> training data.  And some other problems besides. Below are the abstract
> and a pointer to the paper and presentation slides. Constructive feedback
> is welcome for the follow-on work we are doing, although I will likely
> reply individually rather than to the list. (Note that this paper was camera
> ready before the RAND report was public.  Several folks have been
> thinking about this topic for quite a while and just now are the results
> becoming public.)
> 
> http://betterembsw.blogspot.com/2016/04/challenges-in-autonomous-vehicle.html
> 
> Challenges in Autonomous Vehicle Testing and Validation
>         Philip Koopman & Michael Wagner
>         Carnegie Mellon University; Edge Case Research LLC
>         SAE World Congress, April 14, 2016
> 
> Abstract:
> Software testing is all too often simply a bug hunt rather than a well
> considered exercise in ensuring quality. A more methodical approach than
> a simple cycle of system-level test-fail-patch-test will be required to
> deploy safe autonomous vehicles at scale. The ISO 26262 development V
> process sets up a framework that ties each type of testing to a
> corresponding design or requirement document, but presents challenges
> when adapted to deal with the sorts of novel testing problems that face
> autonomous vehicles. This paper identifies five major challenge areas in
> testing according to the V model for autonomous vehicles: driver out of
> the loop, complex requirements, non-deterministic algorithms, inductive
> learning algorithms, and fail operational systems. General solution
> approaches that seem promising across these different challenge areas
> include: phased deployment using successively relaxed operational
> scenarios, use of a monitor/actuator pair architecture to separate the
> most complex autonomy functions from simpler safety functions, and fault
> injection as a way to perform more efficient edge case testing. While
> significant challenges remain in safety-certifying the type of
> algorithms that provide high-level autonomy themselves, it seems within
> reach to instead architect the system and its accompanying design
> process to be able to employ existing software safety approaches.
> 
> 
> Cheers,
> -- Phil
> 
