[SystemSafety] A real-world Byzantine failure

Dewi Daniels dewi.daniels at software-safety.com
Fri Sep 17 10:47:27 CEST 2021


The Taiwan Transportation Safety Board has published the final report on a
serious incident involving an Airbus A330. While landing at Taipei, all
three flight control primary computers shut down, resulting in the
spoilers, autobrake and engine thrust reversers failing to operate. The
aircraft stopped just 30 ft before the end of the runway.

https://www.ttsb.gov.tw/english/18609/18610/26634/post
https://www.ttsb.gov.tw/media/4913/ci202_executive-summary_release.pdf
https://www.ttsb.gov.tw/media/4936/ci-202-final-report_english.pdf

The report says that the flight control primary computers were shut down
because the COM/MON pairs weren't synchronised closely enough. This
resulted in the COM and MON channels reading different input values and
disagreeing with each other. This is claimed to be an unusual edge case
that has not been seen before.
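
As a rough illustration of how loose synchronisation alone can make two
healthy channels disagree, here is a minimal sketch. It is not the Airbus
design: the function names, the step-change input and the millisecond
figures are all hypothetical, chosen only to show the mechanism.

```python
def discrete_input(t_ms: float, switch_time_ms: float = 100.0) -> bool:
    """Hypothetical discrete signal that changes state at switch_time_ms."""
    return t_ms >= switch_time_ms


def com_mon_agree(com_sample_ms: float, skew_ms: float) -> bool:
    """Return True if COM and MON read the same value.

    COM samples at com_sample_ms; MON samples skew_ms later.
    Both channels are fault-free; only the sampling instant differs.
    """
    com_value = discrete_input(com_sample_ms)
    mon_value = discrete_input(com_sample_ms + skew_ms)
    return com_value == mon_value


# Far from the transition, even a large skew is harmless.
far_from_edge = com_mon_agree(50.0, skew_ms=2.0)

# Near the transition, a small skew is tolerated...
small_skew = com_mon_agree(99.0, skew_ms=0.5)

# ...but a larger skew straddles the change and the channels disagree.
large_skew = com_mon_agree(99.0, skew_ms=2.0)
```

In this sketch the disagreement only appears when the input happens to
change inside the skew window, which is consistent with the fault showing
up as a rare edge case.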

This is an instance of a well-known problem in computer science called the
Byzantine Generals problem. The seminal paper by Leslie Lamport, Robert
Shostak and Marshall Pease presented a solution to it.

http://lamport.azurewebsites.net/pubs/pubs.html#byz
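
For readers who have not seen the paper, the following is a minimal sketch
of its one-round oral-messages algorithm, OM(1), for one commander and
three lieutenants (four generals tolerate one traitor). The function
names, the "ATTACK"/"RETREAT" strings and the simplification that a
traitorous lieutenant relays the same lie to everyone are my assumptions,
not taken from the paper or from any flight control system.

```python
from collections import Counter


def majority(values, default="RETREAT"):
    """Most common value, or default if the top two counts are tied."""
    counts = Counter(values).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return default
    return counts[0][0]


def om1(sent_by_commander, traitor=None, lie="RETREAT"):
    """Run OM(1) and return the decision of each loyal lieutenant.

    sent_by_commander maps lieutenant name -> value the commander sent.
    traitor names a lieutenant who relays `lie` instead of what it received.
    """
    lieutenants = list(sent_by_commander)
    decisions = {}
    for me in lieutenants:
        if me == traitor:
            continue  # a traitor's own decision is irrelevant
        seen = [sent_by_commander[me]]  # value received directly
        for other in lieutenants:
            if other == me:
                continue
            # Value that `other` claims the commander sent.
            relayed = lie if other == traitor else sent_by_commander[other]
            seen.append(relayed)
        decisions[me] = majority(seen)
    return decisions


# Traitorous commander sends inconsistent orders; lieutenants still agree.
faulty_commander = om1({"L1": "ATTACK", "L2": "RETREAT", "L3": "ATTACK"})

# Loyal commander, traitorous lieutenant L3; loyal lieutenants obey.
faulty_lieutenant = om1(
    {"L1": "ATTACK", "L2": "ATTACK", "L3": "ATTACK"}, traitor="L3"
)
```

The point of the exchange round is that every loyal lieutenant ends up
voting over the same set of values, so an inconsistent source cannot split
them. A COM/MON pair has no such exchange round.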

Kevin Driscoll has published a number of papers describing how the Boeing
777 flight control system was designed to avoid Byzantine failures.

Yours,

Dewi Daniels | Director | Software Safety Limited

Telephone +44 7968 837742 | Email dewi.daniels at software-safety.com

Software Safety Limited is a company registered in England and Wales.
Company number: 9390590. Registered office: Fairfield, 30F Bratton Road,
West Ashton, Trowbridge, United Kingdom BA14 6AZ