[SystemSafety] Scalability of Formal Methods [was: Logic]

Wed Feb 26 19:55:19 CET 2014

Perhaps it has only not come up in this conversation because everyone already understands it, but at the risk of stating the obvious I’ll point out that neither SPARK proofs nor any other formal verification efforts are deductive proof that the software will operate as specified when deployed on real target hardware.  Fetzer's Program Verification: The Very Idea, Comm. of the ACM, 31(9):1048–1063, Sept. 1988 discusses general reasons for this.  More specifically and obviously in the case of SPARK, the proofs relate to the source code, not the object code.  I have personally seen a compiler take SPARK source code that had been proven to refine a specification and generate dangerously flawed object code.  (Testing easily revealed the problem.)

There is also the issue of mixed non-SPARK code in a process and its impact on conclusions that can be drawn from SPARK proofs.  There might be cases where an entire executable can be written purely in SPARK, but I’ve seen cases where practical matters forced blending in a little C or plain Ada.  For example, the compiler I was using automatically generated calls to memset and memcpy.  I can’t see how you could possibly write these in SPARK and a bug in them has the potential to corrupt any memory that the processor’s memory system allows write access to.  Because the proof obligations generated by the SPARK tools assume that non-volatile memory locations do not change unless the SPARK code changes them, a memory-stomping bug in non-SPARK code has the power to undermine conclusions about the SPARK code’s behaviour that a complete set of proofs might appear to support.

Then there is the issue of floating-point arithmetic.  I am far from an expert in verifying this, but my general understanding was that if you turn on the SPARK support for it, what you get is obligations that show that the arithmetic would be correct if the floating-point unit performed infinite-precision rational-number arithmetic.  But real floating point units have limited precision.  Useful though limited, and I can see why the manuals I have for the SPARK tools direct the reader to enquire about floating-point support.

If all of this sounds a bit critical of SPARK or formal verification generally, please understand that I am a big fan of the language and deeply appreciate formal verification’s power to find defects.  I am also conscious that there are some kinds of problems it is by nature blind to.  Appreciation of the classes of defects a given formal verification technique /can’t/ find is needed to plan a V&V approach that covers all of the bases.

— Patrick

Dr Patrick John Graydon
Postdoctoral Research Fellow
School of Innovation, Design, and Engineering (IDT)
Mälardalens Högskola (MDH), Västerås, Sweden