[SystemSafety] Claims for formal methods

Derek M Jones derek at knosof.co.uk
Thu Feb 20 14:45:45 CET 2014


Hello Monsieur MENTRE,

> Hello Mr. Jones,
>
>> The French group in the above links did some interesting work on
>> compiler code generation and I believe they now refer to it as a
>> verified compiler backend (which is what it does) rather than a
>> verified C compiler.
>
> They are doing more than just a C backend and they are progressing
> slowly but surely towards parsing the C grammar. For example, you can

The issue was about the gulf between the claims made and what had
actually been achieved.

A soap powder manufacturer cannot make excessive advertising claims
about its current products just because it has researchers in a lab
trying to cook up a better powder.

>    http://gallium.inria.fr/blog/verifying-a-parser-for-a-c-compiler/
>    http://gallium.inria.fr/blog/verifying-a-parser-for-a-c-compiler-2/

As always with this kind of work the devil is in the detail.

They currently have what they consider to be a reliable proof of the
correctness of one of the tools they used.

As they freely admit there are issues with their current approach:
"... our solution worked, but was not fully satisfactory, because the 
"lexer" was so complicated that it included a full non-certified parser, 
which could, if buggy, in some corner cases, introduce bugs in the whole 
parser."

I think that holes like this will always exist in formal proofs and
perhaps a better approach would be to try to figure out how many nines
might be claimed for the likelihood of the 'proof' behaving correctly.

>> 2.    The CompCert system may successfully translate programs
>> containing undefined behavior. Any proof statements made about such
>> programs may not be valid.
>
> Yes. But this is not the issue with the Verified C compiler but with the
> input C program itself. You can use other tools (like Frama-C, Astrée or
> Polyspace) to check your C program has no undefined behaviour.

So you are agreeing that any proof statement made by the 'Verified' C
compiler may be wrong.

Your proposed solution is to first run the code through unverified tools
to weed out those cases that the 'Verified' compiler gets wrong?

I have spoken to some people who were at a recent talk by the French
group, who tell me that they now (correctly) refer to what they have
done as a verified backend and not a verified C compiler.

>> 3.    The support tools are assumed to be correct; primarily the Coq
>> proof assistant, which is written in OCaml.
>
> Yes. But Coq is designed in such a way (a very small, manually verified
> trusted core and then libraries built upon that core) that the
> possibility to have an issue in Coq is very very low (but not zero, I
> agree).

Essentially this is one of the axioms of the proof.

This is the kind of information soap powder manufacturers put in the
small print.

>> 4.    The CompCert system makes decisions about implementation
>> dependent behaviors and any proofs only apply in the context of these
>> decisions.
>
> Yes. As any compiler. At least those decisions are formally documented.

More small print.

>> 5.    The CompCert system makes decisions about unspecified behaviors
>> and any proofs only apply in the context of these decisions.
>
> Yes. As any compiler. At least those decisions are formally documented.

I did not manage to find this document.  But more small print.

> As pointed out on the comment of your blog, an independent research
> paper (http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf) has
> shown that CompCert is a very strong compiler, much stronger than any
> others. Quoting Regehr's paper:

CompCert put in a good performance here.  However, Regehr's research
targets code generation issues, the part of the compiler on which the
verification work has been done.

There is also the issue of the quantity of optimization performed.
CompCert does some optimizations, but not nearly as many as gcc and
llvm; you would expect to find fewer faults because there is less
code in CompCert to go wrong.  Were no faults found in CompCert because
insufficient tests were run?  Calculating a scaling factor to adjust for
quantity of optimization is a large, open research problem.

-- 
Derek M. Jones                  tel: +44 (0) 1252 520 667
Knowledge Software Ltd          blog:shape-of-code.coding-guidelines.com
Software analysis               http://www.knosof.co.uk

