[SystemSafety] Claims for formal methods

Thu Feb 20 16:58:57 CET 2014

Dear Mr. Jones,

Le 20/02/2014 14:45, Derek M Jones a écrit :
> The issue was about the gulf between the claims made and what had
> actually been achieved.
>
> A soap powder manufacturer cannot make excessive advertising claims
> about its current products just because it has researchers in a lab
> trying to cook up a better powder.

Talking about a "gulf" between claims made and what is achieved is 
excessive in my mind. :-)

Formal methods are not silver bullet and they should be carefully 
applied. But if applied correctly, they do bring significant assurance 
on the part they cover (which is not everything, we agree).

>> http://gallium.inria.fr/blog/verifying-a-parser-for-a-c-compiler/
>> http://gallium.inria.fr/blog/verifying-a-parser-for-a-c-compiler-2/
>
> As always with this kind of work the devil is in the detail.
>
> They currently have what they consider to be a reliable proof of the
> correctness of one of the tools they used.
>
> As they freely admit there are issues with their current approach:
> "... our solution worked, but was not fully satisfactory, because the
> "lexer" was so complicated that it included a full non-certified parser,
> which could, if buggy, in some corner cases, introduce bugs in the whole
> parser."

That's why the second blog post explains how they can avoid such issue.

CompCert authors are aware of such potential issues (and warn about it):
  http://compcert.inria.fr/compcert-C.html#archi
"""
This part of CompCert (transformation of C source text to CompCert C 
abstract syntax trees) is not formally verified. However, CompCert C is 
a subset of C, and the compiler can output the generated CompCert C code 
in C concrete syntax (flag -dc), therefore the result of this 
transformation can be manually inspected. Moreover, most static analysis 
and program verification tools for C operate on a simplified C language 
similar to CompCert C. By conducting the analysis or the program 
verification directly on the CompCert C form, bugs potentially 
introduced by this first part of the compiler can be detected.
"""

> I think that holes like this will always exist in formal proofs and
> perhaps a better approach would be to try and figure out how many nine's
> might be claimed for the likelihood of the  'proof' behaving correctly.

I think most real-world practitioners of formal methods perfectly 
understand the limits of the used approaches and complement it with 
needed reviews, tests, etc.

>>> 2.    The CompCert system may successfully translate programs
>>> containing undefined behavior. Any proof statements made about such
>>> programs may not be valid.
>>
>> Yes. But this is not the issue with the Verified C compiler but with the
>> input C program itself. You can use other tools (like Frama-C, Astrée or
>> Polyspace) to check your C program has no undefined behaviour.
>
> So you are agreeing that any proof statement made by the 'Verified' C
> compiler may be wrong.

No, not at all.

> Your proposed solution is to first run the code through unverified tools
> to weed out those cases that the 'Verified' compiler gets wrong?

What I wanted to say is that there is a separation of concern between 
the program and the compiler:

  * The verified compiler ensures that the resulting code corresponds 
100% to the initial C program, *including its bugs* (e.g. division by 
zero). This is not the purpose of the compiler to find such undefined 
behaviour;

  * You need other tools to check that the input C program is bug free 
(e.g. absence of division by zero).

Both combined, you get a "perfect" software, e.g. absence of run-time 
error compiled to running code.

Of course, definition of "perfect" depends on your needs. For example 
absence of run-time error might be a minimal requirement but it does not 
ensure the software will not put the system in unsafe state (and other 
formal tools can help you prove that, not CompCert).

If for you "Verified Compiler" means "Compiler that stops on bugs in my 
program", then CompCert is not fulfilling this definition.

> CompCert's put in a good performance here.  However, Regehr's research
> targets code generations issues, the part of the compiler on which the
> verification work has been done.

As I said, the verification work does not cover all the compiler 
(CompCert authors never claimed that, from the very beginning). But they 
do cover a significant part of it, which is increasing over the years.

> There is the issue of quantity of optimization performed.
> CompCert does some optimizations, but not nearly as much as gcc and
> llvm;

For tight loops, gcc or llvm are much better, but for regular code, 
CompCert claims 20% slower than gcc -O3 
(http://compcert.inria.fr/compcert-C.html#perfs).

> you would expect to find fewer faults because there is less
> code in CompCert to go wrong.  Were no faults found in CompCert because
> insufficient tests were run?

No, guaranties from CompCert does not come from test but from proof, by 
structure. This is very different from gcc or llvm design. Therefore 
such result is valid *for any kind of input*. To take a simple example, 
the result of the unverified register allocation algorithm is formally 
checked. Therefore they cannot have any errors in register allocation. 
By structure.

Of course, it is your right to be sceptical about such claims (it is a 
sane attitude). That's why CompCert authors are putting effort to gather 
all the fine prints (assumed axioms, how the compiler and its proof are 
structured, assumptions put on Coq or OCaml, etc.) of the claimed proved 
properties and put them in front of certification authorities (in 
DO178B/C context if I'm correct), the objective being to convince them 
that the claimed properties are indeed rightly ensured.

Sincerely yours,
D. Mentré