[SystemSafety] Machine models under formal methods

Daniel Kästner kaestner at absint.com
Fri Nov 10 16:30:55 CET 2017


I completely agree that on some processors it’s impossible to give any kind 
of reasonable timing guarantee.

However, there are timing-predictable COTS processors; even in case of 
multi-core architectures, there are some with limited interference between 
the cores (due to shared buses, prefetch buffers, etc.) where the 
interference effects can be bounded by reasonable limits (e.g. PowerPC 
MPC5777M, or AURIX TC275). (Regarding the notion of timing predictability, cf. 
http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.172.4533&rank=1)

Abstract interpretation-based formal WCET analysis is based on an abstract 
model of the processor, which is typically developed by the tool provider. 
Obviously these models have to be validated with respect to the hardware. To 
this end what we’re doing is trace-based validation; in a nutshell the 
analysis result is not only a WCET number but a prediction graph which 
contains a superset of observable hardware events, down to a quite detailed 
level, e.g. mispredicted branches, cache hits/misses, etc. All observed 
traces must be covered by the prediction graph. In contrast to end-to-end 
measurements or instrumentation-based timing tracing, this approach makes it 
possible to have adequate confidence in reasonable trace coverage even on 
complex processors. Essentially, such abstract models then do not model the 
processor's datasheet specification but the behavior of the physical machine 
(in some cases we have different models for different versions of the same 
core where some silicon bugs have been removed). Based on such a model the 
contribution of formal WCET analysis is a safe upper bound on the worst-case 
execution time on a timing-predictable processor.
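The coverage check described above can be sketched as follows. This is a hypothetical illustration, not AbsInt's actual data structures: the prediction graph is modeled as a nondeterministic automaton over hardware events (branch mispredictions, cache hits/misses, ...), and a measured trace is "covered" if it labels some path from the start node.

```python
def trace_covered(graph, start, trace):
    """Check that an observed event trace is covered by the prediction graph.

    graph: {node: {event: set of successor nodes}} (illustrative encoding)
    """
    frontier = {start}
    for event in trace:
        # Follow every edge labeled with the observed event.
        frontier = {succ
                    for node in frontier
                    for succ in graph.get(node, {}).get(event, ())}
        if not frontier:
            return False  # the model did not predict this event here
    return True

# Toy model: a branch is followed by either a cache hit or a cache miss.
graph = {
    "n0": {"branch": {"n1"}},
    "n1": {"cache_hit": {"n2"}, "cache_miss": {"n2"}},
}

assert trace_covered(graph, "n0", ["branch", "cache_hit"])
assert not trace_covered(graph, "n0", ["branch", "tlb_miss"])
```

If any observed trace fails the check, the abstract model is missing behavior of the physical machine and must be corrected.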

Regarding WCET analysis in general: WCET analysis is concerned with the 
maximal execution time of a sequential piece of code. Preemption effects, 
blocking time, etc. have to be considered as a part of a WCRT (worst-case 
response time) analysis, e.g., RMA, or RTA, which essentially computes the 
worst-case interruption scenario. However, any kind of WCRT analysis always 
assumes the WCET of the tasks to be given as an input, so WCET is a 
prerequisite of WCRT. What static WCET analysis can also contribute are 
preemption costs, e.g. cache-related preemption delays (an interrupting task 
can evict parts of the cache needed by the interrupted task), which then 
also have to be taken into account in the WCRT analysis.
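As a concrete illustration of the WCET/WCRT split, here is the classic response-time analysis (RTA) fixed-point iteration for fixed-priority preemptive scheduling, with a crude per-preemption term standing in for cache-related preemption delay. The task set and the constant-CRPD simplification are made up for this sketch; real CRPD analysis is considerably more refined.

```python
import math

def response_time(tasks, i, crpd=0):
    """Iterate R = C_i + sum_{j in hp(i)} ceil(R / T_j) * (C_j + crpd).

    tasks: list of (C, T) pairs, sorted highest priority first;
           C is the task's WCET (the input RTA assumes to be given),
           T its period (taken as the deadline here).
    crpd:  cache-related preemption delay charged per preemption
           (a constant, as a simplification).
    Returns the worst-case response time, or None if it exceeds the deadline.
    """
    C_i, T_i = tasks[i]
    R = C_i
    while True:
        interference = sum(math.ceil(R / T_j) * (C_j + crpd)
                           for C_j, T_j in tasks[:i])
        R_new = C_i + interference
        if R_new == R:
            return R           # fixed point reached
        if R_new > T_i:
            return None        # deadline exceeded: unschedulable
        R = R_new

tasks = [(1, 4), (2, 6), (3, 12)]   # (WCET, period), highest priority first
print(response_time(tasks, 2))      # prints 10
```

Note how the analysis consumes each task's WCET as an opaque input, and how charging even one cycle of CRPD per preemption can tip a task set from schedulable to unschedulable.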

Best regards,
  Daniel.
---
Dr.-Ing. Daniel Kaestner ----------------------------------------------
AbsInt Angewandte Informatik GmbH      Email: kaestner at AbsInt.com
Science Park 1                         Tel:   +49-681-3836028
66123 Saarbruecken                     Fax:   +49-681-3836020
GERMANY                                WWW:   http://www.AbsInt.com
----------------------------------------------------------------------
Geschaeftsfuehrung: Dr.-Ing. Christian Ferdinand
Eingetragen im Handelsregister des Amtsgerichts Saarbruecken, HRB 11234
From: systemsafety 
[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On behalf of 
Todd Carpenter
Sent: Friday, 10 November 2017 15:40
To: systemsafety at lists.techfak.uni-bielefeld.de
Subject: [SystemSafety] Machine models under formal methods

Correct me if I'm wrong (I'm sure I am), but the "[SystemSafety] Bursting 
the anti formal methods bubble" discussion seems to assume that we have a 
clue about what's going on beneath our software. As a caveat, I agree with 
the premise that if you can't even assure your software using a simple 
machine model (instruction X copies the values stored in memory locations 
to/from registers, instruction Y performs ALU operations on registers, and 
memory values don't change unless you use instructions X or Y), then you 
can't hope to know what your software does on a complex machine.
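The "simple machine model" in the parenthesis above can be sketched in a few lines. Everything here (instruction names, the program encoding) is invented for illustration: one instruction class moves values between memory and registers, one performs ALU operations on registers, and memory changes through no other means.

```python
import operator

def run(program, memory, registers):
    """Interpret a toy program over the simple machine model."""
    for instr in program:
        op, *args = instr
        if op == "LOAD":           # "instruction X": memory -> register
            reg, addr = args
            registers[reg] = memory[addr]
        elif op == "STORE":        # "instruction X": register -> memory
            addr, reg = args
            memory[addr] = registers[reg]
        elif op == "ALU":          # "instruction Y": registers -> register
            fn, dst, a, b = args
            registers[dst] = fn(registers[a], registers[b])
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory, registers

mem = {0: 5, 1: 7, 2: 0}
regs = {}
run([("LOAD", "r0", 0), ("LOAD", "r1", 1),
     ("ALU", operator.add, "r2", "r0", "r1"),
     ("STORE", 2, "r2")], mem, regs)
print(mem[2])   # prints 12
```

The point of the argument is exactly that real hardware does not honor the last invariant: microcode, management engines, and DMA-capable peripherals can all change memory without any X or Y instruction being executed.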

Unfortunately, we no longer have simple machines. I've heard claims from 
microkernel, RTOS, and secure OS vendors that they verify their code (using 
a variety of means, including formal proofs, analysis, etc.) against 
instruction set models. But those instruction sets are often backed by 
microcode, which was not developed under formal methods. Worse, on some 
processors, that microcode is mutable, and loaded at run-time. Furthermore, 
there are all sorts of other things going on under your simple ALU, 
including Turing complete MMUs, IOMMUs, and DMA engines.

If you want to be entertained by what the vendors are doing to you, check 
out the following recent articles about Intel's Management Engine back door 
that they baked into their chipsets. Sorry, it's not a back door, it is a 
"management portal." Whatever. These portals are on endpoint machines. They 
even put support for this capability in ATOM SoCs. These management engines 
can arbitrarily access memory. (Oh, wait, Intel doesn't do evil. It's not 
arbitrary... Trust them.) Server class machines have the equivalent 
"Baseboard management controllers" which can also interact with memory. All 
the BMCs we've studied use old versions of BusyBox, and are riddled with 
opportunities. Think about how that impacts the argument that starts with, 
"assuming a solid foundation, we can formally show that this program does 
X."

    http://www.theregister.co.uk/2017/10/21/purism_cleanses_laptops_of_intel_management_engine/
    https://puri.sm/posts/deep-dive-into-intel-me-disablement/
    http://www.zdnet.com/article/minix-intels-hidden-in-chip-operating-system/
    http://www.theregister.co.uk/2017/11/09/chipzilla_come_closer_closer_listen_dump_ime/

(My apologies to the academics: these are popular press articles rather than 
peer reviewed articles. Several of the articles point to the real thing.)

So now let's assume that we develop processors, MMUs, caches, and inter-core 
networks that are well behaved. It's possible. What happens when that 
machine suffers Single Event Errors?

Note that I am NOT arguing against formal methods in the slightest. I am 
merely pointing out that unless someone has a good model of the machine, it 
might not be sufficient.