[SystemSafety] Software reliability (or whatever you would prefer to call it)

Daniel Kästner kaestner at absint.com
Tue Mar 10 10:20:49 CET 2015


Well, worst-case execution time is a topic in itself.

In this area it has to be understood that meeting real-time deadlines 
imposes constraints not only on the software architecture, but also on the 
hardware architecture. There are some architectures for which it is not 
possible to give worst-case execution time guarantees, agreed. However, my 
conclusion is a different one: I doubt that the right approach is to choose 
an unpredictable architecture and apply stochastic methods, rather than to 
choose a predictable architecture and prove sound worst-case execution time 
bounds. The latter is possible not only on complex single-core processors, 
but also on predictable multi-core processors such as some PowerPC 
multi-cores (beware: not all of them) or Kalray's processors.

Furthermore, there is no consensus yet about applying multi-core processors 
in a safety-critical context at all, cf. the recent CAST paper about 
multi-core processors:
http://www.faa.gov/aircraft/air_cert/design_approvals/air_software/cast/cast_papers/media/cast-32.pdf

Daniel.

Dr.-Ing. Daniel Kaestner ----------------------------------------------
AbsInt Angewandte Informatik GmbH      Email: kaestner at AbsInt.com
Science Park 1                         Tel:   +49-681-3836028
66123 Saarbruecken                     Fax:   +49-681-3836020
GERMANY                                WWW:   http://www.AbsInt.com
----------------------------------------------------------------------
Managing Director: Dr.-Ing. Christian Ferdinand
Registered in the commercial register of the Amtsgericht Saarbruecken, HRB 11234

From: systemsafety-bounces at lists.techfak.uni-bielefeld.de 
[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On behalf of 
Ian Broster
Sent: Tuesday, 10 March 2015 09:50
To: systemsafety at lists.techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Software reliability (or whatever you would 
prefer to call it)


Here's a different view on software reliability and an example.

We know that:

1. We /can/ write software that is very well defined and does not exhibit 
any stochastic behaviour.

2. We /can/ also intentionally (or unintentionally) write software that does 
exhibit unpredictable failure behaviour, which can be characterized using 
statistical techniques (and therefore called stochastic behaviour). You can 
achieve this through the use of random number generators for example. (1)


The challenge, as software grows in size and complexity, is the practical 
difficulty in writing software (like 1) that is so well defined and verified 
that it does not exhibit the stochastic failure behaviour (of 2).

Indeed, at some point in the size/complexity scale, the development and 
verification of fully deterministic software will become a practical 
impossibility and therefore we have little other option than to use some 
statistical metric of confidence that we have achieved the goal of no 
failure.
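
As a rough illustration of what such a statistical metric of confidence can 
look like, here is a minimal Python sketch of the textbook bound for 
failure-free testing. It assumes independent, identically distributed 
demands, which real test campaigns may well not satisfy; the function names 
are hypothetical and chosen only for this example.

```python
import math


def failure_prob_upper_bound(n_failure_free: int, confidence: float) -> float:
    """Upper bound on the per-demand failure probability p after observing
    n failure-free demands, at the given confidence level.

    Assumption: demands are independent and identically distributed.
    The bound solves (1 - p)^n = 1 - confidence for p.
    """
    if n_failure_free <= 0:
        raise ValueError("n_failure_free must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_failure_free)


def demands_needed(target_p: float, confidence: float) -> int:
    """Number of failure-free demands needed to claim p <= target_p
    at the given confidence level (same independence assumption)."""
    if not 0.0 < target_p < 1.0:
        raise ValueError("target_p must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log1p(-target_p))
```

The number of failure-free demands needed grows roughly in inverse 
proportion to the target probability, which is why very small targets are 
hard to support by testing alone.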

One example of this that is gaining traction is the PROXIMA EU project, 
which is specifically focused on software timing for multi-core processors. 
The basic idea is that for very complex hardware/software systems, it is 
beyond practical feasibility to understand the worst-case execution time of 
the software. ("How can you possibly have tested/analysed sufficient inputs, 
initial states, and the impact from other cores to give a bound which is 
both accurate and *practically/economically small enough*?")

The direction in this project is to intentionally produce a system that is 
designed to have stochastic timing behaviour at the low level. By doing so, 
you can then legitimately start to use all kinds of statistical methods that 
are not normally available for a digital system.

Therefore, you have a software computation that has a probability of failing 
to produce its result within its allotted time. However, you also have a 
reliable method of computing that probability, which can be well below the 
oft-quoted 10^-9/hour.
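
To make the idea of computing such a probability concrete, here is a 
minimal Python sketch. It is not PROXIMA's actual method: it assumes that 
measured execution times behave like independent, identically distributed 
samples, takes block maxima, fits a Gumbel distribution by the method of 
moments, and evaluates the probability that a block maximum exceeds a 
deadline. All function names, the block size and the synthetic data are 
assumptions made for illustration.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def block_maxima(samples, block_size):
    """Split samples into consecutive full blocks and return each block's maximum."""
    return [
        max(samples[i:i + block_size])
        for i in range(0, len(samples) - block_size + 1, block_size)
    ]


def fit_gumbel(maxima):
    """Method-of-moments Gumbel fit; returns (location, scale).

    Assumption: the block maxima are i.i.d. and well described by a
    Gumbel distribution. This sketch does not test that assumption.
    """
    scale = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    loc = statistics.fmean(maxima) - EULER_GAMMA * scale
    return loc, scale


def exceedance_probability(deadline, loc, scale):
    """P(block maximum > deadline) = 1 - exp(-exp(-(deadline - loc) / scale))."""
    z = (deadline - loc) / scale
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-math.exp(-z))


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic execution times in microseconds (illustrative only).
    times = [100.0 + abs(rng.gauss(0.0, 5.0)) for _ in range(10_000)]
    loc, scale = fit_gumbel(block_maxima(times, 50))
    for deadline in (120.0, 130.0, 140.0):
        print(deadline, exceedance_probability(deadline, loc, scale))
```

Note that the result is a probability per block of runs, not per hour; 
relating it to a figure such as 10^-9/hour additionally requires the 
activation rate of the task, and the estimate is only as trustworthy as the 
independence assumption behind it.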

Ian

(1) [You could also map a partially testable massive input domain to a 
random-number generator, or consider race conditions driven by apparently 
randomly timed input data and the like].





-- 
Dr Ian Broster, General Manager
Rapita Systems Ltd
Tel: +44 1904 413 945 Mob: +44 7963 469 090

Stay informed by joining the Rapita Systems mailing list 
<http://www.rapitasystems.com/contact/mailing_list>
For real-time verifications issues and discussion, follow the Rapita Systems 
blog <http://www.rapitasystems.com/blog>
