[SystemSafety] Component Reliability and System Safety

Peter Bernard Ladkin ladkin at causalis.com
Mon Sep 17 14:19:31 CEST 2018


On 2018-09-17 11:06 , Paul Sherwood wrote:
> But software is a very big field. It seems to me that most of the software we are relying on these
> days was developed without following coding standards in general, ....

That may be true in general; I wouldn't know. It is specifically not true for safety-related systems.

IEC 61508-3 Table B.1 entry 1 says that "Use of coding standard to reduce errors" is Highly
Recommended (HR) for all SILs. ("Highly Recommended" is the strongest form of encouragement for any
specific technology in the standard).

If you don't use a coding standard, your assessor is going to want to know why. (Telling him/her you
think they are outmoded is not generally regarded as an acceptable answer.)

> We could insist that the software be developed in Haskell, or Rust, or some other technology that
> provides a higher level of control over the code creation.

Not in developing safety-related systems, you can't. At least, not at the moment. With Haskell, the
need for garbage collection gets in the way of its use (it can interfere with timing constraints in
a non-deterministic way).
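
By way of a toy illustration (a hypothetical periodic control task; the names, numbers and notional
1 ms deadline are made up, and it uses GHC with the standard "time" package): any heap allocation in
the loop can trigger a collection at a moment chosen by the runtime rather than by the programmer,
so the worst observed iteration time is hard to bound in advance.

-- Toy periodic "control step" with a notional 1 ms deadline. The list
-- allocated on each iteration is what eventually forces the garbage
-- collector to run, at a moment the programmer does not control.
import Control.Monad (forM_)
import Data.IORef (newIORef, modifyIORef', readIORef)
import Data.Time.Clock (getCurrentTime, diffUTCTime)  -- from the "time" package

controlStep :: [Double] -> Double
controlStep samples = sum samples / fromIntegral (length samples)

main :: IO ()
main = do
  worst <- newIORef (0 :: Double)
  forM_ [1 .. 10000 :: Int] $ \i -> do
    t0 <- getCurrentTime
    let y = controlStep (map fromIntegral [i .. i + 63])  -- fresh allocation each cycle
    y `seq` return ()
    t1 <- getCurrentTime
    let elapsedMs = realToFrac (diffUTCTime t1 t0) * 1000
    modifyIORef' worst (max elapsedMs)
  w <- readIORef worst
  putStrLn ("worst iteration: " ++ show w ++ " ms against a notional 1 ms deadline")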

> Coding standards can actually be counter-productive, for example if
> ......
> - they are used when they shouldn't be
> 
> This ..... is exactly the reason for my original question.

Since the general E/E/PE functional safety standard highly recommends ("HR"s) the use of coding
standards, what could be the scope of "used when they shouldn't be"?

> dependable != safe

Thank you.
>>  I would go further - it is important for any system
>> which is not deliberately built to subvert the purposes of the client.
> 
> Sorry, I don't understand this comment at all.

I was referring to malware.

>> A question. What important safety properties of a bicycle are *not*
>> reducible to component reliability?
> 
> For simple systems, where the safety mechanisms are expressly mechanical, reliability obviously
> matters.

Yes. The question was genuine. I couldn't think of a common safety property of a bicycle which is
not ensured by a subsystem/component.

> But for **safety** of complex systems, I'm guessing that current best practice must involve
> designing-in safety from multiple directions, with failsafes, redundancy and/or similar?

Current best practice involves following the strictures of IEC 61508 or its so-called "derivatives"
for non-aerospace, non-medical systems; ISO 26262 for road vehicles; and DO-178C/ED-12C for critical
aerospace code, including the ED-217 and ED-218 supplements. (I have forgotten the numbers of the
medical-systems E/E/PE safety standards. They come largely from IEC TC 62 and 66, I think.)

> Presumably the architectural level safety considerations must include the **expectation of failure
> in components**, and lead to designs which mitigate against expected (bound to happen) failures, to
> satisfy safety goals?

This is too vague for me to judge. A brief review of the philosophy behind IEC 61508 may be found at
https://rvs-bi.de/publications/books/ComputerSafetyBook/12-Kapitel_12.pdf (I wrote it before I got
involved in the maintenance of IEC 61508). The civil aerospace airworthiness requirements specify
certain ultrahigh reliability requirements for individual components. Components here are bits of
airplanes which do things, not SW modules.
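
For a sense of scale: catastrophic failure conditions are required to be extremely improbable, on
the order of 10^-9 per flight hour. A fleet of 1,000 aircraft each flying 3,000 hours a year
accumulates 3 x 10^6 flight hours annually, so at that rate one expects about
3 x 10^6 x 10^-9 = 0.003 such events per year, i.e. roughly one in 300 years of operation of that
whole fleet.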

> If our safety depends on the reliable behaviour of even a small program on (say) a modern multi-core
> microprocessor interacting with other pieces of software in other devices, I think "we are lost" again.

It does, and it will do for the foreseeable future.

> I'm worrying about autonomous vehicles and other systems of similar complexity. As I understand it
> most of the software in these systems won't even be written in C, let alone following MISRA C rules.

I wouldn't know.

>> However, when prefaced with "dear fellow safety professionals", one
>> might consider them banal.
> 
> I'm not a "safety professional".

Ah, OK. Since it costs a four-figure sum of money, I guess that also means you don't have a copy of
IEC 61508 to hand. You also may not know the analysis methods required to be used for safety-related
systems.

You may also not know what safety-related systems look like. Lots of little and mid-sized boxes
plugged together would not be an inappropriate picture.

> However I am relatively experienced in large scale software, and (as you can see) I'm struggling to
> understand how 'safety professionals' can advocate the application of principles from mechanical
> reliability engineering, plus "things we learned on microcontroller-scale projects several decades
> ago" to complex software-intensive systems in 2018.

a. Because the standard requires it.
b. Because it largely works better than purely winging it.
c. Safety-related systems don't have a lot of "large scale software" in them. They have a lot of
relatively-small-scale software and firmware running on dedicated devices and a bunch of
configuration. (I say "relatively small scale" - even jet engine control software, which has pretty
straightforward tasks to perform, can end up being megabytes in size, an 8-figure count of bytes. I
can't find an easy reference. Other people on this list can be more specific.)

>> Similarly, those who have never flown an airplane may wonder why
>> checklists are used for
>> configuration for key phases of flight such as landing. Once you have
>> flown an airplane and learned
>> a little of what happens to others who fly, it becomes banal.
> 
> Fair point, but a little off topic imo.

I don't think so. You might be surprised how useful checklists are in complex installations,
although everyone likes to decry them.

>> Also consider the project
>> mentioned above which exercised the Sun HW. The aerospace manufacturer
>> paid (lots) for bespoke
>> analysis. There was a reason for that.
> 
> I'm sorry but I remain unconvinced that the lessons from the 90s are still relevant.

From the 80's.

Let us go back even twenty years earlier, to the 60's, when strong data typing was first explicitly
characterised and used in Algol.

I have observed (for many years) that well over 80% of the vulnerabilities noted by CERT in its
first five years of existence would have been avoided if people had been programming in languages
which enforce strong typing.

I just came across another couple of examples from ICS-CERT, among them ICSA-18-254-05:
https://ics-cert.us-cert.gov/advisories/ICSA-18-254-05 . You can apparently brick this "industrial
Ethernet switch" by giving it "abnormal" input. A colleague observed that many or most of the CERT
advisories still concern phenomena that could have been avoided through the use of (rigorous,
reliable) strong typing.
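
To make that concrete, here is a toy sketch in Haskell of a length-prefixed frame parser. The frame
format, field names and 24-port limit are invented for illustration (they are not the switch's
actual protocol): abnormal input is rejected at the parsing boundary with a typed error, and the
rest of the program can only ever see a well-formed Frame value, so there is no out-of-bounds access
or half-parsed state for the bad input to exploit.

-- Toy length-prefixed frame: [port][length][payload...]. Format assumed.
import qualified Data.ByteString as BS
import Data.Word (Word8)

newtype PortId = PortId Word8 deriving Show    -- only obtainable via mkPortId

data Frame = Frame { framePort :: PortId, framePayload :: BS.ByteString }
  deriving Show

data ParseError = TooShort | BadPort Word8 | LengthMismatch Int Int
  deriving Show

mkPortId :: Word8 -> Either ParseError PortId
mkPortId p
  | p >= 1 && p <= 24 = Right (PortId p)       -- assumed 24-port device
  | otherwise         = Left (BadPort p)

parseFrame :: BS.ByteString -> Either ParseError Frame
parseFrame bs
  | BS.length bs < 2 = Left TooShort
  | otherwise =
      let p       = BS.index bs 0
          len     = fromIntegral (BS.index bs 1)
          payload = BS.drop 2 bs
      in if BS.length payload /= len
           then Left (LengthMismatch len (BS.length payload))
           else do port <- mkPortId p
                   Right (Frame port payload)

main :: IO ()
main = do
  print (parseFrame (BS.pack [3, 2, 0xAB, 0xCD]))  -- well-formed frame
  print (parseFrame (BS.pack [99, 200]))           -- "abnormal" input: typed error, no crash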

> I've seen the reboot screens of infotainment systems on several commercial aeroplanes - generally a
> version of u-boot and a Red Hat Linux from some decades prior to the time of the crash/reboot. I'm
> hoping that these systems are not connected to the same network as the instrumentation and
> controllers, 

In the Boeing 787 series, they are. In all other commercial aircraft AFAIK, they are not.

> but also I'm wondering how safety is assured when
> 
> a) passengers have been told for years not to use personal devices, 'because safety' (probably
> nonsense, I know)
> b) some planes now expressly provide internet facilities for passenger devices

Concerning a), many commercial airplanes in use twenty years ago had not been specifically evaluated
against, and qualified for, resistance to EM fields inside the Faraday cage of a strength easily
generated by consumer electronics (in particular malfunctioning consumer electronics). Nowadays,
they are.

> In automotive I know that some user-facing (and even internet-facing) systems *do* sit on the CAN
> bus, alongside multiple subsystems/components which are (presumed safe because they were) developed
> in accordance with MISRA C guidelines.

*The* CAN bus?

PBL

Prof. Peter Bernard Ladkin, Bielefeld, Germany
MoreInCommon
Je suis Charlie
Tel+msg +49 (0)521 880 7319  www.rvs-bi.de




