[SystemSafety] Collected stopgap measures

Steve Tockey Steve.Tockey at construx.com
Fri Nov 16 13:41:13 CET 2018


Martyn Thomas wrote:

“I interpreted Paul's emails as a request for help because he would like to be able to argue for better software engineering but finds himself frustrated by a software industry that mostly does not use rigorous engineering and gets away with it.”

I think the industry has mostly been able to get away with it so far. The frustration in the user base is increasing. It has to reach a tipping point before too long. Software systems are becoming more and more critical every day, and some highly paid amateur programmer is going to screw something up that either harms or kills a bunch of people. Then we won’t be able to get away with it any more.



I am happy to help, but it does seem like my offers to help generally fall on deaf ears. . .


“I would like the discussion to focus on what we might be able to do to radically improve software engineering standards across industry, when those companies that do follow a professional engineering design process find themselves regularly underbid by less professional competitors who rely on being able to persuade the customer to extend the project budgets and timescales when the absence of documented, complete and consistent requirements makes that necessary.”

Agreed. And I also offer a solution.

The whole reason the code exists in the first place is so that it can (automatically) enforce some set of policies and carry out some set of processes (a sketch of how a few such policies might look in code follows the two lists below). In banking, for example, policies would include things like:
• Each savings account must have a balance
• Each savings account must have an overdraft limit
• The status of each savings account can only be normal or overdrawn
• Each savings account must be owned by at least one customer
• A customer must own at least one bank account, but may own many
• A customer’s date of birth cannot be later than today
• …

Banking software would involve processes like:
• Create a customer
• Open a savings account
• Deposit money into a savings account
• See how much money is in a savings account
• Withdraw money from a savings account
• Transfer money from one savings account to another
• Close a savings account
• …
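
To give a flavour of what capturing such policies looks like, here is a minimal sketch in code. To be clear: Java is used only for familiarity, every name (SavingsAccount, Customer, and so on) is invented for this example, and, as the next paragraph stresses, the real specification is technology-independent.

    import java.util.HashSet;
    import java.util.Set;

    // Illustrative sketch only: a few of the savings-account policies above,
    // written down as a (deliberately concrete) data model. All names invented.
    class SavingsAccount {
        enum Status { NORMAL, OVERDRAWN }   // policy: status is normal or overdrawn

        private long balanceInCents;        // policy: each account has a balance
        private final long overdraftLimit;  // policy: each account has an overdraft limit
        private Status status = Status.NORMAL;
        private final Set<Customer> owners = new HashSet<>();

        // Policy: each account must be owned by at least one customer.
        SavingsAccount(long overdraftLimit, Customer firstOwner) {
            this.overdraftLimit = overdraftLimit;
            this.owners.add(firstOwner);
        }
    }

    class Customer {
        // Policy: a customer may own many accounts; "at least one" would be
        // enforced by the "Create a customer" process, not shown here.
        private final Set<SavingsAccount> accounts = new HashSet<>();
    }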

These policy and process semantics are completely independent of the implementation technology (i.e., the programming language). Note that there are no bits, no bytes, no threads, no function calls, . . . in those policy and process semantics. One may correctly interpret these policy and process semantics as “the functional requirements”.

If the developers hope to successfully automate some set of policy and process semantics for the benefit of some user base, then those developers need to understand the policy and process semantics at least as well as, if not better than, the business domain experts do. This leads to the need for a precise, concise specification of policy and process semantics. Les Chambers has already stated here how he has been highly successful in validating process semantics using finite state machines. I agree that finite state machines are necessary, but in my experience they are not sufficient in the general case. In the general case, we also need a precise, concise specification of policy semantics in the form of a (purely abstract, logical) “data model”.
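
As a hedged illustration of the finite-state-machine idea, the status policy above (“normal or overdrawn”) can be expressed as a two-state machine whose events are the deposit and withdraw processes. The Java below is again only a sketch; the names and the cents-based representation are mine.

    // Sketch of a finite state machine for savings-account status.
    // The two states come from the policy list above; the rest is invented.
    public class AccountStatusMachine {
        enum State { NORMAL, OVERDRAWN }

        private State state = State.NORMAL;
        private long balanceInCents = 0;

        // Event: "Deposit money into a savings account".
        void deposit(long cents) {
            balanceInCents += cents;
            if (balanceInCents >= 0) state = State.NORMAL;  // may leave OVERDRAWN
        }

        // Event: "Withdraw money from a savings account".
        void withdraw(long cents) {
            balanceInCents -= cents;
            if (balanceInCents < 0) state = State.OVERDRAWN;
        }
    }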


Now, it turns out that we can also capture, concisely and precisely, the policy and process semantics of the programming language itself. For example, policies of Java include (a short demonstration follows the list):
• An entity type is either a primitive type or a class
• A (sub-) class may extend at most one (super-) class
• A (super-) class may be extended by zero to many (sub-) classes
• A class may be final or not
• A class is implemented by zero to many members
• A member may not exist outside of a class
• A member is either a variable or an operation
• A member has a name
• A member has an accessibility (public, protected, package-private, or private)
• A member may be static or not
• An operation is implemented by zero to many statements
• Every statement is in the implementation of exactly one operation
• Every member has exactly one declared entity type
• A statement is either: assignment, if, for, while, switch/case, try, … return, or a block
• A block contains zero to many statements
• An ‘if’ statement must have a ‘then’ clause but does not need to have an ‘else’ clause
• …
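
Several of these policies can even be observed at run time through Java's own reflection API. The short demonstration below inspects java.lang.String (chosen arbitrarily) to show the single superclass, finality, and member modifiers.

    import java.lang.reflect.Field;
    import java.lang.reflect.Modifier;

    // Demonstration: a few of Java's policy semantics, observed via reflection.
    public class JavaPolicyDemo {
        public static void main(String[] args) {
            Class<?> c = String.class;

            // Policy: a (sub-) class extends at most one (super-) class,
            // which is why getSuperclass() returns a single Class (or null).
            System.out.println("superclass: " + c.getSuperclass());

            // Policy: a class may be final or not.
            System.out.println("final: " + Modifier.isFinal(c.getModifiers()));

            // Policy: every member has a name, an accessibility, may be
            // static or not, and has exactly one declared entity type.
            for (Field f : c.getDeclaredFields()) {
                int m = f.getModifiers();
                System.out.printf("field %s: private=%b static=%b type=%s%n",
                        f.getName(), Modifier.isPrivate(m),
                        Modifier.isStatic(m), f.getType().getSimpleName());
            }
        }
    }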

The process semantics of Java include (again, a short demonstration follows the list):
• The result of an assignment statement is to …
• The order of operations in evaluating an expression is …
• An integer expression that divides by 0 throws an ArithmeticException, while a floating-point expression that divides by 0.0 yields Infinity (or NaN, for 0.0/0.0)
• Dereferencing a reference (a pointer) whose value is null results in a NullPointerException being thrown
• When executing a statement that includes ‘super.’, the most immediate superclass’ definition is used
• The ‘then’ clause on an ‘if’ statement will only be executed when the logical expression evaluates to ‘true’
• …
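
These process semantics are directly observable as well. A minimal demonstration (class name invented) of the division and null-dereference rules just listed:

    // Demonstration: a few of Java's process semantics.
    public class JavaProcessDemo {
        public static void main(String[] args) {
            // Integer division by zero throws ArithmeticException...
            int numerator = 1, denominator = 0;
            try {
                System.out.println(numerator / denominator);  // never completes
            } catch (ArithmeticException e) {
                System.out.println("int 1/0 -> " + e);
            }

            // ...while floating-point division by zero yields Infinity,
            // and 0.0/0.0 yields NaN.
            System.out.println("1.0/0.0 = " + (1.0 / 0.0));
            System.out.println("0.0/0.0 = " + (0.0 / 0.0));

            // Dereferencing a null reference throws NullPointerException.
            try {
                String s = null;
                System.out.println(s.length());               // never completes
            } catch (NullPointerException e) {
                System.out.println("null dereference -> " + e);
            }
        }
    }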


Further, it can be shown that the code a developer writes needs to be a mapping (in the set theory sense) from the policy and process semantics the users want automated onto the policy and process semantics of the programming language. That mapping has to satisfy three key properties (illustrated in the sketch after this list):
• It must be sufficiently complete: everything in the user policy and process semantics that the users want automated needs to have been mapped
• The mapping has to preserve the user policy and process semantics
• The mapping also has to satisfy the non-functional requirements
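
To make the mapping idea concrete with the earlier banking example, and under an assumed reading of the overdraft-limit policy (that the balance may never fall below the negative of that limit), the “withdraw money” process might map onto Java as sketched below. The guard in the middle of the method is the code-level image of the user policy; all names are invented.

    // Sketch of mapping one user process plus one user policy onto Java.
    // Assumed policy (my reading): balance never goes below -overdraftLimitInCents.
    class SavingsAccountMapping {
        private long balanceInCents = 0;
        private final long overdraftLimitInCents;

        SavingsAccountMapping(long overdraftLimitInCents) {
            this.overdraftLimitInCents = overdraftLimitInCents;
        }

        // "Withdraw money from a savings account", preserving the policy:
        // the guard below is where the user semantics survive the mapping.
        boolean withdraw(long cents) {
            if (cents <= 0 || balanceInCents - cents < -overdraftLimitInCents) {
                return false;  // the withdrawal would violate the policy
            }
            balanceInCents -= cents;
            return true;
        }
    }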


The industry uses the (way too cute) term “bug” when the term “defect” is far more appropriate. But, in fact, we need to realize that these so-called bugs are nothing more and nothing less than semantic inconsistencies between the policy and process semantics that the users want and the policy and process semantics the code actually delivers. Either the mapping expressed in the code does not include things the users wanted automated, or the users’ policy and process semantics have not been preserved (or one or more non-functional requirements have not been satisfied).
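
In those terms a defect is easy to exhibit. If the withdraw method in the sketch above were written without its guard, as below, the code would still compile and run, yet its semantics would no longer be consistent with the users’ (assumed) overdraft policy. That inconsistency, and nothing else, is the “bug”.

    // Defective mapping of the same assumed overdraft policy: the guard is
    // gone, so nothing stops the balance falling below the overdraft limit.
    class DefectiveSavingsAccount {
        private long balanceInCents = 0;
        private final long overdraftLimitInCents;

        DefectiveSavingsAccount(long overdraftLimitInCents) {
            this.overdraftLimitInCents = overdraftLimitInCents;
        }

        void withdraw(long cents) {
            balanceInCents -= cents;  // users' policy not preserved: a defect
        }
    }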

Much more detail on all of this:
• Examples of how code needs to be a mapping of user policy and process semantics onto technology semantics
• How to create and validate precise, concise specifications of user policy and process semantics
• How to preserve user policy and process semantics when mapping to technology semantics
• How to scale this up to systems involving millions of lines of code
• How economic decision-making (a key part of true engineering) factors in
• Numerous examples
• . . .
are all contained in the manuscript for a new book titled “How to Engineer Software”. The manuscript is available on Dropbox at:

https://www.dropbox.com/sh/jjjwmr3cpt4wgfc/AACSFjYD2p3PvcFzwFlb3S9Qa?dl=0


The benefits of developing software this way are consistent:
• The software gets delivered in about half the time normally expected for that scope of functionality
• The cost of delivering the software is also cut by about half
• The users encounter no more than one tenth of the number of defects they would normally encounter, and usually even fewer
• The long-term maintenance costs of the software drop by somewhere between a factor of 4 and a factor of 8


I would propose that the question of how to professionally engineer software has already been answered. We just need to get developers to start doing it.


— steve




From: systemsafety <systemsafety-bounces at lists.techfak.uni-bielefeld.de> on behalf of Martyn Thomas <martyn at thomas-associates.co.uk>
Date: Friday, November 16, 2018 at 2:27 AM
To: "systemsafety at lists.techfak.uni-bielefeld.de" <systemsafety at lists.techfak.uni-bielefeld.de>
Subject: Re: [SystemSafety] Collected stopgap measures


I think this discussion is missing the point.

To summarise: Paul Sherwood observed that most successful software lacked the basic requirements of a professional engineering design process, specifically documented requirements or documented design. He also said that in his opinion this was not the right way to develop software, especially for safety functions. He further observed that some safety-related software incorporates components that lack documented requirements or documented design. I agree with these statements.

I interpreted Paul's emails as a request for help because he would like to be able to argue for better software engineering but finds himself frustrated by a software industry that mostly does not use rigorous engineering and gets away with it.

In response he received a measure of abuse and quotations from international standards that are known to be flawed, rather than a reasoned discussion of the issues that he had raised.

I would like the discussion to focus on what we might be able to do to radically improve software engineering standards across industry, when those companies that do follow a professional engineering design process find themselves regularly underbid by less professional competitors who rely on being able to persuade the customer to extend the project budgets and timescales when the absence of documented, complete and consistent requirements makes that necessary.

Martyn



On 16/11/2018 08:41, Peter Bernard Ladkin wrote:

I have just come back from a meeting of the 61508 MTs in Grenoble and this feels to me like a
parallel universe.

On 2018-11-16 02:42, Paul Sherwood wrote:

>> For the software only properties, it's obvious that we DO NOT need
>> documented requirements, or documented design. Software is often (almost
>> always, these days, in agileworld?) successfully evolved and consumed
>> without either of these.
>
> ... but I still stand by this statement.


IEC 61508 and (as far as I am aware) ISO 26262 require there to be a software safety requirements
specification. In IEC 61508 there is a whole subclause, 7.2, specifying it, which is 3.5pp long.

So are we talking about so-called "safety" applications which, through some magic, do not have to
conform to applicable safety standards? Or are we talking cowboy developers who claim they are
producing software for "safety" applications but in fact aren't?

The people who commission, install and operate safety-related systems in any sector except medical,
automotive and aerospace do not, as far as I am aware, commission software from companies which are
not able to produce conformance documentation.

So where are all these software engineers producing software for safety applications who don't
produce documentation?

I can all but guarantee they are not producing software for market-leading safety-related systems
developers and integrators, because all of those of which I am aware require adherence to the
applicable safety standards, otherwise one accident and the lawyers will force them to close up shop
(and in the UK the Board would have to work hard to stay out of jail).



> AFAIK there were never any a-priori requirements or architecture for:
>
> - linux kernel
> - openssh
> - gcc
> - llvm
> - python
>
> ... or most of the software that Google runs internally (i'm sure others
> can provide many additional examples).
>
> The fact that such software exists and is widely relied upon and trusted
> is enough to justify the statement.


No, it is evidently not. It is enough to justify the statement that relatively reliable software has
been developed for some applications without documented requirements or documented design. It does
not follow from that that all software for all applications can be developed without.... The fact
that I don't need a map to travel around Bielefeld doesn't mean I don't need maps for other places.



> I can't see how anyone could claim to have engineered a system for safety
> or security without stating what losses/hazards/threats that aim to
> address (requirements) and how the solution is supposed to be achieved
> (architecture). But these are system properties etc etc.


I can't parse the first sentence, but you are right that a risk analysis is required, and safety
requirements based on this risk analysis must be formulated, and the software design must be
accompanied by documentation showing how the software safety requirements are met. None of these are
"system properties etc etc". They are documentation.



> And yet I keep on encountering supposedly expert safety folks who are
> happy to claim things like "with this 'safe' hypervisor you can run
> untrusted code in an internet-facing guest alongside safety critical
> functions."


It is not at all clear what you mean here by "expert safety folks". Lots of people want to talk
about E/E/PE system safety, but that doesn't mean they are expert. I have found that a handy rule of
thumb is to ask a question involving "E/E/PE" to see if they know what it means.

I once saw an advertisement for a conference on safety of software, with some moderately well-known
computer-science theoreticians on the program committee - and not a single person recognised in the
"safety community". So I mailed one of these distinguished people to ask how she could help
organise a conference on safety without a safety expert. What would they be discussing? She evaded
the question, referring me to the committee chairman. I imagine it was another bunch of people
wanting to talk to each other about reliability of such systems - the usual confusion. (Not that it
is at all bad to talk about reliability!)

PBL

Prof. Peter Bernard Ladkin, Bielefeld, Germany
MoreInCommon
Je suis Charlie
Tel+msg +49 (0)521 880 7319  www.rvs-bi.de


_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE
Manage your subscription: https://lists.techfak.uni-bielefeld.de/mailman/listinfo/systemsafety

