[SystemSafety] Another question

Fri Sep 21 00:25:47 CEST 2018

In response to my statement:

“The clear alternative is to replace non-value-added work with value-added
work.”

Paul Sherwood wrote:

“This 'alternative' has never been clear on any real-scale project I’ve
encountered in my whole career.”

See, that’s where we are different. I have used exactly such an
alternative on several real-scale projects I have encountered in my
career. Real-world projects involving up to 350 developers.

I have to give a requisite nod to Derek Jones when I say this because the
data I am citing below is being recalled from my on-project experience
rather than “fully documented, independently validated, . . .” metrics
data. But since I was there, I trust the data. I hope that you and others
can trust it as well.

That said, I do not do requirements work the way most software projects do
it. I also do not do design work the way most projects do it. The key is
to realize that the reason the code exists in the first place is to
automate enforcement of some collection of “policies” and execution of
some collection of processes. Policy and process semantics, to be precise.
In the end, the code ends up being a set-mapping from stakeholder policy
and process semantics onto programming language semantics.

When that mapping goes wrong, most people call the results bugs. I call
them defects. They are really nothing more and nothing less than semantic
inconsistencies between the stakeholder-desired semantics and the
as-implemented semantics.

So the way I develop software is to focus first on precisely specifying
(as, essentially, blueprints) the stakeholder’s desired policy and process
semantics, completely independent of any and all computing technology.

We then validate the policy & process semantics with the stakeholders.

The next step is to decide on general mappings from stakeholder semantics
to language semantics, i.e., design. Some prototyping can easily be
involved here to investigate alternatives.

Once the general (“design-level”) mappings have been determined, the next
step is to determine the detailed mappings, i.e., the actual coding.

Along the way we hold formal inspections of the stakeholder semantics
blueprints and the various “design” documentation.

In such real-world projects, including one with as many as 350 developers
building the Mission Systems software for the Boeing P-8 Poseidon, an
entirely reasonable estimate of total project rework is under 10%. This is
calculated by comparing the total amount of time spent creating the
various deliverable documents to the time spent correcting those same
deliverables after the formal inspections. If 10 work-days were invested
in creating a section of a stakeholder semantic specification, and less
than one work-day was spent on the formal inspection and the rework caused
by defects identified in that formal inspection, then the rework ratio
(R%) can’t be any higher than 10%.
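
The rework-ratio arithmetic above can be written out as a short sketch.
This is only an illustration: the function name and its parameters are
hypothetical, and the 10 and 1 work-day figures are the example numbers
from the paragraph above, not project records.

```python
def rework_ratio(creation_days: float, inspection_and_rework_days: float) -> float:
    """Inspection-plus-rework effort as a fraction of creation effort (R%)."""
    if creation_days <= 0:
        raise ValueError("creation_days must be positive")
    return inspection_and_rework_days / creation_days


# Example from the text: 10 work-days to create the section, at most
# 1 work-day for the formal inspection and the rework it triggered.
upper_bound = rework_ratio(creation_days=10.0, inspection_and_rework_days=1.0)
print(f"R% is at most {upper_bound:.0%}")
```

Because the one work-day is an upper limit on inspection plus rework, the
10% that comes out is a ceiling on R%, not a point estimate.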

Over many projects (B-767 Engine Sim Automated Test Equipment, B-777
Automated Test Equipment, B-787 Automated Test Equipment, . . .) that
exact pattern holds: Technical work involved in creating a project
deliverable takes X time. The formal inspection and rework of defects
identified in that formal inspection take no more than 0.1 * X time.

These projects end up taking about half of the time people thought the
projects would take. These projects deliver software to stakeholders who
report user-encountered defect rates of around 0.3 defects per 1000 lines
of code. 1 to 3 defects per 1000 lines of code is a pretty good
field-reported defect rate for a commercial product. Some have stated
expectations of as many as 15 to 17 defects per 1000 lines of code for
real-time / embedded software. B-767 ES ATE, B-777 ATE, B-787 ATE, and P-8
Mission Systems are significantly real-time and mostly embedded. So note
that we have built complex real-time embedded software, where the expected
user-reported defect rates are much higher than for commercial products,
and we deliver at defect rates significantly lower than those same
commercial products.
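
For readers who want to redo the comparison with their own numbers, here
is a minimal sketch of the defect-density calculation. The function name
is hypothetical, and the defect count and code size in the example are
made-up values chosen only to reproduce the 0.3 figure; the two reference
ranges are the ones quoted above.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """User-reported defects per 1000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# Reference ranges quoted in the text (defects per 1000 lines of code).
COMMERCIAL_RANGE = (1.0, 3.0)
EMBEDDED_EXPECTATION = (15.0, 17.0)

# Hypothetical illustration: 30 field-reported defects in 100,000 lines.
density = defects_per_kloc(defects=30, lines_of_code=100_000)
print(f"density: {density:.1f} per KLOC")
print("below commercial range:", density < COMMERCIAL_RANGE[0])
print("below embedded expectation:", density < EMBEDDED_EXPECTATION[0])
```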

The approach of precise semantic specification and maintaining semantic
consistency in design and code prevents whole categories of defects from
ever happening in the first place. And, of course, defects clearly
happened on all of those projects. But the vast majority of the defects
that did happen were found in the formal inspections. We ran “defect
removal effectiveness” calculations on the results of the B-777 ATE
project and found our inspections were averaging around 95% defect-find
effectiveness.
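
The message does not spell out how the “defect removal effectiveness”
calculation was done. A minimal sketch, assuming the common definition
(defects found by the inspections divided by all defects eventually found,
including those that escaped to later stages or to the field), might look
like this. The function name and the counts are hypothetical.

```python
def defect_removal_effectiveness(found_by_inspection: int, found_later: int) -> float:
    """Fraction of all known defects that the inspections caught.

    Assumed definition: found_by_inspection / (found_by_inspection + found_later).
    """
    total = found_by_inspection + found_later
    if total == 0:
        raise ValueError("no defects recorded")
    return found_by_inspection / total


# Hypothetical counts chosen to illustrate a 95% result.
dre = defect_removal_effectiveness(found_by_inspection=95, found_later=5)
print(f"defect removal effectiveness: {dre:.0%}")
```

Note that this figure can only be computed after the fact, once the
defects that escaped inspection have had time to surface.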

This approach clearly replaces the on-the-order-of 60% non-value-added
rework with on-the-order-of 10% non-value-added rework, thereby increasing
value-added work from 40% to 90%.
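
As a rough cross-check of these percentages, the sketch below computes the
value-added share for each rework level. It also estimates the effect on
total effort under one added assumption that is not in the message: the
amount of value-added work is the same in both cases, so only the rework
on top of it changes. The function names are hypothetical.

```python
def value_added_share(rework_share: float) -> float:
    """Share of total effort left for value-added work."""
    if not 0.0 <= rework_share < 1.0:
        raise ValueError("rework_share must be in [0, 1)")
    return 1.0 - rework_share


def relative_total_effort(old_rework: float, new_rework: float) -> float:
    """New total effort / old total effort.

    ASSUMPTION: the value-added work V is identical in both cases, so
    total = V / value_added_share(rework_share).
    """
    return value_added_share(old_rework) / value_added_share(new_rework)


print(f"value-added: {value_added_share(0.60):.0%} -> {value_added_share(0.10):.0%}")
print(f"relative total effort: {relative_total_effort(0.60, 0.10):.2f}")
```

Under that assumption the total effort comes out at roughly 0.44 of the
original, which is at least consistent with the “about half of the time”
observation earlier in this message.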

It can happen. It has happened on several very real software projects.

It just can’t happen on your project unless you abandon the chaotic
mainstream software development processes and replace them with processes
that focus on doing as much of the work as possible as right as possible
the first time, and focus on finding & removing the inevitable defects as
quickly as possible.

— steve

-----Original Message-----
From: Paul Sherwood <paul.sherwood at codethink.co.uk>
Date: Thursday, September 20, 2018 at 2:04 PM
To: Steve Tockey <Steve.Tockey at construx.com>
Cc: Derek M Jones <derek at knosof.co.uk>,
"systemsafety at lists.techfak.uni-bielefeld.de"
<systemsafety at lists.techfak.uni-bielefeld.de>
Subject: Re: [SystemSafety] Another question

On 2018-09-20 21:42, Steve Tockey wrote:
> “You cannot claim that just because some factor contributed the largest
> amount, that this was somehow bad.  What were the alternatives?”
> 
> When that one largest factor is rework, yes I can.

Not necessarily. As much as I think the Agile folks are/were snake-oil
salesmen, we can't expect to "get it right first time" for most serious
human endeavours. In fact not even for tiny endeavours... try turning on
a key logger and then replaying your own keystrokes to see how many
errors you make.

Our initial understanding of the requirements ** will be wrong **.

> Rework, in the Deming sense, is waste. It does not add value to the
> product being built or maintained.

Tough. Better add contingency then :)

> Requirements, design, construction and, to an extent, testing work had
> better add value. The clear alternative is to replace non-value-added
> work with value-added work.

This 'alternative' has never been clear on any real-scale project I've
encountered in my whole career.

> 60% non-value-added work cannot be the cheapest and fastest way to
> anything.

Possibly true, but maybe not. We are short of evidence, as has been
expressed in other emails.