[SystemSafety] Koopman replies to concerns over Toyota UA case

Steve Tockey Steve.Tockey at construx.com
Sat Dec 30 23:03:51 CET 2017


Derek and all,
Several points, IMHO:


1) Cyclomatic complexity doesn't correlate with defect density any better
than lines of code do ***if you correlate at the whole-program level***.
If you correlate at the method / function level, the correlation is there.


2) As others have said, Cyclomatic complexity by itself is insufficient.
Many other measures need to be considered at the same time, specifically:
Depth of decision nesting, Number of parameters, and Fan out. As a
consultant friend of mine, Meilir Page-Jones, says, "Software complexity
is not a number, it's a vector."
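To make the vector idea concrete, here is a minimal sketch (my own illustration, not anything from the case) that computes the four measures named above for a single Python function. Each component is a deliberately simplified approximation of the real metric:

```python
import ast

def complexity_vector(source):
    """Return (cyclomatic, max_nesting, n_params, fan_out) for the
    first function defined in `source`. All four are rough
    approximations of the measures discussed above."""
    fn = next(n for n in ast.walk(ast.parse(source))
              if isinstance(n, ast.FunctionDef))

    # Cyclomatic complexity ~ 1 + number of decision points.
    decisions = (ast.If, ast.For, ast.While, ast.BoolOp)
    cyclomatic = 1 + sum(isinstance(n, decisions) for n in ast.walk(fn))

    # Depth of decision nesting.
    def depth(node, d=0):
        nested = isinstance(node, (ast.If, ast.For, ast.While))
        return max([d + nested] +
                   [depth(c, d + nested) for c in ast.iter_child_nodes(node)])
    max_nesting = depth(fn)

    # Number of parameters.
    n_params = len(fn.args.args)

    # Fan out ~ number of distinct functions called by name.
    fan_out = len({n.func.id for n in ast.walk(fn)
                   if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)})

    return cyclomatic, max_nesting, n_params, fan_out

src = """
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
"""
print(complexity_vector(src))  # → (3, 1, 3, 0)
```

The point of returning a tuple rather than a single score is exactly Page-Jones's: two functions with the same cyclomatic number can differ wildly on the other axes.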


3) Even the four measures I claim (in #2), while necessary, are still
insufficient. I argue that we need to also be looking at things like the
number of conditions in boolean expressions, the number of methods
overridden in inheritance, and so on. The problem is that nobody (to my
knowledge) has done any correlation analysis on these to show that they
are worth looking at. A ton of things about software are measurable, but
not many of them actually matter (referring specifically to the volume of
metrics that come out of a typical static analysis tool). A whole bunch of
vitally important research needs to be done here.


4) So-called "static analysis" only considers the *syntax* (structure) of
the code, not the *semantics* (meaning). Proper evaluation of code quality
also depends on semantic analysis. This would include things like
Abstraction, Encapsulation (Design by Contract), Liskov Substitutability,
Cohesion and Coupling, etc.

Your counter-argument of arbitrarily chopping a 1400 line of code function
into 100 functions with 14 lines each is valid from a purely theoretical
perspective but falls apart from a professional, engineering perspective.
No well-educated, well-meaning, truly professional grade software engineer
would ever even dream of such an arbitrary chopping. Any even modestly
professional software engineer would only ever consider chopping those
1400 lines into semantically-meaningful units. Each sub-function would
abstract only one, single aspect of the original 1400 line function. Each
sub-function would encapsulate its one semantically-meaningful aspect and
only expose a signature (interface syntax) and contract (interface
semantics). Each sub-function would be highly cohesive about that single,
semantically-meaningful aspect. Each sub-function would be as loosely
coupled with all other sub-functions as it could practically be, both in
terms of call-return and shared (i.e., common or global) data.

Each sub-function would also have appropriately minimal syntactic
(structural) complexity, e.g., low Cyclomatic complexity, low Depth of
decision nesting, low Number of parameters, and low Fan out (and,
hopefully, low values on every other yet-to-be-determined relevant
structural complexity metric).
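A toy illustration of what such a decomposition looks like (all names invented by me; this is emphatically not the Toyota code): the same behavior expressed as semantically meaningful sub-functions, each abstracting one aspect, each exposing only a narrow signature and a stated contract:

```python
# Hypothetical sketch of "semantically-meaningful" partitioning.
# Every name and number here is invented for illustration only.

def _validate(raw):
    """Contract: raw must be a sensor reading in counts, 0..1023."""
    if not 0 <= raw <= 1023:
        raise ValueError(f"reading out of range: {raw}")
    return raw

def _to_degrees(counts):
    """Contract: maps sensor counts linearly onto 0..90 degrees."""
    return counts * 90.0 / 1023.0

def _rate_limit(angle, previous, max_step=5.0):
    """Contract: result never moves more than max_step from previous."""
    low, high = previous - max_step, previous + max_step
    return min(max(angle, low), high)

def throttle_angle(raw, previous):
    """Composition of the three aspects; each is testable in isolation
    against its own contract."""
    return _rate_limit(_to_degrees(_validate(raw)), previous)

print(throttle_angle(512, 40.0))  # → 45.0 (rate-limited from ~45.04)
```

Each sub-function here is cohesive about one aspect, coupled to the others only through call-return, and small enough that its contract can be checked exhaustively.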


You are perfectly correct in saying that all of the "intrinsic complexity"
remains. It has to, otherwise the restructured code doesn't have the same
behavior as the original. The issue is that when some chunk of complex
behavior is implemented in a single, poorly structured function it is
difficult or impossible to reason about it. It's also hard to test. So
it's hard to say that the function a) does everything it is supposed to
do, and b) does nothing it is not supposed to do. On the other hand, a
complex function that has been appropriately partitioned into
well-defined, well-structured (both syntactically and semantically)
sub-functions is exponentially easier to reason about. It is exponentially
easier (and cheaper) to test. It is exponentially easier to say that
overall a) it does everything it is supposed to do, and b) it does nothing
it is not supposed to do.
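The "exponentially easier" claim can be made concrete with back-of-envelope arithmetic (my own sketch, under the simplifying assumptions that decisions are independent binary branches, split evenly, and that each sub-function can be verified against its own contract): paths through a monolith multiply, while the per-part path counts of a well-partitioned version merely add.

```python
# Path-count arithmetic for the testability argument above.
# Assumes n independent binary decisions; a simplification, but it
# shows why partitioning changes the order of magnitude.

def monolith_paths(decisions):
    """Worst-case paths through one function with `decisions` branches."""
    return 2 ** decisions

def partitioned_paths(decisions, parts):
    """Total paths when the same decisions are split evenly across
    `parts` sub-functions, each testable independently."""
    per_part = decisions // parts
    return parts * 2 ** per_part

d = 30                            # 30 independent branch points
print(monolith_paths(d))          # → 1073741824 paths in one function
print(partitioned_paths(d, 10))   # → 80 paths across 10 sub-functions
```

The intrinsic complexity is unchanged in both cases; what changes is whether the testing burden grows multiplicatively or additively.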



-- steve




-----Original Message-----
From: systemsafety <systemsafety-bounces at lists.techfak.uni-bielefeld.de>
on behalf of Derek M Jones <derek at knosof.co.uk>
Organization: Knowledge Software, Ltd
Date: Saturday, December 30, 2017 at 12:44 PM
To: "systemsafety at lists.techfak.uni-bielefeld.de"
<systemsafety at lists.techfak.uni-bielefeld.de>
Subject: Re: [SystemSafety] Koopman replies to concerns over Toyota UA case

Clayton,

>  I think this was just used as an example for laypersons.

Yes, I think this is probably the case.

But it is a bad example in that it is training laypeople to think
of code as the problem, rather than the people doing/managing the
coding.

If you were given two sets of source code metrics, one from
a thoroughly tested system and one from a basis-tests-only system,
do you think you could tell which was which?  I know I would
have problems telling them apart.

I was at a workshop on code review last month
http://crest.cs.ucl.ac.uk/cow/56/
and asked why researchers spent most of their time measuring
source code, rather than talking to the people who wrote it.
The answer was that computing was considered part of
engineering/science and that the social or psychological aspects
of code reviews were 'social science', which is not what people
in computing were expected to do or perhaps even wanted to do.

>> Did anybody talk to the engineer who wrote the function for which
>> "Throttle angle function complexity = 146"?
> 
> That is the big question, isn't it?  AFAIK, there was little evidence
>during development of anyone asking that question, much less providing an
>answer.  I believe in the testimony it was stated there was little
>evidence of code reviews.

This is the key evidence and what we should be putting in examples to
inform laypeople about the important human factors.

 From Matthews' reply:
 > But if you add to that McCabes metric of 146 that the throttle angle
 > function software code was 1400 lines long, and that there was no

McCabe's metric correlates highly with lines of code.

I could chop up that 1,400 lines into 100 functions, containing an
average of 14 lines each.  The metric values would merge into the
background noise, but the intrinsic complexity would remain.

>>   Claiming
>> that code is untestable or unmaintainable is a marketing statement, not
>> engineering.
> Slides aside, I believe the engineering position was  "infeasible # of
>tests required…" or something like that.

Infeasible from what perspective?  Money budgeted, maximum that could
be spent and the company still make a profit, maximum the customer is
willing to pay for a car (the regulator could have a say in the last
option)?

Chopping the 1,400 lines up into 100 functions does not make the
testability problem go away; the ticked boxes on the metrics sheet
just make it appear to have gone away.

-- 
Derek M. Jones           Software analysis
tel: +44 (0)1252 520667  blog:shape-of-code.coding-guidelines.com
_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE


