[SystemSafety] Koopman replies to concerns over Toyota UA case

Sun Dec 31 05:23:39 CET 2017

Derek,
Are you saying that there should be NO constraints whatsoever on the code
a developer writes? Are you willing to accept the following because I have
seen all of these with my own eyes:

*) A single C++ class that had over 3400 lines of code, but all of that
code was in one single method. Further, the Cyclomatic complexity of that
one method was over 2400, meaning that roughly 2 out of every 3 lines of code were
a decision of some sort.

*) A function that had 57 input parameters.

*) A single if() statement with 36 conditions in the boolean expression.

We may end up having to agree to disagree, but having no limits on
structural complexity whatsoever completely removes an organization’s
ability to say “NO!” to the kinds of schlock code that today’s “highly
paid amateur programmers” spew out every day. Are you OK with the current
state of affairs in the software industry? I’m definitely not.

You then wrote:

“I can reduce the complexity (by creating lots of smaller functions) and
nobody will be any the wiser. Where does the 'missing' complexity go? It
now exists in the dependency relationships between all the functions I
have created; these dependency relationships are not included in the book
keeping (as it exists today, somebody may invent a way of measuring it
tomorrow).

This is an accounting fraud.”

Again, if all you ever measure is Cyclomatic complexity then yes, I agree.
But that is explicitly not what I said, was it? I said that we need a
collection of measures (again, “Software complexity is not a number, it's
a vector”). I didn’t say it explicitly in my last reply, but there need to be
at least two categories of software structural complexity metrics:

*) “Local” complexity metrics--like Cyclomatic complexity and Depth of
decision nesting--measure the complexity inside a single function.

*) “Global” complexity metrics--like Number of parameters and Fan out--
measure how functions fit into their larger environment.

We can trade one for the other: local complexity for global vs. global
complexity for local. If I reduce local complexity, then surely it must
re-appear somehow as global complexity. If I pull a subset of the
decisions out of Function1() and move them into new Function2() then I
will have necessarily reduced the local complexity of Function1(). But I
will have necessarily increased global complexity (e.g., Fan out). As I
thought I at least implied, the complexity didn’t go away, it just
moved to a different part of the code. The numbers in the complexity
vector shifted.
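
To make the shift concrete, here is a minimal sketch, assuming Python and its standard ast module. The helper names (complexity_vector, vectors) and the toy function1/function2 are hypothetical, made up only for illustration, and the decision counting is a simplification of McCabe's definition rather than a faithful implementation of any particular tool. It computes a four-element vector (Cyclomatic complexity, Depth of decision nesting, Number of parameters, Fan out) before and after pulling decisions out of one function into another.

```python
import ast

# Node types counted as decision points (a simplification of McCabe's metric).
DECISIONS = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)
NESTING = (ast.If, ast.For, ast.While)


def _max_depth(node, depth=0):
    """Deepest nesting of if/for/while statements below `node`."""
    best = depth
    for child in ast.iter_child_nodes(node):
        step = 1 if isinstance(child, NESTING) else 0
        best = max(best, _max_depth(child, depth + step))
    return best


def complexity_vector(func):
    """Return (cyclomatic, nesting depth, parameter count, fan out)."""
    nodes = list(ast.walk(func))
    decisions = sum(isinstance(n, DECISIONS) for n in nodes)
    # Each extra operand of an and/or expression adds one more path.
    decisions += sum(len(n.values) - 1 for n in nodes if isinstance(n, ast.BoolOp))
    # Fan out here means: distinct functions called directly by name.
    callees = {n.func.id for n in nodes
               if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)}
    return (decisions + 1, _max_depth(func), len(func.args.args), len(callees))


def vectors(source):
    """Map each top-level function name in `source` to its complexity vector."""
    tree = ast.parse(source)
    return {f.name: complexity_vector(f)
            for f in tree.body if isinstance(f, ast.FunctionDef)}


# Hypothetical toy code: all decisions live in one function.
BEFORE = '''
def function1(a, b, c):
    if a > 0:
        if b > 0:
            return 1
        if c > 0:
            return 2
    return 0
'''

# Same behavior, with the inner decisions moved into a new function2.
AFTER = '''
def function2(b, c):
    if b > 0:
        return 1
    if c > 0:
        return 2
    return 0

def function1(a, b, c):
    if a > 0:
        return function2(b, c)
    return 0
'''

print(vectors(BEFORE)["function1"])  # (4, 2, 3, 0)
print(vectors(AFTER)["function1"])   # (2, 1, 3, 1)
print(vectors(AFTER)["function2"])   # (3, 1, 2, 0)
```

In this toy case function1's Cyclomatic complexity and nesting depth go down, while its Fan out goes from 0 to 1 and a second function now exists. Looking at only the first number hides that move; looking at the whole vector does not.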

The goal isn’t to eliminate complexity. You clearly can’t do that.
Whatever complexity is in the abstract (i.e., essential) function being
implemented has to be the minimum complexity of the code that implements
that function. The engineering goal has to be to balance an appropriate
amount of local complexity against an appropriate amount of global
complexity, thus minimizing overall (total) complexity.

Your bookkeeping and accounting fraud analogies are appropriate. But the
solution is not to throw out accounting systems altogether, which is
what I interpret you as saying. The solution has to be to develop an
appropriate accounting system that exposes things like just moving
complexity from one place to another.

You did write, “(as it exists today, somebody may invent a way of
measuring it tomorrow)”

Precisely. That’s my point: we *need* to invent a way of measuring it. If
not tomorrow, then as soon as we can.

You then wrote:

“All this talk of other metrics is vacuous hand waving (I tend to treat
inventions of new metrics as unfunny jokes; it is possible somebody has
invented something useful and I have missed it).”

“Inventing” a new metric IS merely vacuous hand waving ***unless and
until*** someone can show a strong correlation between that metric and
something we care about, like defect density.

But I interpret what you are saying as, “Until we have the perfect
accounting system then we can't have any accounting system”. I can’t agree
with that at all. Today’s accounting system is far from perfect. But to
say that we can’t have any limit whatsoever on things like Cyclomatic
complexity—even though we already know it is correlated with defect
density—is a step backwards not forwards.

— steve

-----Original Message-----
From: systemsafety <systemsafety-bounces at lists.techfak.uni-bielefeld.de>
on behalf of Derek M Jones <derek at knosof.co.uk>
Organization: Knowledge Software, Ltd
Date: Saturday, December 30, 2017 at 4:04 PM
To: "systemsafety at lists.techfak.uni-bielefeld.de"
<systemsafety at lists.techfak.uni-bielefeld.de>
Subject: Re: [SystemSafety] Koopman replies to concerns over Toyota UA case

All,

Terminology: McCabe Cyclomatic Complexity is known by one or
more combinations of those three words.

Steve,

> 2) As others have said, Cyclomatic complexity by itself is insufficient.
> Many other measures need to be considered at the same time, specifically:
> Depth of decision nesting, Number of parameters, and Fan out. As a
> consultant friend of mine, Meilir Page-Jones, says, “Software complexity
> is not a number, it’s a vector”.

No. Software complexity is a sales pitch.

Yes, there are lots of measures.  But none of them will tell you that
a function with a complexity metric of 146 is the best choice.  That
information can only be obtained by talking to the people involved.

The box tickers need to be kept happy.  Keep your metric values
low and you don't get a walk-on role in some prof's slide deck:
Page 38 of 
https://users.ece.cmu.edu/~koopman/pubs/koopman14_toyota_ua_slides.pdf
or someone's blog:
https://criticaluncertainties.com/2013/11/11/toyota-and-the-sphagetti-monster/

I can reduce the complexity (by creating lots of smaller functions)
and nobody will be any the wiser.
Where does the 'missing' complexity go?
It now exists in the dependency relationships between all the functions
I have created; these dependency relationships are not included
in the book keeping (as it exists today, somebody may invent a way of
measuring it tomorrow).

This is an accounting fraud.

All this talk of other metrics is vacuous hand waving (I tend to treat
inventions of new metrics as unfunny jokes; it is possible somebody
has invented something useful and I have missed it).

> 
> 3) Even the four measures I claim (in #2), while necessary, are still
> insufficient. I argue that we need to also be looking at things like the
> number of conditions in boolean expressions, number of methods
> overridden in inheritance, and so on. The problem is that nobody (to my
> knowledge) has done any correlation analysis on these to show that they
> are worth looking at. A ton of things about software are measurable, but
> not many of them actually matter (referring specifically to the volume of
> metrics that come out of a typical static analysis tool). A whole bunch
> of vitally important research needs to be done here.
> 
> 
> 4) So-called “static analysis” only considers the *syntax* (structure) of
> the code, not the *semantics* (meaning). Proper evaluation of code quality
> also depends on semantic analysis. This would include things like,
> Abstraction, Encapsulation (Design by Contract), Liskov Substitutability,
> Cohesion and Coupling, etc.
> 
> Your counter-argument of arbitrarily chopping a 1400 line of code function
> into 100 functions with 14 lines each is valid from a purely theoretical
> perspective but falls apart from a professional, engineering perspective.
> No well-educated, well-meaning, truly professional grade software engineer
> would ever even dream of such an arbitrary chopping. Any even modestly
> professional software engineer would only ever consider chopping those
> 1400 lines into semantically-meaningful units. Each sub-function would
> abstract only one, single aspect of the original 1400 line function. Each
> sub-function would encapsulate its one semantically-meaningful aspect and
> only expose a signature (interface syntax) and contract (interface
> semantics). Each sub-function would be highly cohesive about that single,
> semantically-meaningful aspect. Each sub-function would be as loosely
> coupled with all other sub-functions as it could practically be, both in
> terms of call-return and shared (i.e., common or global) data.
> 
> Each sub-function would also have appropriately minimal syntactic
> (structural) complexity, e.g., low Cyclomatic complexity, low Depth of
> decision nesting, low Number of parameters, and low Fan out (and,
> hopefully, every other yet-to-be-determined) relevant structural
> complexity metric.
> 
> 
> You are perfectly correct in saying that all of the “intrinsic complexity”
> remains. It has to, otherwise the restructured code doesn’t have the same
> behavior as the original. The issue is that when some chunk of complex
> behavior is implemented in a single, poorly structured function it is
> difficult or impossible to reason about it. It’s also hard to test. So
> it’s hard to say that the function a) does everything it is supposed to
> do, and b) does nothing it is not supposed to do. On the other hand, a
> complex function that has been appropriately partitioned into
> well-defined, well-structured (both syntactically and semantically)
> sub-functions is exponentially easier to reason about. It is exponentially
> easier (and cheaper) to test. It is exponentially easier to say that
> overall a) it does everything it is supposed to do, and b) it does nothing
> it is not supposed to do.
> 
> 
> 
> -- steve
> 
> 
> 
> 
> -----Original Message-----
> From: systemsafety <systemsafety-bounces at lists.techfak.uni-bielefeld.de>
> on behalf of Derek M Jones <derek at knosof.co.uk>
> Organization: Knowledge Software, Ltd
> Date: Saturday, December 30, 2017 at 12:44 PM
> To: "systemsafety at lists.techfak.uni-bielefeld.de"
> <systemsafety at lists.techfak.uni-bielefeld.de>
> Subject: Re: [SystemSafety] Koopman replies to concerns over Toyota UA case
> 
> Clayton,
> 
>>   I think this was just used as an example for laypersons.
> 
> Yes, I think this is probably the case.
> 
> But it is a bad example in that it is training laypeople to think
> of code as the problem, rather than the people doing/managing the
> coding.
> 
> If you were given two sets of source code metrics, one from
> a thoroughly tested system and one from a basic-tests-only system,
> do you think you could tell which was which?  I know I would
> have problems telling them apart.
> 
> I was at a workshop on code review last month
> http://crest.cs.ucl.ac.uk/cow/56/
> and asked why researchers spent most of their time measuring
> source code, rather than talking to the people who wrote it.
> The answer was that computing was considered part of
> engineering/science and that the social or psychological aspects
> of code reviews was 'social science', which is not what people
> in computing were expected to do or perhaps even wanted to do.
> 
>>> Did anybody talk to the engineer who wrote the function for which
>>> "Throttle angle function complexity = 146"?
>>
>> That is the big question, isn’t it?  AFAIK, there was little evidence
>> during development of anyone asking that question, much less providing an
>> answer.  I believe in the testimony it was stated there was little
>> evidence of code reviews.
> 
> This is the key evidence and what we should be putting in examples to
> inform laypeople about the important human factors.
> 
>   From Matthews reply:
>   > But if you add to that McCabes metric of 146 that the throttle angle
>   > function software code was 1400 lines long, and that there was no
> 
> McCabes highly correlates with lines of code.
> 
> I could chop up that 1,400 lines into 100 functions, containing an
> average of 14 lines each.  The metric values would merge into the
> background noise, but the intrinsic complexity would remain.
> 
>>>    Claiming
>>> that code is untestable or unmaintainable is a marketing statement, not
>>> engineering.
>> Slides aside, I believe the engineering position was  "infeasible # of
>> tests required..." or something like that.
> 
> Infeasible from what perspective?  Money budgeted, maximum that could
> be spent and the company still make a profit, maximum the customer is
> willing to pay for a car (the regulator could have a say in the last
> option)?
> 
> Chopping the 1,400 lines up into 100 functions does not make the
> testability problem go away; the ticked boxes on the metrics sheet
> just make it appear to have gone away.
> 

-- 
Derek M. Jones           Software analysis
tel: +44 (0)1252 520667  blog:shape-of-code.coding-guidelines.com
_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE