[SystemSafety] Koopman replies to concerns over Toyota UA case

Derek M Jones derek at knosof.co.uk
Sun Dec 31 01:04:16 CET 2017


All,

Terminology: McCabe Cyclomatic Complexity is known by various
combinations of those three words; the names below all refer to the
same metric.
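For anyone who has not met it: the number is essentially a count of a
function's decision points plus one.  A minimal C sketch (invented
names, nothing to do with the Toyota code):

    #define MIN_ANGLE 0
    #define MAX_ANGLE 90

    /* Two decision points, so cyclomatic complexity = 2 + 1 = 3. */
    int clamp_angle(int angle)
    {
        if (angle < MIN_ANGLE)   /* decision 1 */
            return MIN_ANGLE;
        if (angle > MAX_ANGLE)   /* decision 2 */
            return MAX_ANGLE;
        return angle;
    }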

Steve,

> 2) As others have said, Cyclomatic complexity by itself is insufficient.
> Many other measures need to be considered at the same time, specifically:
> Depth of decision nesting, Number of parameters, and Fan out. As a
> consultant friend of mine, Meilir Page-Jones, says, "Software complexity
> is not a number, it's a vector".

No. Software complexity is a sales pitch.

Yes, there are lots of measures.  But none of them will tell you that
a function with a complexity metric of 146 is the best choice.  That
information can only be obtained by talking to the people involved.

The box tickers need to be kept happy.  Keep your metric values
low and you don't get a walk-on role in some prof's slide deck:
Page 38 of 
https://users.ece.cmu.edu/~koopman/pubs/koopman14_toyota_ua_slides.pdf
or someone's blog:
https://criticaluncertainties.com/2013/11/11/toyota-and-the-sphagetti-monster/

I can reduce the complexity (by creating lots of smaller functions)
and nobody will be any the wiser.
Where does the 'missing' complexity go?
It now exists in the dependency relationships between all the functions
I have created; these dependency relationships are not included
in the bookkeeping (at least as it exists today; somebody may invent a
way of measuring them tomorrow).

This is an accounting fraud.
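
To make the trick concrete, here is a toy C sketch (invented names,
nothing to do with the Toyota code).  Before the chop, one function
carries all the decisions:

    #define LIMIT 99

    int process(int x)                /* 3 decisions: complexity = 4 */
    {
        if (x < 0)
            x = -x;
        if (x > LIMIT)
            x = LIMIT;
        if (x % 2)
            x += 1;
        return x;
    }

After the chop, every function scores 1 or 2 and sails under any
threshold, yet every branch and every behaviour is still there:

    static int fix_sign(int x)   { return (x < 0) ? -x : x; }        /* 2 */
    static int fix_range(int x)  { return (x > LIMIT) ? LIMIT : x; } /* 2 */
    static int fix_parity(int x) { return (x % 2) ? x + 1 : x; }     /* 2 */

    int process(int x)         /* no decisions left: complexity = 1 */
    {
        return fix_parity(fix_range(fix_sign(x)));
    }

The branching has simply moved into three new call-graph edges, which
the metrics sheet never records.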

All this talk of other metrics is vacuous hand waving (I tend to treat
inventions of new metrics as unfunny jokes; it is possible somebody
has invented something useful and I have missed it).

> 
> 3) Even the four measures I claim (in #2), while necessary, are still
> insufficient. I argue that we need to also be looking at things like the
> number of conditions in boolean expressions, number of methods
> overridden in inheritance, and so on. The problem is that nobody (to my
> knowledge) has done any correlation analysis on these to show that they
> are worth looking at. A ton of things about software are measurable, but
> not many of them actually matter (referring specifically to the volume of
> metrics that come out of a typical static analysis tool). A whole bunch of
> vitally important research needs to be done here.
> 
> 
> 4) So-called "static analysis" only considers the *syntax* (structure) of
> the code, not the *semantics* (meaning). Proper evaluation of code quality
> also depends on semantic analysis. This would include things like,
> Abstraction, Encapsulation (Design by Contract), Liskov Substitutability,
> Cohesion and Coupling, etc.
> 
> Your counter-argument of arbitrarily chopping a 1400 line of code function
> into 100 functions with 14 lines each is valid from a purely theoretical
> perspective but falls apart from a professional, engineering perspective.
> No well-educated, well-meaning, truly professional grade software engineer
> would ever even dream of such an arbitrary chopping. Any even modestly
> professional software engineer would only ever consider chopping those
> 1400 lines into semantically-meaningful units. Each sub-function would
> abstract only one, single aspect of the original 1400 line function. Each
> sub-function would encapsulate its one semantically-meaningful aspect and
> only expose a signature (interface syntax) and contract (interface
> semantics). Each sub-function would be highly cohesive about that single,
> semantically-meaningful aspect. Each sub-function would be as loosely
> coupled with all other sub-functions as it could practically be, both in
> terms of call-return and shared (i.e., common or global) data.
> 
> Each sub-function would also have appropriately minimal syntactic
> (structural) complexity, e.g., low Cyclomatic complexity, low Depth of
> decision nesting, low Number of parameters, and low Fan out (and,
> hopefully, every other yet-to-be-determined) relevant structural
> complexity metric.
> 
> 
> You are perfectly correct in saying that all of the "intrinsic complexity"
> remains. It has to, otherwise the restructured code doesn't have the same
> behavior as the original. The issue is that when some chunk of complex
> behavior is implemented in a single, poorly structured function it is
> difficult or impossible to reason about it. It's also hard to test. So
> it's hard to say that the function a) does everything it is supposed to
> do, and b) does nothing it is not supposed to do. On the other hand, a
> complex function that has been appropriately partitioned into
> well-defined, well-structured (both syntactically and semantically)
> sub-functions is exponentially easier to reason about. It is exponentially
> easier (and cheaper) to test. It is exponentially easier to say that
> overall a) it does everything it is supposed to do, and b) it does nothing
> it is not supposed to do.
> 
> 
> 
> -- steve
> 
> 
> 
> 
> -----Original Message-----
> From: systemsafety <systemsafety-bounces at lists.techfak.uni-bielefeld.de>
> on behalf of Derek M Jones <derek at knosof.co.uk>
> Organization: Knowledge Software, Ltd
> Date: Saturday, December 30, 2017 at 12:44 PM
> To: "systemsafety at lists.techfak.uni-bielefeld.de"
> <systemsafety at lists.techfak.uni-bielefeld.de>
> Subject: Re: [SystemSafety] Koopman replies to concerns over Toyota UA case
> 
> Clayton,
> 
>>   I think this was just used as an example for laypersons.
> 
> Yes, I think this is probably the case.
> 
> But it is a bad example in that it is training laypeople to think
> of code as the problem, rather than the people doing/managing the
> coding.
> 
> If you were given two sets of source code metrics, one from
> a thoroughly tested system and one from a basis-tests-only system,
> do you think you could tell which was which?  I know I would
> have problems telling them apart.
> 
> I was at a workshop on code review last month
> http://crest.cs.ucl.ac.uk/cow/56/
> and asked why researchers spent most of their time measuring
> source code, rather than talking to the people who wrote it.
> The answer was that computing was considered part of
> engineering/science and that the social or psychological aspects
> of code reviews were 'social science', which is not what people
> in computing were expected to do or perhaps even wanted to do.
> 
>>> Did anybody talk to the engineer who wrote the function for which
>>> "Throttle angle function complexity = 146²?
>>
>> That is the big question, isn't it?  AFAIK, there was little evidence
>> during development of anyone asking that question, much less providing an
>> answer.  I believe in the testimony it was stated there was little
>> evidence of code reviews.
> 
> This is the key evidence and what we should be putting in examples to
> inform laypeople about the important human factors.
> 
>   From Matthew's reply:
>   > But if you add to that McCabes metric of 146 that the throttle angle
>   > function software code was 1400 lines long, and that there was no
> 
> McCabe's metric correlates highly with lines of code.
> 
> I could chop up that 1,400 lines into 100 functions, containing an
> average of 14 lines each.  The metric values would merge into the
> background noise, but the intrinsic complexity would remain.
> 
>>>    Claiming
>>> that code is untestable or unmaintainable is a marketing statement, not
>>> engineering.
>> Slides aside, I believe the engineering position was  "infeasible # of
>> tests required…" or something like that.
> 
> Infeasible from what perspective?  Money budgeted, maximum that could
> be spent and the company still make a profit, maximum the customer is
> willing to pay for a car (the regulator could have a say in the last
> option)?
> 
> Chopping the 1,400 lines up into 100 functions does not make the
> testability problem go away; the ticked boxes on the metrics sheet
> just make it appear to have gone away.
> 

-- 
Derek M. Jones           Software analysis
tel: +44 (0)1252 520667  blog:shape-of-code.coding-guidelines.com

