[SystemSafety] McCabe's cyclomatic complexity and accounting fraud

Derek M Jones derek at knosof.co.uk
Fri Mar 30 01:02:22 CEST 2018


Steve,

> With all due respect, I am very conscious of and concerned about the
> Accounting Fraud issue you focus so much on. I even agree that it is a
> serious problem today. However, I don’t think that taking an approach of

There are plenty of box tickers out there who insist on metrics being
within some specified bounds.  Accounting fraud gets people's attention
and hopefully gets them asking questions.

> As I said, I have seen a single C++ method of over 3400 SLOC and a
> Cyclomatic Complexity of over 2400. I routinely see functions in the 150
> to 350 range. I have seen functions with over 50 parameters. Are you

What is your point?  Does the existence of this code validate the use
of McCabe's cyclomatic complexity in some way?
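
For readers who want the metric under discussion pinned down: McCabe's
number for a single function is the count of linearly independent paths
through its control-flow graph, which in practice works out to the number
of decision points plus one. A crude keyword-counting sketch (a rough
approximation of the graph-based definition, not a real analyzer):

```python
import re

# Crude approximation of cyclomatic complexity for one C-like function
# body: count branch-introducing tokens and add 1.  The real metric is
# E - N + 2 over the control-flow graph; keyword counting is only a
# rough proxy, and an easily gamed one.
DECISION_TOKENS = re.compile(r"\b(if|for|while|case|catch)\b|&&|\|\||\?")

def approx_cyclomatic(body: str) -> int:
    return len(DECISION_TOKENS.findall(body)) + 1

src = "if (a && b) { x = c ? 1 : 2; } while (n--) { }"
print(approx_cyclomatic(src))  # if, &&, ?, while -> 4 + 1 = 5
```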

> “A vector of what, low viability metrics?” ― no, of course not. I can’t
> tell you what that vector needs to look like today because nobody has done
> enough empirical research. I am convinced that a meaningful set of high
> value code complexity metrics does exist. It’s just that nobody knows
> exactly what it is yet.

We agree.

Let's stop clinging to proposals from the 1970s and '80s that never had
empirical support and only continue to linger because... I have no idea
why.

> 
> “The whole point is to commit accounting fraud?” ― again, no. Once that
> meaningful vector has been established, it won’t allow accounting fraud.
> The fault you keep raising with cyclomatic complexity and just squeezing
> the complexity somewhere else would be solved if a sufficient set of other
> code complexity metrics were known to catch it and prevent too much of it
> being pushed there.
> 
> 
> “What is an "appropriate balance"?  Do you have a formula for this?” ― No,
> unfortunately, I don’t have a complete formula for it. Yet. Again, more
> empirical research is required here. What I am pretty confident of today
> is:
> *) Function-level cyclomatic complexity certainly less than 15, ideally
> less than 10
> *) Function-level decision nesting certainly less than 7, ideally less
> than 4
> *) Parameters on a function certainly fewer than 7, ideally fewer than 4
> *) Function fan out certainly less than 11, ideally less than 7
> 
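
For concreteness, the limits listed above amount to a small rule table; a
sketch of how a checker might encode them (the metric names and data layout
are illustrative, not taken from any real tool):

```python
# Hypothetical encoding of the per-function limits proposed above.
# "hard" is the "certainly" bound, "ideal" the preferred bound; a
# metric at or above a bound fails it ("less than 15" means < 15).
LIMITS = {
    "cyclomatic_complexity": {"hard": 15, "ideal": 10},
    "decision_nesting":      {"hard": 7,  "ideal": 4},
    "parameter_count":       {"hard": 7,  "ideal": 4},
    "fan_out":               {"hard": 11, "ideal": 7},
}

def check(metrics: dict) -> list:
    """Return (metric, value, severity) for each bound exceeded."""
    findings = []
    for name, value in metrics.items():
        bounds = LIMITS.get(name)
        if bounds is None:
            continue
        if value >= bounds["hard"]:
            findings.append((name, value, "violation"))
        elif value >= bounds["ideal"]:
            findings.append((name, value, "warning"))
    return findings

print(check({"cyclomatic_complexity": 150, "parameter_count": 5}))
# [('cyclomatic_complexity', 150, 'violation'), ('parameter_count', 5, 'warning')]
```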
> Should there be other metrics with other limits? Yes. What should they be?
> I don’t know. I have some suspicions, but I’m not ready to set limits on
> them. Yet.
> 
> I can detail how I think we need to push the research to find that whole
> formula (essentially Multi-variate correlation analysis). The problem is
> that research has not been done yet. But, rather than saying “this whole
> field is crap”, I think we should be saying, “Here is what we think we
> know today, but we really do need to do a lot more work in this area
> before we can claim to know everything. Are you interested enough in the
> subject to help us push it forward?"
> 
> 
> “What does having an appropriate balance buy you?” ― it buys you
> syntactically well-structured code that is a whole lot easier to write,
> read, and maintain. Are you really willing to assert that code with single
> functions that have over 150 decisions is as easy to write, read, and
> maintain as code where no single function has over 14 decisions? Are you
> really willing to assert that code with single functions having over 50
> parameters is as easy to write, read, and maintain as code where no single
> function has over 6 parameters?
> 
> Further, such code is also highly likely to contain fewer defects than
> otherwise because it is so well-structured. Are you really willing to
> claim that code with single functions that have over 150 decisions has no
> more defects than code where no single function has over 14 decisions? Are
> you really willing to claim that code with single functions having over 50
> parameters has no more defects than code where no single function has over
> 6 parameters?
> 
> Ok, I freely admit that I cannot support this (today) with a ton of
> empirical research. But having been in the software industry for over 40
> years and having worked directly or indirectly with thousands of projects,
> I’m pretty damn confident.
> 
> 
> “Are you not embarrassed, having to rely on this figure?” ― I certainly
> wish there was more, and more reliable, data that I could cite. But this
> is all that seems to be available at this point. When more, and more
> reliable, empirical data is available I will certainly switch to it
> instead.
> 
> 
> So let me ask you, “How do you propose to get developers to create code of
> any reasonable quality at all?”
> 
> 
> 
> 
> Cheers,
> 
> ― steve
> 
> 
> 
> 
> -----Original Message-----
> From: Derek M Jones <derek at knosof.co.uk>
> Organization: Knowledge Software, Ltd
> Date: Wednesday, March 28, 2018 at 3:01 PM
> To: Steve Tockey <Steve.Tockey at construx.com>,
> "systemsafety at lists.techfak.uni-bielefeld.de"
> <systemsafety at lists.techfak.uni-bielefeld.de>
> Subject: Re: [SystemSafety] McCabe's cyclomatic complexity and accounting
> fraud
> 
> Steve,
> 
>> "Software complexity is not a number, it is a vector"
> 
> A vector of what, low viability metrics?  Throw enough in and
> some pattern will emerge?
> 
>> Of course one can simply refactor code to reduce Cyclomatic Complexity
>> and yet the inherent complexity didn't go away. It just moved. But that's
>> kinda the whole point. Knowing that, can I call it, "local complexities"
> 
> The whole point is to commit accounting fraud?
> The box has to be ticked and everybody else does it.
> 
>> like Cyclomatic Complexity and Depth of Decision Nesting can be traded
>> for, can I call it, "global complexities" like Fan Out, the developer's
>> goal should be to strike an appropriate balance between them. Not too
>> much local complexity balanced with not too much global complexity.
> 
> What is an "appropriate balance"?  Do you have a formula for this?
> What does having an appropriate balance buy you?
> 
>> This is covered in a lot more detail in Appendix N of the manuscript for
>> my new book, available at:
>>
>> https://www.dropbox.com/sh/jjjwmr3cpt4wgfc/AACSFjYD2p3PvcFzwFlb3S9Qa?dl=0
> 
> You cite an author name for what I take to be this paper:
> https://pdfs.semanticscholar.org/e3d6/6c47ee0ddb37868c51ca30840084263ee1f1.pdf
> 
> More of a semi-puff piece than a description of serious research.
> Anyway, you are relying on Figure 4 (reproduced in your figure N-2)
> to back up your claims.
> 
> "Figure 4 illustrates the results of using the cyclomatic complexity
> metric to analyze one of the PowerBuilder systems."
> 
> What about the other 17 systems that the paper refers to?
> Do they show very different behavior?
> 
> There are several ways of interpreting that plot.  It could be
> a percentage of lines of code or a percentage of methods.  The
> original paper is not clear.
> 
> There are lots of Not Availables in Table 1.
> 
> Are you not embarrassed, having to rely on this figure?
> 
>>
>>
>> -----Original Message-----
>> From: systemsafety <systemsafety-bounces at lists.techfak.uni-bielefeld.de>
>> on behalf of Derek M Jones <derek at knosof.co.uk>
>> Organization: Knowledge Software, Ltd
>> Date: Wednesday, March 28, 2018 at 7:21 AM
>> To: "systemsafety at lists.techfak.uni-bielefeld.de"
>> <systemsafety at lists.techfak.uni-bielefeld.de>
>> Subject: Re: [SystemSafety] McCabe's cyclomatic complexity and accounting
>> fraud
>>
>> Paul,
>>
>>> There is the reported McCabe Complexity value for each function in a
>>> system. Yes, you can do things to reduce individual function complexity,
>>> and probably should. However, you then need to take the measure a step
>>> further. For every function that calls other functions, you have to sum
>>> the
>>
>> I agree that the way to go is to measure a collection of functions
>> based on their caller/callee relationship.
>>
>> This approach makes it much harder to commit accounting fraud and
>> might well produce more reproducible results.
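
One concrete way to read that caller/callee suggestion (a sketch over
made-up numbers, not an established metric): attach a complexity to each
function and sum it over the static call graph, so that splitting one big
function into many small callees no longer hides the total.

```python
# Sketch: aggregate per-function complexity over a static call graph.
# All function names and numbers here are invented for illustration.
complexity = {"parse": 12, "validate": 9, "emit": 5, "main": 3}
calls = {"main": ["parse", "validate", "emit"], "parse": ["validate"]}

def aggregate(fn: str, seen=None) -> int:
    """Complexity of fn plus everything it (transitively) calls,
    counting each function once and guarding against cycles."""
    seen = set() if seen is None else seen
    if fn in seen:
        return 0
    seen.add(fn)
    return complexity[fn] + sum(aggregate(c, seen) for c in calls.get(fn, []))

print(aggregate("main"))  # 3 + 12 + 9 + 5 = 29 (validate counted once)
```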
>>
>>> for the entire system on this basis. It becomes clear when you have too
>>> many functions with high complexity factors, as it pushes up the average
>>> complexity value disproportionately. It still should not be the only
>>> measure though.
>>
>> Where do the decisions in the code (that create this 'complexity')
>> come from?  The algorithm that is being implemented.
>>
>> If the algorithm has a lot of decision points, the code will contain
>> lots of decision points.  The measurement process needs to target the
>> algorithm first, and then compare the complexity of the algorithm with
>> the complexity of its implementation.  The code only needs looking at
>> if its complexity is much higher than that of the algorithm.
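
That comparison could be as simple as a ratio between the decision count
inherent in the algorithm (taken from its specification or pseudocode) and
the decision count of the implementation; a sketch with illustrative
numbers and an assumed threshold:

```python
# Sketch: flag an implementation only when its decision count greatly
# exceeds that of the algorithm it implements.  The 1.5 threshold and
# the example numbers are illustrative assumptions, not measured values.
def excess_ratio(algorithm_decisions: int, implementation_decisions: int) -> float:
    return implementation_decisions / max(algorithm_decisions, 1)

# e.g. a spec with 40 decision points implemented using 55 branches
ratio = excess_ratio(40, 55)
print(ratio)                              # 1.375
print("review" if ratio > 1.5 else "ok")  # ok
```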
>>
> 
> --
> Derek M. Jones           Software analysis
> tel: +44 (0)1252 520667  blog:shape-of-code.coding-guidelines.com
> 
> 
> 

-- 
Derek M. Jones           Software analysis
tel: +44 (0)1252 520667  blog:shape-of-code.coding-guidelines.com

