[SystemSafety] Qualifying SW as "proven in use" [Measuring Software]

Steve Tockey Steve.Tockey at construx.com
Mon Jun 24 20:21:24 CEST 2013


Actually, getting the evidence isn't that tricky; it's just a lot of work.
Essentially, all one needs to do is run a correlation analysis (compute a
correlation coefficient) between the proposed quality measure on the one
hand and defect tracking data on the other.
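
A rough sketch in Python of what that analysis looks like (assuming
per-function complexity scores and defect counts have already been
extracted; the file name and column layout here are purely illustrative):

    import csv
    from scipy.stats import spearmanr

    # Hypothetical input: one row per function, giving its name, its
    # cyclomatic complexity, and the number of defects logged against it.
    complexity, defects = [], []
    with open("function_metrics.csv") as f:
        for name, cc, defect_count in csv.reader(f):
            complexity.append(int(cc))
            defects.append(int(defect_count))

    # Spearman rank correlation is a reasonable choice here because both
    # complexity and defect counts tend to be heavily skewed.
    rho, p_value = spearmanr(complexity, defects)
    print("rho = %.2f, p = %.4f" % (rho, p_value))

A strong positive rho with a small p-value is the kind of evidence meant
here; the real work is collecting defect data that maps cleanly onto
individual functions.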

For example, the code quality measure "Cyclomatic Complexity" (reference:
Tom McCabe, "A Complexity Measure", IEEE Transactions on Software
Engineering, December 1976) was validated many years ago by simply
finding a strong positive correlation between the cyclomatic complexity of
functions and the number of defects that were logged against those same
functions (i.e., code in that function had to be changed in order to
repair the defect).

According to one study of 18 production applications, code in functions
with cyclomatic complexity <=5 was about 45% of the total code base but
this code was responsible for only 12% of the defects logged against the
total code base. On the other hand, code in functions with cyclomatic
complexity of >=15 was only 11% of the code base but this same code was
responsible for 43% of the total defects. On a per-line-of-code basis,
functions with cyclomatic complexity >=15 thus show more than an
order-of-magnitude increase in defect density over functions measuring <=5.
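
For anyone checking the arithmetic, the order-of-magnitude claim follows
directly from those four percentages (defect share divided by code share,
per bucket):

    # Defect density relative to the code-base average, per bucket.
    low_density  = 12 / 45   # CC <= 5:  ~0.27x the average density
    high_density = 43 / 11   # CC >= 15: ~3.91x the average density
    print(high_density / low_density)   # ~14.7x -- over an order of magnitude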

What I find interesting, personally, is that complexity metrics for
object-oriented software have been around for about 20 years, and yet
nobody (to my knowledge) has done any correlation analysis at all (or, at
a minimum, they have not published their results).

The other thing to remember is that such measures consider only the
"syntax" (structure) of the code. I consider this to be *necessary* for
code quality, but far from *sufficient*. One also needs to consider the
"semantics" (meaning) of that same code. For example, to what extent is
the code based on reasonable abstractions? To what extent does the code
exhibit good encapsulation? What are the cohesion and coupling of the
code? Has the code used "design-to-invariants / design-forchange"? One can
have code that's perfectly structured in a syntactic sense and yet it's
garbage from the semantic perspective. Unfortunately, there isn't a way
(that I'm aware of, anyway) to do the necessary semantic analysis in an
automated fashion. Some other competent software professionals need to
look at the code and assess it from the semantic perspective.
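
To make that concrete, here is a contrived Python illustration (mine, not
from any study): both functions below have identical cyclomatic complexity
(2 each: a single loop), so a structural metric scores them the same, yet
only one is built on a reasonable abstraction.

    def total_price(items):
        # Clear abstraction: each item knows its own price and quantity.
        total = 0
        for item in items:
            total += item.price * item.quantity
        return total

    def calc(d):
        # Same structural score, but the meaning is hidden in positional
        # conventions: what are fields 3 and 7 of each tuple supposed to be?
        t = 0
        for x in d:
            t += x[3] * x[7]
        return t

No automated structural measure distinguishes these two; a human reader
does so immediately.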

So while I applaud efforts like SQALE and others like it, one needs to
remember that such measures are only part of the whole story. More work--a
lot more--needs to be done before someone can reasonably say that some
particular code is "high quality".


Regards,

-- steve





-----Original Message-----
From: Peter Bishop <pgb at adelard.com>
Date: Friday, June 21, 2013 6:04 AM
To: "systemsafety at techfak.uni-bielefeld.de"
<systemsafety at techfak.uni-bielefeld.de>
Subject: Re: [SystemSafety] Qualifying SW as "proven in use" [Measuring
Software]

I agree with Derek

Code quality is only a means to an end.
We need evidence to show that the means actually helps to achieve the ends.

Getting this evidence is pretty tricky, as parallel developments for the
same project won't happen.
But you might be able to infer something on average over multiple projects.

Derek M Jones wrote:
> Thierry,
> 
>> To answer your questions:
>> 1°) Yes, there is some objective evidence that there is a correlation
>> between a low SQALE index and quality code.
> 
> How is the quality of code measured?
> 
> Below you say that SQALE DEFINES what is "good quality" code.
> In this case it is to be expected that a strong correlation will exist
> between a low SQALE index and its own definition of quality.
> 
>> For example ITRIS has conducted a study where the "good quality" code
>> is statistically linked to a lower SQALE index, for industrial
>> software actually used in operations.
> 
> Again how is quality measured?
> 
>> No, there is not enough evidence, we wish there would be more people
>> working on getting the evidence.
> 
> Is there any evidence apart from SQALE correlating with its own
> measures?
> 
> This is a general problem, lots of researchers create their own
> definition of quality and don't show a causal connection to external
> attributes such as faults or subsequent costs.
> 
> Without running parallel development efforts that
> follow/don't follow the guidelines it is difficult to see how
> reliable data can be obtained.
> 

-- 

Peter Bishop
Chief Scientist
Adelard LLP
Exmouth House, 3-11 Pine Street, London,EC1R 0JH
http://www.adelard.com
Recep:  +44-(0)20-7832 5850
Direct: +44-(0)20-7832 5855
_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE


