[SystemSafety] Qualifying SW as "proven in use" [Measuring Software]

Thierry.Coq at dnv.com
Wed Jun 19 10:57:53 CEST 2013


Hi Martyn,
I will be brief and to the point here, simplifying in the extreme, but this should give a small overview(!). For in-depth information, see the website and the published papers, in particular the SQALE method definition document.

SQALE identifies several quality characteristics and organises them into a hierarchy. The base quality characteristic is "Testability": based on the lifecycle, if testability is not there, none of the quality characteristics built on top of it will be really useful. Within the testability characteristic there are two subcharacteristics, "unit testing" and "integration testing". Within each subcharacteristic, a small number of check points is selected; these are contractual requirements to be complied with. The check points identify erroneous constructions in software work products. Currently most SQALE models identify defects in the source code, but there is no reason to limit oneself to the source code: requirements, design and test scenarios could all be included in SQALE.
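To make this concrete, here is a minimal sketch in Python of such a quality model (the names are mine, invented for illustration; this is not part of any SQALE tool):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class CheckPoint:
        name: str          # the requirement to comply with
        description: str

    @dataclass
    class SubCharacteristic:
        name: str                                   # e.g. "unit testing"
        check_points: List[CheckPoint] = field(default_factory=list)

    @dataclass
    class Characteristic:
        name: str                                   # e.g. "Testability"
        subs: List[SubCharacteristic] = field(default_factory=list)

    # A fragment of a model following the example above:
    testability = Characteristic("Testability", [
        SubCharacteristic("unit testing", [
            CheckPoint("no-duplicated-code",
                       "no 100-token similarity between code fragments")]),
        SubCharacteristic("integration testing", [
            CheckPoint("no-cyclic-dependencies",
                       "no dependency cycles between modules")]),
    ])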
On source code, a typical check point for unit testability is the presence of redundant code (100-token similarity): during static analysis, each duplication of code adds one non-conformity to the work product (typically a piece of code). Another example, for integration testability, is the presence of cyclic dependencies; each cyclic dependency adds one non-conformity.
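For the cyclic dependency example, here is a rough sketch of how a tool might flag cycles in a module dependency graph (a plain depth-first search; real analysers typically compute strongly connected components, and none of this is actual SQALE tooling):

    def count_back_edges(deps):
        """Rough cycle count for a dependency graph {module: [deps]}:
        each back edge found during DFS signals a cyclic dependency,
        i.e. one non-conformity on the 'integration testing' check point."""
        visiting, visited, back_edges = set(), set(), 0

        def dfs(node):
            nonlocal back_edges
            visiting.add(node)
            for dep in deps.get(node, []):
                if dep in visiting:
                    back_edges += 1        # dependency cycle detected
                elif dep not in visited:
                    dfs(dep)
            visiting.remove(node)
            visited.add(node)

        for node in deps:
            if node not in visited:
                dfs(node)
        return back_edges

    # Example: a <-> b is one cycle, hence one non-conformity.
    print(count_back_edges({"a": ["b"], "b": ["a"], "c": ["a"]}))  # -> 1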
There are different quality models available depending on the technology used, and the model may be adapted before a SQALE assessment begins, or after initial results.
Once the quality model is defined and applied, the result is a set of non-conformities, classified per characteristic and subcharacteristic and attached to each work product.

SQALE's analysis model additionally defines a "remediation factor" for each type of non-conformity: SQALE assigns a certain number of work units to each type of non-conformity, representing the effort to correct it. This information can generally be estimated to a degree in the general case, and more precisely in an organisation that starts measuring.
So a cyclic dependency will be given a high remediation factor (rework on several pieces of code, re-design and re-testing), which SQALE classifies as a type 5, architectural change. Removing a copy/paste is less costly and will be rated as type 4, medium impact (cutting a function in two, creating another one, etc.). Smaller changes, such as correcting comments or automatic re-indentation for readability, get very small remediation factors.
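A small sketch of this step, with invented work-unit values (in practice the factors are calibrated per organisation):

    # Hypothetical remediation factors, in work units per non-conformity.
    # The types follow the text: type 5 = architectural change,
    # type 4 = medium impact; small fixes get very small factors.
    REMEDIATION_FACTOR = {
        "cyclic-dependency": 8.0,   # type 5: re-design, rework, re-test
        "duplicated-code":   2.0,   # type 4: extract a function, re-test
        "bad-indentation":   0.1,   # trivial: automatic re-indentation
    }

    def remediation_index(non_conformities):
        """Sum the work units needed to correct a list of non-conformities.
        All factors are in the same unit, so the indices can be added."""
        return sum(REMEDIATION_FACTOR[nc] for nc in non_conformities)

    print(remediation_index(
        ["cyclic-dependency", "duplicated-code", "duplicated-code"]))  # 12.0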
In this manner each non-conformity can be rated using the "SQALE index", and the indices can then be summed within a characteristic, as they all represent the same quantity. Testability is the base characteristic, whose indices are the most critical; the characteristics on top of testability add their indices to the testability indices. Typically this shows, for maintainability for example, that correcting pure maintainability non-conformities will not help if the base testability characteristic is not OK: maintainability will not be OK either (too much time and effort for testing, inadequate testing after changes, etc.).
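The summing might look like this, again with invented numbers:

    # Consolidating indices along the hierarchy (sketch): a characteristic's
    # consolidated index also carries the indices of the characteristics
    # below it, testability being the base.
    own_index = {"testability": 12.0, "reliability": 3.0,
                 "maintainability": 5.0}
    hierarchy = ["testability", "reliability", "maintainability"]  # base first

    consolidated, running = {}, 0.0
    for c in hierarchy:
        running += own_index[c]
        consolidated[c] = running

    print(consolidated)
    # {'testability': 12.0, 'reliability': 15.0, 'maintainability': 20.0}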

There is a second index, defined from the "non-remediation criticality". This is probably similar to your "severity". The idea is the same: each non-conformity is ranked as perceived by the product owner. It measures the "impact", depending on the product, its functions and its environment, of the non-conformity remaining in the software. Typically, an organisation may rate the presence of a possible "divide by zero" in the software as having a high non-remediation factor, and a naming-convention defect as having a low impact. In the end, the SQALE indices can be computed and attached to both work products and characteristics.
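The computation is analogous to the remediation index, with its own (again invented) factors:

    # Hypothetical non-remediation (criticality) factors: the perceived
    # impact of leaving the defect in the delivered software.
    NON_REMEDIATION_FACTOR = {
        "possible-divide-by-zero": 10.0,  # high impact if left in
        "naming-convention":        0.1,  # low impact if left in
    }

    def non_remediation_index(non_conformities):
        """Sum the perceived impact of the non-conformities left in."""
        return sum(NON_REMEDIATION_FACTOR[nc] for nc in non_conformities)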

Finally, index densities can be computed against SLOC or the sum of cyclomatic complexities (sum of V(G)) to compare the relative quality of different pieces of software.
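For example (invented numbers again):

    def index_density(sqale_index, size):
        """SQALE index density: the index normalised by a size measure
        (SLOC, or the sum of cyclomatic complexities V(G)), so that
        pieces of software of different sizes can be compared."""
        return sqale_index / size

    # Two hypothetical components: the smaller one has worse relative quality.
    print(index_density(20.0, 10000))  # 0.002 work units per SLOC
    print(index_density(15.0, 2000))   # 0.0075 work units per SLOC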

With these indices and index densities available, developers, project managers and customers can take informed decisions:
- accept or reject a piece of software, possibly selectively (for a customer)
- focus the maintenance team on the non-conformities whose correction most reduces the technical debt (to improve both quality and productivity)
- focus the team on the non-conformities whose correction most reduces the non-remediation index (to improve perceived quality)
- improve one's programming on a daily basis by regularly checking the SQALE indices (when SQALE indices are computed for each build during development)

Experience shows that performing a one-shot SQALE assessment is a relatively easy job for trained consultants, and it is often used in a third-party mode to establish a common view of the project between customer and developers.
Setting up SQALE in the tooling takes longer but is much more profitable for the development team, as they get a much shorter feedback loop on their quality. In this case it is clearly a requirement that developers, team leaders and QA people need to expend no effort at all to gather the indices; this is one of the reasons why SQALE is so heavily automated.

Thierry Coq
PS. The opinions expressed here represent my own and not necessarily those of my employer.

-----Original Message-----
From: systemsafety-bounces at techfak.uni-bielefeld.de [mailto:systemsafety-bounces at techfak.uni-bielefeld.de] On Behalf Of Martyn Thomas
Sent: mardi 18 juin 2013 16:27
To: systemsafety at techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Qualifying SW as "proven in use" [Measuring Software]

How are the defects identified so that they can be counted? How is their severity determined so that correction can be prioritised?

Martyn

On 18/06/2013 15:22, Thierry.Coq at dnv.com wrote:
> Dear all,
> There is a method, called SQALE (Software Quality Assessment Based on Lifecycle Expectations), for measuring quality: www.sqale.org, and a derivative for PLC code called PSaQC (PLC Software automated Quality Control, or "psychic"). It is motivated by the troublesome lack of data in software engineering, and in particular the lack of data on software quality.
> SQALE is supported by a number of tool vendors (static analysis), but is not limited to static analysis of code. It is freely usable (see the license).
>
> Organizations are starting to use it for several reasons:
> - it provides an economical way to measure software, in a standard manner, both during the project and when accepting the software,
> - it is objective and comparable across languages and techniques (especially the SQALE index density),
> - it measures the "technical debt" present in the software: the amount of work needed to improve quality to a defined level,
> - it ranks which defects should be corrected first,
> - most basic errors of measuring quality in software have been removed by the SQALE quality and analysis models.
>
> There are limitations:
> - there is no link between a SQALE index and a "probability of failure": SQALE basically measures a defect density, and, as usual, the relationship between defect density and failures is difficult,
> - it provides a measure of the "internal quality", as seen by a developer, project manager or customer, not a direct measure of the "external quality" as seen by an end-user, for example,
> - measurement points for real-time critical or PLC software are still very much lacking.
>
> Jean-Louis Letouzey has published several papers on SQALE.
> With Jean-Pierre Rosen, we have published data on open-source Ada software as an example of applying SQALE. With Denis Chalon, we have published SQALE data on PLC code.
>
> Comments or suggestions for improvement are welcome.
> I hope this helps.
>
> Thierry Coq
> PS. The opinions expressed here represent my own and not necessarily those of my employer.
>
> -----Original Message-----
> From: systemsafety-bounces at techfak.uni-bielefeld.de 
> [mailto:systemsafety-bounces at techfak.uni-bielefeld.de] On Behalf Of 
> Derek M Jones
> Sent: lundi 17 juin 2013 12:59
> To: systemsafety at techfak.uni-bielefeld.de
> Subject: Re: [SystemSafety] Qualifying SW as "proven in use"
>
> Software engineering has a culture of not measuring and keeping data.
> This is starting to change, but empirical software engineering has only just started:
> http://shape-of-code.coding-guidelines.com/2011/03/31/empirical-software-engineering-is-five-years-old/
> If anybody knows of any interesting datasets do please let me know.
> I am making all data + my analysis code public and so have no interest in data I cannot freely share.

_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE

