[SystemSafety] Analysis of some Work Breakdown Structure projects

Derek M Jones derek at knosof.co.uk
Wed Jun 9 15:21:09 CEST 2021


Martyn,

> I'm interested in whether the CMU data shows how many defects were introduced during development and found during later 
> stages of development – and, where it does, whether that has been further analysed to show any relationships with 
> component size, for example.

You are after Tables 18 to 27 in the report:
https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=528467
which are derived from the same data.

I have not rederived the numbers from the data, and ought to do so,
if only to break them out by project.
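
When I do get around to it, the kind of breakdown I have in mind is
roughly the sketch below (Python/pandas; the file and column names are
made up for illustration, the actual dataset uses different field names):

  # Sketch only: cross-tabulate defects by the phase in which they were
  # injected and the phase in which they were found, one table per project.
  # "sei_defects.csv", "project", "phase_injected" and "phase_found" are
  # hypothetical names, not the dataset's actual fields.
  import pandas as pd

  defects = pd.read_csv("sei_defects.csv")

  for project, df in defects.groupby("project"):
      table = pd.crosstab(df["phase_injected"], df["phase_found"])
      print(project)
      print(table, "\n")

The per-project grouping is the part the report does not break out.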

> On the reasonable assumption that "usage data" means usage after development has ceased and the product has been 
> released into use, "usage data" is irrelevant to the question I asked about the CMU datasets. Of course, many defects are 
> only found in use, or never found, so the defect injection data doesn't tell us the defect density in the *delivered* code.

Finding a defect requires some form of usage, whether during development
or after release.

Developers working in a given phase may have an incentive to
concentrate on their own work, leaving other people's problems
for others to handle.  Are developers encountering mistakes made
in earlier phases while implementing their 'bit', or are they
doing more general testing?  After all, there are explicit test phases.
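
As a toy illustration of why detection effort matters (all the numbers
are invented, and the constant per-hour find rate is just a convenient
assumption, not a model of any real project):

  # Toy numbers: detected defects per KLOC depends on how hard you look.
  kloc = 10.0                 # component size
  latent_defects = 50         # defects actually present (unknown in practice)
  find_rate_per_hour = 0.02   # fraction of remaining defects found per review hour

  def detected(review_hours):
      """Expected defects detected after the given review effort."""
      return latent_defects * (1 - (1 - find_rate_per_hour) ** review_hours)

  for hours in (0, 10, 50, 200):
      print(f"{hours:4d} review hours -> "
            f"{detected(hours) / kloc:.2f} detected defects per KLOC")

  # Zero review hours gives zero detected defects per KLOC,
  # however buggy the code actually is.

The same issue applies to the defects per KLOC figures discussed below.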

> 
> There is a *separate* issue about how frequently defects in software manifest themselves as evident errors in use. The 
> only analysis I have ever seen on this was the paper by Adams, who analysed huge datasets collected by IBM from the logs 
> of mainframe users (I think in the 1980s). That showed that many latent defects only become manifest very infrequently 
> (tens of thousands of hours of user connect time, as I recall). I haven't the reference to hand but I'll find it if you 
> want it.
> 
> 
> Martyn
> 
> 
> On 09/06/2021 12:16, Derek M Jones wrote:
>> Martyn,
>>
>> The 2106.03679 paper data derives from Watts Humphrey's work
>> at CMU.  There is lots more to be found in this data, and the
>> paper is a start.
>>
>>>> Defect per KLOC is meaningless unless it is connected with usage
>>>> data, e.g., there can be zero defects per KLOC (because the software
>>>> has no users), or lots per KLOC because it has millions of users.
>>>
>>> The datasets from http://arxiv.org/abs/2106.03679 that you analysed contain defects injected and defects found later 
>>> in development and repaired. Have you analysed those?
>>
>> This is the data behind figure 6.42.
>>
>> But there is no usage data.
>>
>>>> I've never seen a breakdown by individual.  It's possible to do, when
>>>> mining github (actually this is by user id, and there are cases of
>>>> the same person having multiple ids), but again usage needs to be
>>>> taken into account.
>>>
>>> Again, the http://arxiv.org/abs/2106.03679 data seems to show individuals. The Watts Humphrey study below does that too.
>>
>> Yes, tasks are associated with individuals.
>> But again, no usage data.
>>
>>>>> There was data of this sort from the SEI 30 years ago and some from UK MoD, and some reports by the CHAOS group 
>>>>> twenty years ago but nothing I know of recently.
>>>>
>>> The SEI data I referred to was from a study carried out by Watts Humphrey, of the Software Engineering Institute at 
>>> Carnegie-Mellon University, which analysed the fault density of more than 8000 programs written by 810 industrial software 
>>> developers. resources.sei.cmu.edu/asset_files/SpecialReport/2009_003_001_15035.pdf p132
>>
>> Thanks for the link.  I had not seen this collection of Watts Humphrey
>> columns before.
>>
>> The column heading of Table 25-1 should read:
>> "Average detected defects per KLOC".
>> The question is then: how much effort was put into
>> detecting defects?
>> The metric Defects_per_KLOC only makes sense when the effort
>> put into detecting the defects is taken into account.
>> I can create programs with zero Defects_per_KLOC, simply by
>> putting zero effort into detecting defects.
>>
>>>> UK MoD?  This does not ring any bells for me.  Do you have a reference,
>>>>
>>> My reference was to the analysis of Boeing flight control software published in Crosstalk
>>>     German, A.: Software static code analysis lessons learned. Crosstalk
>>>     16(11) (2003)
>>
>> Thanks for the reference.
>> Table 1 lists anomalies per line of code.
>> But again, no indication of the effort involved in detecting
>> those anomalies.
>>
>>> and to the review of the Full Authority Digital Engine Controller that was installed in Chinook helicopters, which is 
>>> described in a House of Commons report into the Mull of Kintyre Chinook accident on 2 June 1994. This said: /In the 
>>
>> I will have a look at this, but I suspect that data on the effort
>> needed to detect each anomaly is not included.
>>
>>> summer of 1993 an independent defence IT contractor, EDS-SCICON, was instructed to review the FADEC software; after 
>>> examining only 18 per cent of the code they found 486 anomalies and stopped the review/.
>>
>> Did they record the effort (I imagine their time) needed
>> to detect each anomaly?  This kind of data is rare.
>>
>>
> 
> 
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE
> Manage your subscription: https://lists.techfak.uni-bielefeld.de/mailman/listinfo/systemsafety
> 

-- 
Derek M. Jones           Evidence-based software engineering
tel: +44 (0)1252 520667  blog:shape-of-code.coding-guidelines.com

