Software defect prediction: do different classifiers find the same defects?

Bowes, David and Hall, Tracy and Petrić, Jean (2018) Software defect prediction: do different classifiers find the same defects? Software Quality Journal, 26 (2). pp. 525-552. ISSN 0963-9314

PDF (sensitivity_sqj)
sensitivity_sqj.pdf - Accepted Version
Available under License Creative Commons Attribution.

Abstract

During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by these classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix, and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values for these four classifiers, each detects different sets of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers. However, while some classifiers are consistent in the predictions they make, other classifiers vary in their predictions. Given our results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.
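
The idea that classifiers with similar recall can still find different defects may be easier to see in a small example. The following is a minimal sketch and not the authors' analysis code: the labels, the predictions, and the helper names (`true_positive_sets`, `unique_detections`, `majority_vote_hits`) are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical toy data: 1 = defective module, 0 = clean module.
actual: List[int] = [1, 1, 1, 1, 0, 0]

# Hypothetical predictions from three classifiers on the same six modules.
predictions: Dict[str, List[int]] = {
    "rf":  [1, 1, 0, 0, 0, 0],
    "nb":  [0, 1, 1, 0, 1, 0],
    "svm": [1, 0, 0, 1, 0, 0],
}


def true_positive_sets(
    actual: List[int], predictions: Dict[str, List[int]]
) -> Dict[str, Set[int]]:
    """Map each classifier to the indices of defective modules it flagged."""
    return {
        name: {i for i, (y, p) in enumerate(zip(actual, preds)) if y == 1 and p == 1}
        for name, preds in predictions.items()
    }


def unique_detections(tp_sets: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """For each classifier, keep the defects that no other classifier found."""
    unique = {}
    for name, found in tp_sets.items():
        others = set().union(*(s for n, s in tp_sets.items() if n != name))
        unique[name] = found - others
    return unique


def majority_vote_hits(tp_sets: Dict[str, Set[int]]) -> Set[int]:
    """Defects flagged by more than half of the classifiers."""
    votes = Counter(i for found in tp_sets.values() for i in found)
    return {i for i, v in votes.items() if v > len(tp_sets) / 2}


tp = true_positive_sets(actual, predictions)
n_defects = sum(actual)

for name, found in tp.items():
    print(name, "recall:", len(found) / n_defects, "found:", sorted(found))

print("unique:", unique_detections(tp))
print("found by any classifier:", sorted(set().union(*tp.values())))
print("found by majority vote:", sorted(majority_vote_hits(tp)))
```

In this toy data each classifier has a recall of 0.5, yet together they cover all four defects, while a simple majority vote keeps only the two defects that at least two classifiers agree on. That is the kind of effect the abstract points to when it favours ensemble strategies not based on majority voting; the numbers here are invented and say nothing about the paper's datasets.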

Item Type:

Journal Article

Journal or Publication Title:

Software Quality Journal

Additional Information:

The final publication is available at Springer via http://dx.doi.org/10.1007/s11219-016-9353-3

Uncontrolled Keywords:

/dk/atira/pure/subjectarea/asjc/1700/1712

Subjects:

machine learning; prediction modelling; software defect prediction; software; safety, risk, reliability and quality; media technology

Departments:

Faculty of Science and Technology > School of Computing & Communications

ID Code:

127409

Deposited By:

ep_importer_pure

Deposited On:

01 Oct 2018 13:18

Refereed?:

Yes

Published?:

Published

Last Modified:

11 Dec 2025 03:28

URI:

https://eprints.lancs.ac.uk/id/eprint/127409