Building an Ensemble for Software Defect Prediction Based on Diversity Selection

Petrić, Jean and Bowes, David and Hall, Tracy and Christianson, Bruce and Baddoo, Nathan (2016) Building an Ensemble for Software Defect Prediction Based on Diversity Selection. In: ESEM '16 Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. Association for Computing Machinery, Inc, ESP. ISBN 9781450344272

PDF (ESEM2016_paper_157)
ESEM2016_paper_157.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial.

Abstract

Background: Ensemble techniques have gained attention in various scientific fields. Defect prediction researchers have investigated many state-of-the-art ensemble models and concluded that in many cases these outperform standard single classifier techniques. Almost all previous work using ensemble techniques in defect prediction relies on the majority voting scheme for combining prediction outputs, and on the implicit diversity among single classifiers.

Aim: Investigate whether defect prediction can be improved using an explicit diversity technique with a stacking ensemble, given that different classifiers identify different sets of defects.

Method: We used classifiers from four different families and the weighted accuracy diversity (WAD) technique to exploit diversity amongst classifiers. To combine individual predictions, we used the stacking ensemble technique. We used state-of-the-art knowledge in software defect prediction to build our ensemble models, and tested their prediction abilities against 8 publicly available data sets.

Conclusion: The results show performance improvement using stacking ensembles compared to other defect prediction models. Diversity amongst classifiers used for building ensembles is essential to achieving these performance improvements.
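
The abstract's premise, that different classifiers identify different sets of defects and that majority voting can discard some of them, can be illustrated with a minimal sketch. This is not the paper's method or its WAD measure: the module labels, classifier names, and predictions below are hypothetical toy data, and the helper functions are made up for the illustration.

```python
# Hypothetical toy data: 1 = defective, 0 = clean, for six modules.
actual = [1, 1, 1, 0, 0, 1]

# Hypothetical predictions from three classifiers of different families.
predictions = {
    "naive_bayes": [1, 0, 0, 0, 0, 0],
    "decision_tree": [0, 1, 0, 0, 1, 0],
    "svm": [1, 0, 1, 0, 0, 0],
}


def defects_found(pred, truth):
    """Indices of modules that are defective and were predicted defective."""
    return {i for i, (p, t) in enumerate(zip(pred, truth)) if p == 1 and t == 1}


def majority_vote(preds):
    """Predict defective only when more than half of the classifiers agree."""
    n = len(preds)
    return [1 if sum(column) * 2 > n else 0 for column in zip(*preds.values())]


# Defects each classifier finds on its own.
found = {name: defects_found(p, actual) for name, p in predictions.items()}

# Defects that survive majority voting versus defects found by any classifier.
voted = majority_vote(predictions)
found_by_vote = defects_found(voted, actual)
found_by_any = set().union(*found.values())

print("per classifier:", {name: sorted(s) for name, s in found.items()})
print("majority vote: ", sorted(found_by_vote))
print("any classifier:", sorted(found_by_any))
```

In this toy data, the defects found by only one classifier are lost under majority voting even though some classifier predicted each of them correctly. That gap is the kind of information a combiner trained on the individual predictions, such as a stacking ensemble, has the opportunity to use.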

Item Type:

Contribution in Book/Report/Proceedings

Additional Information:

© ACM, 2016. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ESEM '16 Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement http://dx.doi.org/10.1145/2961111.2962610

Uncontrolled Keywords:

/dk/atira/pure/subjectarea/asjc/1700/1706

Subjects:

diversity; ensembles of learning machines; software defect prediction; software faults; stacking; computer science applications; software

Departments:

Faculty of Science and Technology > School of Computing & Communications

ID Code:

127415

Deposited By:

ep_importer_pure

Deposited On:

12 Sep 2018 14:12

Refereed?:

Yes

Published?:

Published

Last Modified:

28 Jun 2025 01:30

URI:

https://eprints.lancs.ac.uk/id/eprint/127415