Building an Ensemble for Software Defect Prediction Based on Diversity Selection

Petrić, Jean and Bowes, David and Hall, Tracy and Christianson, Bruce and Baddoo, Nathan (2016) Building an Ensemble for Software Defect Prediction Based on Diversity Selection. In: ESEM '16 Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. Association for Computing Machinery, Inc, ESP. ISBN 9781450344272

ESEM2016_paper_157.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial.

Abstract

Background: Ensemble techniques have gained attention in various scientific fields. Defect prediction researchers have investigated many state-of-the-art ensemble models and concluded that in many cases these outperform standard single-classifier techniques. Almost all previous work using ensemble techniques in defect prediction relies on the majority voting scheme for combining prediction outputs, and on the implicit diversity among single classifiers. Aim: Investigate whether defect prediction can be improved using an explicit diversity technique with a stacking ensemble, given that different classifiers identify different sets of defects. Method: We used classifiers from four different families and the weighted accuracy diversity (WAD) technique to exploit diversity amongst classifiers. To combine individual predictions, we used the stacking ensemble technique. We used state-of-the-art knowledge in software defect prediction to build our ensemble models, and tested their prediction abilities against 8 publicly available data sets. Conclusion: The results show performance improvement using stacking ensembles compared to other defect prediction models. Diversity amongst classifiers used for building ensembles is essential to achieving these performance improvements.

Item Type:
Contribution in Book/Report/Proceedings
Additional Information:
© ACM, 2016. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ESEM '16 Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement http://dx.doi.org/10.1145/2961111.2962610
Uncontrolled Keywords:
/dk/atira/pure/subjectarea/asjc/1700/1706
Subjects:
diversity; ensembles of learning machines; software defect prediction; software faults; stacking; computer science applications; software
ID Code:
127415
Deposited By:
Deposited On:
12 Sep 2018 14:12
Refereed?:
Yes
Published?:
Published
Last Modified:
20 Apr 2024 00:26