Ahmadian, Hanieh and Emami, Samaneh and Nasiri, Hamid (2025) From Black Box to Glass Box: SHAP-Explained XGBoost Model for Coronary Artery Disease Prediction. Algorithms, 18 (12): 771. ISSN 1999-4893
Full text not available from this repository.
Abstract
Coronary artery disease (CAD) is a leading cause of death worldwide, and its prevalence is rising rapidly. Many patients with heart disease die because diagnostic tools are inaccurate or because they see a doctor too late, so correct and timely diagnosis plays an important role in preventing deaths. This study diagnoses CAD by combining the ANOVA feature selection method with the following classifiers: Support Vector Machine, Naïve Bayes, Logistic Regression, Random Forest, K-Nearest Neighbors, and XGBoost. The models were evaluated using two complementary validation strategies: a hold-out test set and 10-fold cross-validation. In the proposed method, the SHAP algorithm was applied to the output of the XGBoost model, which makes the model's output interpretable and brings the model out of black-box status. With the hold-out method, the K-Nearest Neighbors model trained on the features selected by ANOVA obtained the highest accuracy, 93.33%, followed by the Support Vector Machine and XGBoost models at 91.66%. With 10-fold cross-validation, XGBoost achieved 86.16% accuracy and 87.41% recall. Under both validation strategies, these results improve on those of state-of-the-art methods.
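To illustrate the interpretability step, SHAP attributes a single prediction to the input features using Shapley values from cooperative game theory. The following is a minimal sketch of that idea, not the paper's implementation: it computes exact Shapley values by brute force for a hypothetical three-feature risk score, replacing "absent" features with values from a single baseline row. The function `toy_risk`, the feature names, and all numbers are assumptions made up for illustration; SHAP libraries use much faster tree-specific algorithms for models such as XGBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to one baseline row.

    Features outside a coalition take their baseline value. The cost is
    exponential in the number of features, so this only suits toy examples.
    """
    n = len(x)

    def value(coalition):
        # Build a row that uses x for coalition members, baseline otherwise.
        row = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(row)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy risk score, NOT the paper's model. The three made-up
# features are age (in decades), a chest-pain flag, and a hypertension flag.
def toy_risk(row):
    age, chest_pain, hypertension = row
    return 0.1 * age + 0.3 * chest_pain + 0.2 * chest_pain * hypertension


patient = [6.0, 1.0, 1.0]    # assumed example patient
baseline = [5.0, 0.0, 0.0]   # assumed reference row
phi = shapley_values(toy_risk, patient, baseline)
print(phi)
```

In this toy case the attributions come out at roughly 0.1 for age, 0.4 for chest pain, and 0.1 for hypertension: the interaction term is split evenly between the two features that produce it. The attributions sum to the difference between the patient's score and the baseline score, which is the property that lets a per-feature explanation account for the whole prediction.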