Shahzad, Muhammad and Tahir, Muhammad Atif and Khan, M. Atta and Jiang, Richard and Shams, Rauf Ahmed (2022) RETRACTED EBSRMF: Ensemble Based Similarity-Regularized Matrix Factorization to Predict Anticancer Drug Responses. Journal of Intelligent and Fuzzy Systems, 43 (3). pp. 3443-3452. ISSN 1064-1246
EBSRMF_JIFS_Final_for_Publication_.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
Predicting the drug sensitivity of a panel of cancer cell lines using computational approaches has been a challenge for two decades. With the emergence of high-throughput screening technologies, drug sensitivity data for thousands of compounds and panels of cancer cell lines are publicly available in various pharmacogenomics databases. Analyzing these data is crucial to improving cancer treatment and developing new anticancer drugs. In this work, we propose EBSRMF: Ensemble Based Similarity-Regularized Matrix Factorization, a bagging-based framework to improve drug sensitivity prediction on the Cancer Cell Line Encyclopedia (CCLE) data. Based on the premise that similar drugs and similar cell lines exhibit similar drug responses, we investigate cell line similarity matrices based on gene expression profiles and drug similarity matrices based on chemical structure. The outcome variable is drug sensitivity, measured as the half-maximal inhibitory concentration (IC50). To improve the generalization ability of the proposed model, a homogeneous bagging-based ensemble learning approach is also investigated, in which multiple SRMF models are trained on N subsets of the input data. The outputs of the individual models are aggregated by averaging to produce the final prediction. Experiments are conducted on two benchmark datasets: CCLE and GDSC. The proposed model is compared with state-of-the-art models using multiple evaluation metrics, including Root Mean Square Error (RMSE) and Pearson Correlation Coefficient (PCC). The proposed model is promising and outperforms existing approaches on the CCLE dataset.
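The bagging scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration: the base learner is a simplified similarity-regularized factorization fitted by plain gradient descent, each subset is a bootstrap sample of the observed drug-cell line entries, and all function names, hyperparameters, and the synthetic data are hypothetical.

```python
import numpy as np


def fit_srmf(Y, mask, S_d, S_c, k=3, lam=0.1, lam_d=0.01, lam_c=0.01,
             lr=0.01, epochs=200, seed=0):
    """Simplified similarity-regularized MF (assumed variant, gradient descent).

    Y    : drugs x cell lines response matrix (e.g. IC50 values)
    mask : same shape as Y, 1.0 where a response is observed, else 0.0
    S_d  : drug-drug similarity, S_c : cell line-cell line similarity
    """
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    U = 0.1 * rng.standard_normal((m, k))  # latent drug factors
    V = 0.1 * rng.standard_normal((n, k))  # latent cell line factors
    for _ in range(epochs):
        R = mask * (U @ V.T - Y)  # residual on observed entries only
        # Gradients of the fit term, the L2 penalty, and the similarity
        # penalties ||S_d - U U^T||^2 and ||S_c - V V^T||^2
        # (constant factors are absorbed into the weights).
        gU = R @ V + lam * U + lam_d * (U @ U.T - S_d) @ U
        gV = R.T @ U + lam * V + lam_c * (V @ V.T - S_c) @ V
        U -= lr * gU
        V -= lr * gV
    return U @ V.T


def bagged_predict(Y, mask, S_d, S_c, n_models=5, seed=0):
    """Train n_models base learners on bootstrap subsets and average them."""
    rng = np.random.default_rng(seed)
    obs = np.argwhere(mask > 0)  # (row, col) pairs of observed entries
    preds = []
    for b in range(n_models):
        # Assumed subsetting rule: resample observed entries with replacement.
        pick = obs[rng.integers(0, len(obs), size=len(obs))]
        sub = np.zeros(mask.shape)
        sub[pick[:, 0], pick[:, 1]] = 1.0
        preds.append(fit_srmf(Y, sub, S_d, S_c, seed=b))
    return np.mean(preds, axis=0)  # aggregation by averaging


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def pcc(y_true, y_pred):
    """Pearson Correlation Coefficient."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


if __name__ == "__main__":
    # Synthetic stand-in data; real use would load CCLE or GDSC responses.
    rng = np.random.default_rng(42)
    Y = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 12))
    mask = (rng.random(Y.shape) < 0.8).astype(float)
    S_d = np.corrcoef(rng.standard_normal((8, 20)))   # placeholder drug similarity
    S_c = np.corrcoef(rng.standard_normal((12, 20)))  # placeholder cell line similarity
    P = bagged_predict(Y, mask, S_d, S_c)
    held_out = mask == 0
    print("RMSE:", rmse(Y[held_out], P[held_out]))
    print("PCC:", pcc(Y[held_out], P[held_out]))
```

In this sketch, entries left out of the mask play the role of unseen drug-cell line pairs, so RMSE and PCC are computed only on those held-out entries.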