Comparative studies for developing protein based cancer prediction model to maximise the ROC-AUC with various variable selection methods

Authors: Kim Yongkang; Kwon Min Seok; Choi Yonghwan; Yi Sung Gon; Namkung Junghyun; Han Sangjo; Kwon Wooil; Kim Sun Whe; Jang Jin Young; Kim Hyunsoo; Kim Youngsoo; Lee Seungyeoun; Park Taesung*
Source: International Journal of Data Mining and Bioinformatics, 2016, 16(1): 64-76.
DOI:10.1504/IJDMB.2016.10000565

Abstract

More accurate quantification experiments, such as multiple reaction monitoring (MRM), are ushering in an era of protein data analysis. Protein data are easier to obtain than other genetic variant or gene expression data, which makes them better suited to the early diagnosis of cancer. Because each patient has a unique protein pattern, researchers must select effective markers in order to build a prediction model that performs consistently across patients. This research focuses on finding the most effective variable selection method for the early diagnosis of pancreatic cancer. We compare classical selection methods (stepwise selection based on AIC and BIC), a machine-learning-based selection method (support vector machine recursive feature elimination; SVM-RFE), and a stepwise selection method that uses the area under the receiver operating characteristic curve (Step-AUC). Based on simulation and real data analysis, we suggest the Step-AUC method to maximise the prediction performance of early diagnosis from protein data.
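The abstract does not spell out the Step-AUC procedure, so the following is only a minimal sketch of one plausible reading: forward stepwise selection that, at each step, adds the variable giving the largest gain in ROC-AUC and stops when the gain becomes small. Everything in it is an assumption made for illustration, including the function names (`auc`, `fit_logistic`, `step_auc`, `make_demo_data`), the logistic model fitted by plain gradient descent, the `min_gain` stopping threshold, and the synthetic data. AUC is computed on the training data for brevity; a cross-validated AUC would be the more defensible criterion in practice.

```python
import math
import random


def auc(scores, labels):
    """ROC-AUC: chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, steps=300, lr=0.1):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def model_auc(X, y, cols):
    """In-sample AUC of a logistic model restricted to the columns in `cols`."""
    Xs = [[row[c] for c in cols] for row in X]
    w, b = fit_logistic(Xs, y)
    scores = [b + sum(wj * xj for wj, xj in zip(w, xi)) for xi in Xs]
    return auc(scores, y)


def step_auc(X, y, min_gain=0.01):
    """Forward selection: add the variable with the largest AUC gain until gain < min_gain."""
    selected, best = [], 0.5  # 0.5 is the AUC of a model with no information
    remaining = list(range(len(X[0])))
    while remaining:
        top_auc, top_col = max((model_auc(X, y, selected + [c]), c) for c in remaining)
        if top_auc - best < min_gain:
            break
        selected.append(top_col)
        remaining.remove(top_col)
        best = top_auc
    return selected, best


def make_demo_data(n=120, seed=0):
    """Synthetic stand-in for marker data: column 0 is informative, columns 1-2 are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[2.0 * yi + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for yi in y]
    return X, y


X, y = make_demo_data()
selected, best = step_auc(X, y)
print("selected columns:", selected, "in-sample AUC:", round(best, 3))
```

Unlike AIC- or BIC-based stepwise selection, which scores candidate models by penalised likelihood, this criterion scores them directly on the quantity the prediction model is meant to maximise.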

  • Publication date: 2016
