Application of support vector machines to metabolomics experiments with limited replicates

作者:Heinemann Joshua; Mazurie Aurelien; Tokmina Lukaszewska Monika; Beilman Greg J; Bothner Brian*
来源:Metabolomics, 2014, 10(6): 1121-1128.
DOI:10.1007/s11306-014-0651-0

摘要

Identifying biomarkers that are indicative of a phenotypic state is difficult because of the amount of natural variability which exists in any population. While there are many different algorithms to select biomarkers, previous investigation shows the sensitivity and flexibility of support vector machines (SVM) make them an attractive candidate. Here we evaluate the ability of support vector machine recursive feature elimination (SVM-RFE) to identify potential metabolic biomarkers in liquid chromatography mass spectrometry untargeted metabolite datasets. Two separate experiments are considered, a low variance (low biological noise) prokaryotic stress experiment, and a high variance (high biological noise) mammalian stress experiment. For each experiment, the phenotypic response to stress is metabolically characterized. SVM-based classification and metabolite ranking is undertaken using a systematically reduced number of biological replicates to evaluate the impact of sample size on biomarker reproducibility and robustness. Our results indicate the highest ranked 1 % of metabolites, the most predictive of the physiological state, were identified by SVM-RFE even when the number of training examples was small (a parts per thousand yen3) and the coefficient of variation was high (%26gt; 0.5). An accuracy analysis shows filtering with recursive feature elimination measurably improves SVM classification accuracy, an effect that is pronounced when the number of training examples is small. These results indicate that SVM-RFE can be successful at biomarker identification even in challenging scenarios where the training examples are noisy and the number of biological replicates is low.

  • 出版日期2014-12