A comprehensive analysis about the influence of low-level preprocessing techniques on mass spectrometry data for sample classification

Authors: Lopez Fernandez Hugo*; Reboiro Jato Miguel; Glez Pena Daniel; Fernandez Riverola Florentino
Source: International Journal of Data Mining and Bioinformatics, 2014, 10(4): 455-473.
DOI: 10.1504/IJDMB.2014.064897

Abstract

Matrix-Assisted Laser Desorption Ionisation Time-of-Flight (MALDI-TOF) is a high-throughput mass spectrometry technology whose data require extensive preprocessing before any subsequent analysis. In this context, several low-level preprocessing techniques have been successfully developed for different tasks, including baseline correction, smoothing, normalisation, peak detection and peak alignment. In this work, we present a systematic comparison of different software packages that support the required preprocessing of MALDI-TOF data. To guarantee the validity of our study, we test multiple configurations of each preprocessing technique; the preprocessed data are then used to train a set of classifiers, whose performance (kappa and accuracy) provides the basis for the final comparison. The experimental results show the real impact of preprocessing techniques on classification: MassSpecWavelet provides the best performance, and Support Vector Machines (SVM) are among the most accurate classifiers.
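
The two performance measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of accuracy and Cohen's kappa; the function names and the sample labels are hypothetical and are not taken from the paper, which may compute these measures with other tools.

```python
from collections import Counter
import math


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: for each class, the product of its marginal
    # frequencies among the true and the predicted labels.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        # Only one class appears in both lists; kappa is undefined.
        return math.nan
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels for four spectra (illustrative only).
y_true = ["healthy", "healthy", "disease", "disease"]
y_pred = ["healthy", "disease", "disease", "disease"]

print(accuracy(y_true, y_pred))     # 0.75
print(cohen_kappa(y_true, y_pred))  # 0.5
```

Kappa is lower than accuracy here because half of the observed agreement would be expected by chance given the class frequencies, which is one reason to report both measures when comparing classifiers.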

  • Publication date: 2014