A Novel Algorithm for Validating Peptide Identification from a Shotgun Proteomics Search Engine

Authors: Jian Ling; Niu Xinnan; Xia Zhonghang; Samir Parimal; Sumanasekera Chiranthani; Mu Zheng; Jennings Jennifer L; Hoek Kristen L; Allos Tara; Howard Leigh M; Edwards Kathryn M; Weil P Anthony; Link Andrew J*
Source: Journal of Proteome Research, 2013, 12(3): 1108-1119.
DOI: 10.1021/pr300631t

Abstract

Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has revolutionized the proteomic analysis of complexes, cells, and tissues. In a typical proteomic analysis, the tandem mass spectra from an LC-MS/MS experiment are assigned to peptides by a search engine that compares the experimental MS/MS peptide data to theoretical peptide sequences in a protein database. The peptide-spectrum matches are then used to infer a list of identified proteins in the original sample. However, search engines often fail to distinguish between correct and incorrect peptide assignments. In this study, we designed and implemented a novel algorithm called De-Noise to reduce the number of incorrect peptide matches and maximize the number of correct peptides at a fixed false discovery rate using a minimal number of scoring outputs from the SEQUEST search engine. The novel algorithm uses a three-step process: data cleaning, data refining through an SVM-based decision function, and a final data refining step based on proteolytic peptide patterns. Using proteomics data generated on different types of mass spectrometers, we optimized the De-Noise algorithm on the basis of the resolution and mass accuracy of the mass spectrometer employed in the LC-MS/MS experiment. Our results demonstrate that De-Noise improves peptide identification compared to other methods used to process the peptide sequence matches assigned by SEQUEST. Because De-Noise uses a limited number of scoring attributes, it can be easily implemented with other search engines.
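
To make "maximize the number of correct peptides at a fixed false discovery rate" concrete, the following is a minimal sketch of choosing a score cutoff at a fixed FDR. It assumes a target-decoy estimate (FDR ≈ decoys / targets at or above the cutoff), which the abstract does not specify. The function name `threshold_at_fdr`, the `(score, is_decoy)` input layout, and the toy scores are hypothetical illustrations and are not part of De-Noise or SEQUEST.

```python
def threshold_at_fdr(psms, max_fdr):
    """Pick the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms: list of (score, is_decoy) tuples; a higher score means a better match.
    Assumption: FDR is estimated as decoys / targets among matches scoring
    at or above the cutoff (target-decoy approach), and scores are distinct.
    Returns (cutoff, n_targets), or (None, 0) if no cutoff satisfies the bound.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best_cutoff, best_targets = None, 0
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the deepest cutoff that still meets the FDR bound.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff, best_targets = score, targets
    return best_cutoff, best_targets


# Toy data (made up): eight matches, two of them decoy hits.
toy_psms = [
    (5.1, False), (4.8, False), (4.2, False), (3.9, True),
    (3.5, False), (3.1, False), (2.7, True), (2.2, False),
]
cutoff, accepted = threshold_at_fdr(toy_psms, max_fdr=0.25)
print(cutoff, accepted)
```

A validation method that separates correct from incorrect matches more cleanly moves decoy hits further down the ranking, so more target matches are accepted before the FDR bound is reached.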