A scoring model for phosphopeptide site localization and its impact on the question of whether to use MSA

Authors: Fischer Juliana de S da G; dos Santos Marlon D M; Marchini Fabricio K; Barbosa Valmir C; Carvalho Paulo C*; Zanchin Nilson I T
Source: Journal of Proteomics, 2015, 129: 42-50.
DOI: 10.1016/j.jprot.2015.01.008

Abstract

The production of structurally significant product ions during the dissociation of phosphopeptides is key to the successful determination of phosphorylation sites. These diagnostic ions can be generated using the widely adopted MS/MS approach, MS3 (data-dependent neutral loss, DDNL), or multistage activation (MSA). The main purpose of this work is to introduce a false-localization rate (FLR) probabilistic model to enable unbiased phosphoproteomics studies. Briefly, our algorithm infers a probabilistic function from the distribution of the identified phosphopeptides' XCorr Delta scores (XD-Scores) in the current experiment. Our module infers p-values by relying on Gaussian mixture models and a logistic function. We demonstrate the usefulness of our probabilistic model by revisiting the "to MSA, or not to MSA" dilemma. For this, we used human leukemia-derived cells (K562) as a study model and enriched for phosphopeptides using hydroxyapatite (HAP) chromatography. The aliquots were analyzed with and without MSA on an Orbitrap-XL. Our XD-Scoring analysis revealed that the MS/MS approach provides more identifications because of its faster scan rate, but that at the same scan rate MSA yields higher-confidence spectra. Our software is integrated into PatternLab for proteomics, which is freely available to the academic community at http://www.patternlabforproteomics.org.

Biological significance: Assigning statistical confidence to phosphorylation sites is necessary for proper phosphoproteomic assessment. Here we present a rigorous statistical model, based on Gaussian mixture models and a logistic function, which overcomes shortcomings of previous tools. The algorithm described herein is made readily available to the scientific community by integrating it into the widely adopted PatternLab for proteomics.

This article is part of a Special Issue entitled: Computational Proteomics.
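The abstract does not describe how the Gaussian mixture and the logistic function are combined, so the following is only an illustrative sketch and not the authors' implementation. It assumes a two-component, one-dimensional mixture with a shared variance, fitted by expectation-maximization to localization scores; under that assumption the posterior probability of belonging to the higher-scoring component is exactly a logistic function of the score. The function names, the synthetic scores, and the way the FLR is estimated (mean posterior error over accepted identifications) are all hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(scores, iterations=200):
    """EM fit of a two-component 1D mixture with a shared variance.

    Returns (mu_lo, mu_hi, sigma, w_hi), where w_hi is the weight of the
    higher-scoring component (assumed here to hold correct localizations).
    """
    xs = sorted(scores)
    n = len(xs)
    # Start the two means at the lower and upper quartiles.
    mu_lo, mu_hi = xs[n // 4], xs[(3 * n) // 4]
    mean = sum(xs) / n
    sigma = max(math.sqrt(sum((x - mean) ** 2 for x in xs) / n), 1e-6)
    w_hi = 0.5
    for _ in range(iterations):
        # E-step: responsibility of the high component for each score.
        resp = []
        for x in xs:
            p_hi = w_hi * normal_pdf(x, mu_hi, sigma)
            p_lo = (1.0 - w_hi) * normal_pdf(x, mu_lo, sigma)
            total = p_hi + p_lo
            resp.append(p_hi / total if total > 0.0 else 0.5)
        # M-step: update weight, means, and the shared variance.
        n_hi = min(max(sum(resp), 1e-9), n - 1e-9)
        n_lo = n - n_hi
        mu_hi = sum(r * x for r, x in zip(resp, xs)) / n_hi
        mu_lo = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n_lo
        var = sum(
            r * (x - mu_hi) ** 2 + (1.0 - r) * (x - mu_lo) ** 2
            for r, x in zip(resp, xs)
        ) / n
        sigma = max(math.sqrt(var), 1e-6)
        w_hi = n_hi / n
    return mu_lo, mu_hi, sigma, w_hi


def logistic(t):
    """Numerically stable logistic function."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def posterior_correct(score, params):
    """P(high component | score), written as logistic(a * score + b)."""
    mu_lo, mu_hi, sigma, w_hi = params
    a = (mu_hi - mu_lo) / sigma ** 2
    b = (mu_lo ** 2 - mu_hi ** 2) / (2.0 * sigma ** 2) + math.log(w_hi / (1.0 - w_hi))
    return logistic(a * score + b)


def estimated_flr(scores, threshold, params):
    """Mean posterior error among scores accepted at the given threshold."""
    accepted = [s for s in scores if s >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - posterior_correct(s, params) for s in accepted) / len(accepted)


# Synthetic scores only: two made-up populations, not real XD-Scores.
random.seed(0)
demo_scores = [random.gauss(1.0, 0.5) for _ in range(300)]
demo_scores += [random.gauss(4.0, 0.5) for _ in range(300)]
demo_params = fit_two_gaussians(demo_scores)
print("fitted (mu_lo, mu_hi, sigma, w_hi):", demo_params)
print("estimated FLR at threshold 3.0:", estimated_flr(demo_scores, 3.0, demo_params))
```

Sharing one variance between the two components is what makes the posterior reduce to a single logistic curve; with unequal variances the log-odds would be quadratic in the score, and a separate logistic fit would be one way to smooth it.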

  • Publication date: 2015-11-3