Protein and gene model inference based on statistical modeling in k-partite graphs

Authors: Gerster Sarah*; Qeli Ermir; Ahrens Christian H; Buehlmann Peter
Source: Proceedings of the National Academy of Sciences of the United States of America, 2010, 107(27): 12101-12106.
DOI: 10.1073/pnas.0907654107

Abstract

One of the major goals of proteomics is the comprehensive and accurate description of a proteome. Shotgun proteomics, the method of choice for the analysis of complex protein mixtures, requires that experimentally observed peptides be mapped back to the proteins from which they were derived. This process is known as protein inference. We present Markovian Inference of Proteins and Gene Models (MIPGEM), a statistical model based on clearly stated assumptions that addresses the problem of protein and gene model inference for shotgun proteomics data. In particular, we handle dependencies among peptides and proteins using a Markovian assumption on k-partite graphs. We also address the problems of shared peptides and ambiguous proteins by scoring the encoding gene models. Empirical results on two control data sets with synthetic mixtures of proteins and on complex protein samples of Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana suggest that the results with MIPGEM are competitive with existing tools for protein inference.

  • Publication date: 2010-7-6