Iterative Genome Correction Largely Improves Proteomic Analysis of Nonmodel Organisms

Authors: Wu, Xiaohui; Xu, Lina; Gu, Wei; Xu, Qian; He, Qing-Yu*; Sun, Xuesong; Zhang, Gong
Source: Journal of Proteome Research, 2014, 13(6): 2724-2734.
DOI: 10.1021/pr500369b

Abstract

The current application and development of proteomic studies typically depend on the availability of sequenced genomes. Protein identification based on the detected peptides with liquid chromatography tandem mass spectrometry is limited by the absence of sequenced genomes in many nonmodel organisms. In this study, we demonstrated a new strategy based on our stable, accurate, and error-tolerant FANSe (Fast and Accurate mapping tool for Nucleotide Sequencing datasets) mapping algorithm to correct genome sequences in an iterative manner. To evaluate the efficiency of the corrected genome databases in proteomic study, MS/MS spectra of whole proteome extracted from a Bacillus pumilus strain without complete genome sequence were searched against the protein sequence databases derived from the complete reference genome sequence of a homologous bacterium and from the corrected genome sequence. The results indicated that the corrected protein sequence database could significantly facilitate peptide/protein identification. Importantly, this strategy can help to detect novel peptide variants. This strategy of genome correction will promote the development of functional proteomics in nonmodel organisms.