Abstract

Proteogenomic searches are useful for novel peptide identification from tandem mass spectra. Usually, separate and multistage approaches are adopted to accurately control the false discovery rate (FDR) for proteogenomic searches. Their performance on novel peptide identification has not been thoroughly evaluated, however, mainly because it is difficult to confirm the existence of identified novel peptides. We simulated a proteogenomic search using a controlled, spike-in proteomic data set. After confirming that the results of the simulated proteogenomic search were similar to those of a real proteogenomic search using a human cell line data set, we evaluated the performance of six FDR control methods (global, separate, and multistage FDR estimation, each coupled to a target-decoy search and to a mixture model-based method) on novel peptide identification. The multistage approach showed the highest accuracy for FDR estimation. However, global and separate FDR estimation with the mixture model-based method showed higher sensitivities than the others at the same true FDR. Furthermore, the mixture model-based method performed equally well when applied without decoy sequences or with a reduced set of them. Considering the different prior probabilities for novel and known protein identification, we recommend using mixture model-based methods with separate FDR estimation for sensitive and reliable identification of novel peptides from proteogenomic searches.

  • Publication date: 2017-6