Minimizing the overlap problem in protein NMR: a computational framework for precision amino acid labeling

作者:Sweredoski Michael J; Donovan Kevin J; Nguyen Bao D; Shaka A J; Baldi Pierre*
来源:Bioinformatics, 2007, 23(21): 2829-2835.
DOI:10.1093/bioinformatics/btm406

摘要

Motivation: Recent advances in cell-free protein expression systems allow specific labeling of proteins with amino acids containing stable isotopes (N-15, (13) C and H-2), an important feature for protein structure determination by nuclear magnetic resonance (NMR) spectroscopy. Given this labeling ability, we present a mathematical optimization framework for designing a set of protein isotopomers, or labeling schedules, to reduce the congestion in the NMR spectra. The labeling schedules, which are derived by the optimization of a cost function, are tailored to a specific protein and NMR experiment.
Results: For 2D N-15-H-1 HSQC experiments, we can produce an exact solution using a dynamic programming algorithm in under 2 h on a standard desktop machine. Applying the method to a standard benchmark protein, calmodulin, we are able to reduce the number of overlaps in the 500 MHz HSQC spectrum from 10 to 1 using four samples with a true cost function, and 10 to 4 if the cost function is derived from statistical estimates. On a set of 448 curated proteins from the BMRB database, we are able to reduce the relative percent congestion by 84.9 in their HSQC spectra using only four samples. Our method can be applied in a high-throughput manner on a proteomic scale using the server we developed. On a 100-node cluster, optimal schedules can be computed for every protein coded for in the human genome in less than a month.

  • 出版日期2007-11-1