Abstract

A deep understanding of protein structure benefits from the use of a variety of classification strategies that enhance our ability to effectively describe local patterns of conformation. Here, we use a clustering algorithm to analyze 76,533 all-trans segments from protein structures solved at 1.2 angstrom resolution or better to create a purely phi,psi-based comprehensive empirical categorization of common conformations adopted by two adjacent phi,psi pairs (i.e., (phi,psi)(2) motifs). The clustering algorithm works in an origin-shifted four-dimensional space based on the two phi,psi pairs to yield a parameter-dependent list of (phi,psi)(2) motifs, in order of their prominence. The results are remarkably distinct from and complementary to the standard hydrogen-bond-centered view of secondary structure. New insights include an unprecedented level of precision in describing the phi,psi angles of both previously known and novel motifs, ordering of these motifs by their population density, a data-driven recommendation that the standard C-alpha(i)...C-alpha(i+3) < 7 angstrom criterion for defining turns be changed to 6.5 angstrom, identification of beta-strand and turn capping motifs, and identification of conformational capping by residues in polypeptide II conformation. We further document that the conformational preferences of a residue are substantially influenced by the conformation of its neighbors, and we suggest that accounting for these dependencies will improve protein modeling accuracy. Although the CUEVAS-4D(r(10)epsilon(14)) 'parts list' presented here is only an initial exploration of the complex (phi,psi)(2) landscape of proteins, it shows that there is value to be had from this approach, and it opens the door to more in-depth characterizations at the (phi,psi)(2) level and at higher dimensions.
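
As a rough illustration of two quantities the abstract mentions, the following Python sketch computes a distance between two (phi,psi)(2) motifs in four-dimensional angle space and applies the C-alpha(i)...C-alpha(i+3) distance test for turns. It is not the clustering algorithm from the paper, whose details the abstract does not give. The function names, the choice of degrees, and the use of wrapped angle differences (a common way to handle the 360-degree periodicity, used here in place of the paper's origin shift) are all assumptions made for the example.

```python
import math


def wrapped_diff(a, b):
    """Smallest signed difference a - b between two angles in degrees.

    The result lies in [-180, 180). Assumption: angles are in degrees.
    """
    return (a - b + 180.0) % 360.0 - 180.0


def motif_distance(m1, m2):
    """Euclidean distance between two (phi_i, psi_i, phi_i+1, psi_i+1) tuples.

    Each of the four angle differences is wrapped first, so that points on
    either side of the +/-180 degree boundary count as close. This wrapping
    is an illustrative stand-in, not the paper's origin-shifted space.
    """
    return math.sqrt(sum(wrapped_diff(a, b) ** 2 for a, b in zip(m1, m2)))


def is_turn(ca_i, ca_i3, cutoff=6.5):
    """Distance test for a turn between residues i and i+3.

    ca_i and ca_i3 are (x, y, z) C-alpha coordinates in angstroms. The
    default cutoff is the 6.5 angstrom value recommended in the abstract;
    pass cutoff=7.0 for the conventional criterion.
    """
    return math.dist(ca_i, ca_i3) < cutoff
```

For example, a pair of C-alpha atoms 6.8 angstroms apart counts as a turn under the conventional 7 angstrom cutoff but not under the recommended 6.5 angstrom cutoff.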

  • Publication date: 2012-2-10