Abstract

Superposition of atoms in an ensemble of biomolecules is a common task in many areas of structural biology. Although several automated tools based on previously established methods exist, manual definition of the atoms in the ordered regions is usually preferred. This task is difficult and inefficient for multi-core proteins with complicated folding topologies. The new method presented here systematically and quantitatively identifies ordered cores, even for molecules containing multiple cores linked by flexible loops. In contrast to established methods, this method treats the variance of inter-atomic distances in an ensemble as information content using a non-linear (NL) function, and then subjects it to multi-dimensional scaling (MDS) to embed the row vectors of the inter-atomic distance variance matrix in a lower-dimensional space. Plots of the identified atom groups in a one- or two-dimensional map enable users to infer well-ordered atoms in an ensemble visually and intuitively, as well as to identify them automatically with standard clustering methods. The performance of the NL-MDS method has been examined for a number of structure ensembles studied by nuclear magnetic resonance, demonstrating that the method can be more suitable for structural analysis of multi-core proteins than previously established methods.
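The pipeline outlined in the abstract (distance variance matrix, non-linear transform, MDS embedding of the row vectors) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the NL function, so `np.log1p` is used here purely as a placeholder, classical (eigendecomposition-based) MDS stands in for whichever MDS variant the paper uses, and the function names `distance_variance_matrix`, `nonlinear_transform`, and `classical_mds` are hypothetical.

```python
import numpy as np


def distance_variance_matrix(coords):
    """Variance of each inter-atomic distance across an ensemble.

    coords: array of shape (n_models, n_atoms, 3).
    Returns an (n_atoms, n_atoms) symmetric matrix.
    """
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # (n_models, n_atoms, n_atoms)
    return dist.var(axis=0)


def nonlinear_transform(variance):
    """Placeholder NL function (an assumption; the paper's function may differ)."""
    return np.log1p(variance)


def classical_mds(rows, n_components=2):
    """Embed row vectors in n_components dimensions with classical MDS."""
    # Squared Euclidean distances between row vectors.
    sq = np.sum(rows ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * rows @ rows.T, 0.0)
    # Double centering turns squared distances into a Gram matrix.
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # Keep the leading eigenpairs; clip tiny negative eigenvalues to zero.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


if __name__ == "__main__":
    # Synthetic ensemble: 6 rigid "core" atoms plus 4 atoms that move freely.
    rng = np.random.default_rng(0)
    core = rng.normal(size=(1, 6, 3)).repeat(8, axis=0)
    loop = rng.normal(scale=5.0, size=(8, 4, 3))
    ensemble = np.concatenate([core, loop], axis=1)

    variance = distance_variance_matrix(ensemble)
    embedding = classical_mds(nonlinear_transform(variance), n_components=2)
    print(embedding.shape)
```

The resulting two-dimensional coordinates could then be plotted or passed to a standard clustering routine to separate candidate ordered cores from flexible regions, in the spirit of the approach described above.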

  • Publication date: 2014-1