Abstract

Advances in glycan array technology have made it possible to characterize the binding specificities of glycan-binding proteins automatically and systematically. However, robust methods for such analyses are still lacking. In this study, we developed a novel quantitative structure-activity relationship (QSAR) method to analyze glycan array data. We first decomposed glycan chains into mono-, di-, tri- or tetrasaccharide subtrees. Bond information was incorporated into the subtrees to help distinguish glycan chain structures. We then performed partial least-squares (PLS) regression on glycan array data using the subtrees as features. Applying the QSAR method to glycan array data for different glycan-binding proteins demonstrated that PLS regression using subtree features can obtain higher R² values and a higher percentage of variance explained in glycan array intensities. Based on the PLS regression coefficients, we were able to effectively identify subtrees that indicate the binding specificities of a glycan-binding protein. Our approach will facilitate glycan-binding specificity analysis using glycan arrays. A user-friendly web tool implementing the QSAR method is available at http://bci.clemson.edu/tools/glycan_array.
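The two steps described above (subtree decomposition with bond labels, then PLS regression on the subtree features) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the string encoding of subtrees, the function names, and the single-component PLS1 fit are all assumptions made for illustration, and the three glycans and their intensities are invented toy values, not data from the study.

```python
from collections import Counter
from itertools import product


class Node:
    """One monosaccharide; `bond` is its linkage to the parent (None at the root)."""
    def __init__(self, sugar, bond=None, children=()):
        self.sugar, self.bond, self.children = sugar, bond, list(children)


def rooted_subtrees(node, max_size):
    """Return (size, encoding) for each connected subtree whose top residue is `node`."""
    if max_size < 1:
        return []
    per_child = []
    for child in node.children:
        # Each child is either skipped or contributes one subtree rooted at it.
        opts = [(0, None)]
        for size, enc in rooted_subtrees(child, max_size - 1):
            opts.append((size, f"({child.bond}){enc}"))
        per_child.append(opts)
    results = []
    for combo in product(*per_child):
        size = 1 + sum(s for s, _ in combo)
        if size <= max_size:
            # Sorting the branches makes the encoding independent of child order.
            branches = sorted(e for _, e in combo if e is not None)
            results.append((size, node.sugar + "".join(f"[{b}]" for b in branches)))
    return results


def subtree_features(root, max_size=4):
    """Count every mono- to tetrasaccharide subtree (bond-labelled) in a glycan."""
    counts, stack = Counter(), [root]
    while stack:
        node = stack.pop()
        counts.update(enc for _, enc in rooted_subtrees(node, max_size))
        stack.extend(node.children)
    return counts


def pls1_one_component(X, y):
    """Regression coefficients of a one-component PLS1 fit on mean-centred data.

    Assumes y is not constant and at least one feature covaries with y.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]  # X^T y
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]                                        # weight vector
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]   # scores
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)    # y loading
    return [q * wj for wj in w]


# Toy glycans, written from the reducing end (hypothetical examples).
glycans = [
    Node("Glc", children=[Node("Gal", "b1-4", [Node("Neu5Ac", "a2-3")])]),
    Node("Glc", children=[Node("Gal", "b1-4")]),
    Node("Glc", children=[Node("Gal", "b1-4", [Node("Neu5Ac", "a2-6")])]),
]
intensities = [900.0, 100.0, 120.0]  # invented array intensities

feature_counts = [subtree_features(g) for g in glycans]
feature_names = sorted(set().union(*feature_counts))
X = [[fc[name] for name in feature_names] for fc in feature_counts]
coef = dict(zip(feature_names, pls1_one_component(X, intensities)))

# Subtrees with the largest positive coefficients are candidate binding motifs.
for name, value in sorted(coef.items(), key=lambda kv: -kv[1]):
    print(f"{value:10.2f}  {name}")
```

In this sketch the bond label is what separates `Gal[(a2-3)Neu5Ac]` from `Gal[(a2-6)Neu5Ac]`, so the two sialylated glycans yield different features even though their monosaccharide composition is identical. A practical analysis would fit several PLS components, typically with an established library rather than this hand-rolled single-component version.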

  • Publication date: 2012-4