Abstract

Single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs) play different roles in biological processes when they bind to single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA). However, the binding mechanisms of SSBs and DSBs are not yet fully understood. Here, the authors first constructed two groups of ssDNA-specific and dsDNA-specific binding sites from two non-redundant sets of SSBs and DSBs. They then analysed the relationship between the two classes of binding sites and a newly proposed set of features (residue charge distribution, secondary structure and spatial shape). To assess and exploit the predictive power of these features, they trained a support vector machine classifier to distinguish ssDNA binding sites from dsDNA binding sites. The authors' analysis and prediction results indicate that the two classes of binding sites can be distinguished by the three types of features, and that the final classifier using all the features achieved satisfactory performance. In conclusion, the proposed features deepen the understanding of the specificity of proteins that bind to ssDNA or dsDNA.