A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation

作者:Dong Qiwen; Zhou Shuigeng*; Guan Jihong
来源:Bioinformatics, 2009, 25(20): 2655-2662.
DOI:10.1093/bioinformatics/btp500

摘要

Motivation: Fold recognition is an important step in protein structure and function prediction. Traditional sequence comparison methods fail to identify reliable homologies with low sequence identity, while the taxonomic methods are effective alternatives, but their prediction accuracies are around 70%, which are still relatively low for practical usage. Results: In this study, a simple and powerful method is presented for taxonomic fold recognition, which combines support vector machine (SVM) with autocross-covariance (ACC) transformation. The evolutionary information represented in the form of position-specific score matrices is converted into a series of fixed-length vectors by ACC transformation and these vectors are then input to a SVM classifier for fold recognition. The sequence-order effect can be effectively captured by this scheme. Experiments are performed on the widely used D-B dataset and the corresponding extended dataset, respectively. The proposed method, called ACCFold, gets an overall accuracy of 70.1% on the D-B dataset, which is higher than major existing taxonomic methods by 2-14%. Furthermore, the method achieves an overall accuracy of 87.6% on the extended dataset, which surpasses major existing taxonomic methods by 9-17%. Additionally, our method obtains an overall accuracy of 80.9% for 86-folds and 77.2% for 199-folds. These results demonstrate that the ACCFold method provides the state-of-the-art performance for taxonomic fold recognition.