Abstract

Predicting secreted protein types from sequence data alone remains a challenging problem. In this study, we extract long-range correlation information and linear correlation information from the position-specific scoring matrix (PSSM). A total of 6800 features are extracted at 17 different gaps; 309 of these are then selected by a filter feature selection method applied to the training set. To verify the performance of our method, we perform jackknife and independent dataset tests on the test set, obtaining overall accuracies of 93.60% and 100%, respectively. Comparison with the existing method shows that our method performs favorably for secreted protein type prediction.
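The feature count can be illustrated with a minimal sketch. The abstract does not give the extraction formula, so everything below is an assumption: the sketch supposes that each gap contributes one 20 × 20 block of pairwise correlations between PSSM columns, which would give 17 × 400 = 6800 features. The function name `gapped_pssm_features`, the covariance form, and the toy matrix are hypothetical and are not taken from the paper; the filter selection step that reduces 6800 features to 309 is not shown.

```python
import random


def gapped_pssm_features(pssm, max_gap=17):
    """Gapped column-pair covariances of a PSSM (illustrative assumption only).

    pssm: list of L rows, each with the same number of columns (20 for a PSSM).
    Returns a flat list of max_gap * n_cols * n_cols values.
    """
    length = len(pssm)
    n_cols = len(pssm[0])
    if length <= max_gap:
        raise ValueError("sequence must be longer than max_gap")

    # Centre every column on its mean over the whole sequence.
    means = [sum(row[a] for row in pssm) / length for a in range(n_cols)]
    centered = [[row[a] - means[a] for a in range(n_cols)] for row in pssm]

    features = []
    for gap in range(1, max_gap + 1):
        n_pairs = length - gap  # number of (i, i + gap) position pairs
        for a in range(n_cols):
            for b in range(n_cols):
                # Average product of column a at position i and
                # column b at position i + gap.
                total = sum(
                    centered[i][a] * centered[i + gap][b] for i in range(n_pairs)
                )
                features.append(total / n_pairs)
    return features


# Toy stand-in for a real PSSM: 60 residues x 20 amino-acid columns.
random.seed(0)
toy_pssm = [[float(random.randint(-5, 9)) for _ in range(20)] for _ in range(60)]
features = gapped_pssm_features(toy_pssm)
print(len(features))  # 6800
```

Under this assumption, the entries with `a == b` capture the correlation of one amino-acid column with itself at a distance, while entries with `a != b` capture correlations between different columns.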

Full text