Abstract

Determining the three-dimensional structure of a protein is a challenging task in biological science. Classifying a protein into one of its folds is an intermediate step toward deciphering that structure. Protein fold recognition can be carried out by first applying a feature extraction technique that captures the relevant information in a protein sequence and then employing a suitable classifier to label an unknown protein. Several feature extraction techniques have been developed in the past, but they have achieved only limited recognition accuracy. In this work, we develop a feature extraction technique based on bi-grams computed directly from Position Specific Scoring Matrices (PSSMs) and demonstrate its effectiveness on a benchmark dataset. The proposed technique exhibits an absolute improvement of around 10% in recognition accuracy compared with existing feature extraction techniques.
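As a rough illustration of what computing bi-grams directly from a PSSM could look like, the sketch below accumulates, for every pair of columns (m, n), the product of the score in column m at one sequence position and the score in column n at the next position. The abstract does not give the exact formula, so this definition, the function name `pssm_bigram_features`, and the assumption that the PSSM rows are already normalized are all illustrative choices rather than the authors' stated method.

```python
from typing import List


def pssm_bigram_features(pssm: List[List[float]]) -> List[float]:
    """Return a flattened bi-gram matrix B computed from a PSSM.

    Assumed definition (not taken from the abstract):
        B[m][n] = sum over positions i of P[i][m] * P[i + 1][n]
    where P has one row per sequence position and one column per
    amino acid type. A 20-column PSSM yields 20 * 20 = 400 features.
    """
    if len(pssm) < 2:
        raise ValueError("PSSM needs at least two rows to form a bi-gram")
    width = len(pssm[0])
    if any(len(row) != width for row in pssm):
        raise ValueError("all PSSM rows must have the same number of columns")

    bigram = [[0.0] * width for _ in range(width)]
    # Walk over consecutive position pairs (i, i + 1).
    for current, following in zip(pssm, pssm[1:]):
        for m, score_m in enumerate(current):
            for n, score_n in enumerate(following):
                bigram[m][n] += score_m * score_n

    # Flatten row by row so the result can be fed to a classifier.
    return [value for row in bigram for value in row]


# Toy 3-position, 2-column matrix (a real PSSM has 20 columns).
toy_pssm = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
features = pssm_bigram_features(toy_pssm)
```

Under this definition the feature vector has a fixed length regardless of sequence length, which is what lets proteins of different sizes be passed to the same classifier.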

  • Publication date: 2013-3-7
