A novel alignment-free method to classify protein folding types by combining spectral graph clustering with Chou's pseudo amino acid composition

Tripathi Pooja; Pandey Paras N

doi:10.1016/j.jtbi.2017.04.027

Abstract

The present work uses pseudo amino acid composition (PseAAC) to encode protein sequences in numeric form. The encoded sequences are then arranged in a similarity matrix, which serves as input to a spectral graph clustering method. Spectral methods have been used previously for clustering protein sequences, but they relied on pairwise alignment scores in the similarity matrix. Because the alignment score depends on sequence length, clustering short and long sequences together may not be a good idea. This motivated combining PseAAC with the spectral clustering algorithm. We tested our method extensively and compared its performance with existing machine learning methods. We consistently observed that the number of clusters obtained for a given set of proteins is close to the number of superfamilies in that set, and that PseAAC combined with spectral graph clustering gives the best classification results.
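The encoding and similarity-matrix steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified PseAAC built from a single physicochemical property (the Kyte-Doolittle hydropathy scale, swapped in here for illustration), a weight `w = 0.05`, `lam = 3` correlation tiers, and a Gaussian kernel with a hypothetical width `sigma` to turn distances into similarities. The function names `pseaac` and `similarity_matrix` and the toy sequences are likewise assumptions. The point it shows is that every sequence maps to a vector of the same length (20 + `lam`) regardless of how long the sequence is, so short and long proteins can be compared without alignment.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumption: one property only,
# chosen for illustration; the paper's property set is not given here).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Convert a property scale to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


H = _standardize(KYTE_DOOLITTLE)


def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional simplified PseAAC vector for one sequence."""
    seq = seq.upper()
    if any(r not in H for r in seq):
        raise ValueError("sequence must contain only the 20 standard residues")
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    # First 20 components: plain amino acid frequencies.
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]
    # Next lam components: sequence-order correlation factors, one per tier k.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def similarity_matrix(vectors, sigma=0.1):
    """Gaussian-kernel similarity between every pair of feature vectors."""
    n = len(vectors)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
            sim[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return sim


# Toy sequences of different lengths (hypothetical, for illustration only).
sequences = ["MKTAYIAKQRQISFVKSHFSRQ", "GSSGAAGGSAGG", "MKVLAYIAKQRHISFVRSHFARQLEERLG"]
features = [pseaac(s) for s in sequences]
S = similarity_matrix(features)
```

The matrix `S` is the kind of input a spectral clustering routine would take in place of pairwise alignment scores; the clustering step itself is not shown here.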

Publication date: 2017-7-7
Affiliation: Shanghai Center for Bioinformation Technology

