Application of amino acid distribution along the sequence for discriminating mesophilic and thermophilic proteins

Zhang GY; Fang BS

doi:10.1016/j.procbio.2006.03.026

Abstract

In this work, we systematically analyzed the distribution of pairs of neighboring amino acids (dipeptides) in the sequences of thermophilic and mesophilic proteins. We observed that the occurrences of EE, KK, RR, PP, KI, VV, VE, KE and VK were significantly higher in thermophilic proteins, while the occurrences of QQ, AA, EQ, LL, QA, QL, NN, KQ, QG, RQ, QT and AQ were significantly lower. We studied the mechanism of thermostability, and our analysis suggests that dipeptide composition contains more information than amino acid composition. Based on dipeptide composition, we developed a statistical method for discriminating thermophilic from mesophilic proteins. The accuracy of the method on the training dataset was 86.3%, and its accuracy on two independent testing datasets was 85.5% and 89.7%, respectively. The influence of some specific dipeptides on prediction accuracy is also discussed.
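To make the notion of dipeptide composition concrete, the following is a minimal Python sketch of how the relative frequency of each neighboring amino acid pair could be computed for one sequence. It is an illustration only, not the authors' implementation: the function name `dipeptide_composition`, the use of overlapping pairs, the decision to skip pairs containing non-standard residues, and the toy sequence are all assumptions introduced here. The abstract does not specify these details, nor the statistical method built on top of the counts.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence):
    """Return the relative frequency of each of the 400 possible dipeptides.

    Assumption (not from the paper): pairs are counted with overlap, so a
    sequence of length n contributes n - 1 pairs, and any pair containing a
    non-standard residue is skipped.
    """
    seq = sequence.upper()
    # Every pair of adjacent residues, e.g. "EEK" -> ["EE", "EK"].
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(
        p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS
    )
    total = sum(counts.values())
    # Report all 400 dipeptides so every sequence yields a vector of the
    # same length, with 0.0 for pairs that never occur.
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


# Hypothetical toy sequence, chosen only to illustrate the calculation.
composition = dipeptide_composition("EEKKEE")
print(composition["EE"], composition["KK"], composition["QQ"])
```

Returning a fixed-length vector of 400 values is a design choice made here so that sequences of different lengths can be compared directly, which is the kind of input a discrimination method based on dipeptide composition would need.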

Publication date: 2006-08
Affiliation: Huaqiao University
