A Knowledge-Based Multiple-Sequence Alignment Algorithm

Nguyen Ken D<sup>*</sup>; Pan Yi

doi:10.1109/TCBB.2013.102

Abstract

A common and cost-effective mechanism to identify the functionalities, structures, or relationships between species is multiple-sequence alignment, in which DNA/RNA/protein sequences are arranged and aligned so that similarities between sequences are clustered together. Correctly identifying and aligning these biological sequence similarities helps in tasks ranging from unraveling the mystery of species evolution to drug design. We present our knowledge-based multiple-sequence alignment (KB-MSA) technique, which utilizes existing knowledge databases such as SWISSPROT, GENBANK, or HOMSTRAD to provide a more realistic and reliable sequence alignment. We also provide a modified version of this algorithm (CB-MSA) that utilizes sequence consistency information when sequence knowledge databases are not available. Our benchmark tests on the BAliBASE, PREFAB, HOMSTRAD, and SABMARK references show accuracy improvements of up to 10 percent on twilight data sets against many leading alignment tools such as ISPALIGN, PADT, CLUSTALW, MAFFT, PROBCONS, and T-COFFEE.

Publication date: 2013-08

