Abstract

Sequence alignment is a fundamental task in bioinformatics, and the alignment of remotely homologous proteins is an especially active research topic. In our previous work, we developed a new score matrix, named the transformation matrix, which can greatly improve the quality of alignments of distantly related protein sequences. Here, we use the transformation score matrix to assess the statistical significance of local sequence alignments. Compared with local alignment based on a traditional score matrix, local alignment with the transformation matrix has the following features: (i) The optimal alignment scores approximately follow a normal distribution. (ii) The distribution is closely related to N, the length of the alignment of the two sequences, rather than to the lengths of the two sequences being compared. Therefore, for a pair of aligned protein sequences, we can calculate the P-value from N and the optimal alignment score.
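
As an illustration of the last point, the following minimal Python sketch computes an upper-tail P-value under a normal distribution. It assumes that the mean and standard deviation of the optimal score for a given alignment length N are already known; the abstract does not state how they depend on N, so `mean` and `std` are hypothetical inputs supplied by the caller, and the function name is ours.

```python
import math


def normal_upper_tail_p_value(score, mean, std):
    """Return P(S >= score) for S ~ Normal(mean, std**2).

    `mean` and `std` are assumed to be the parameters of the score
    distribution for the alignment length N of interest. How they are
    obtained from N is not specified here and is left to the caller.
    """
    if std <= 0:
        raise ValueError("std must be positive")
    z = (score - mean) / std
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

A score equal to the mean gives a P-value of 0.5, and higher scores give smaller P-values, that is, more significant alignments.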

Full text