Abstract

Based on a classification of the amino acids, this paper presents a mathematical representation of protein sequences and derives an M matrix from that representation. From the M matrix we compute a mathematical invariant, namely a five-dimensional feature vector. Using the angle between two feature vectors as the similarity measure, we analyze the similarity of the original N protein sequences of 13 coronaviruses. We then construct a phylogenetic tree with the PHYLIP software and compare the result with that of the traditional method. The experimental results show that the mathematical model of this method is simple, has low computational complexity, and yields better results. This method of mathematical representation and similarity analysis offers a new approach to the comparison of protein sequences.
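
The angle-based comparison can be illustrated with a minimal sketch. It assumes the five-dimensional feature vectors have already been computed; the function names `angle_between` and `angle_matrix` and the sample vectors are hypothetical and are not taken from the paper. Under this reading, a smaller angle would indicate more similar sequences.

```python
import math


def angle_between(u, v):
    """Return the angle in radians between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)


def angle_matrix(vectors):
    """Return the symmetric matrix of pairwise angles for a list of vectors."""
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            theta = angle_between(vectors[i], vectors[j])
            matrix[i][j] = theta
            matrix[j][i] = theta
    return matrix


if __name__ == "__main__":
    # Made-up five-dimensional vectors, for illustration only.
    samples = [
        [1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0, 0.0],
    ]
    for row in angle_matrix(samples):
        print(["%.4f" % x for x in row])
```

A pairwise matrix of this kind is the sort of input that a distance-based tree-building program could consume, although the exact PHYLIP program and settings used are not stated here.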