Abstract

Alignment-free sequence comparison continues to play a crucial role in molecular sequence analysis. In this paper, we propose a novel probabilistic measure for comparing biological sequences without alignment. The measure is derived from the similarity of several k-word distributions obtained from Markov models. After presenting our method, we apply the measure to classify nine chromosomes from three species. We then use it to separate subtypes of HIV-1. In addition, our method allows us to reconstruct the phylogenetic tree of 48 HEV genome sequences. These results indicate that our method is an efficient and powerful tool for comparing whole-genome sequences.