Abstract
The present study was designed to develop a new mathematical model for comparing DNA sequences. In place of the classical distances, a new distance based on the absolute frequency of dinucleotides in large DNA sequences is introduced. The proposed distance, which requires neither homologous sequences nor prior sequence alignment, is used to search a database for similar sequences. The method was tested on one set of 39 DNA sequences and another set of 63 DNA sequences. Sensitivity and selectivity were computed to evaluate and compare the performance of the proposed distance measure. Analysis of real data shows that it is a very efficient, highly selective, and highly sensitive comparison algorithm that can determine the relative dissimilarity within a large dataset of DNA sequences very rapidly.
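The alignment-free idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance formula, so the Euclidean distance between the 16 dinucleotide count vectors used here is an assumption, not the authors' definition. The function names `dinucleotide_counts`, `count_distance`, and `rank_database` are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides over the DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_counts(seq):
    """Return absolute counts of overlapping dinucleotides in seq."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs with ambiguous bases such as N
            counts[pair] += 1
    return [counts[d] for d in DINUCLEOTIDES]


def count_distance(seq_a, seq_b):
    """ASSUMPTION: Euclidean distance between the two count vectors.

    The paper's actual distance may be defined differently.
    """
    a = dinucleotide_counts(seq_a)
    b = dinucleotide_counts(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_database(query, database):
    """Return (name, distance) pairs sorted by increasing distance to query."""
    scored = [(name, count_distance(query, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1])
```

Calling `rank_database(query, {"name": "sequence", ...})` returns the database entries ordered from most to least similar under this assumed distance. No alignment step is involved, since each sequence is reduced to a fixed-length count vector first.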
- Publication date: 2011
- Institution: Shandong University