New powerful statistics for alignment-free sequence comparison under a pattern transfer model

Authors: Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu*
Source: Journal of Theoretical Biology, 2011, 284(1): 106-116.
DOI: 10.1016/j.jtbi.2011.06.020

Abstract

Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic, D2, and its variants D2* and D2S showed that, under a pattern transfer model, their power approaches a limit smaller than 1 as the sequence length tends to infinity. We develop new alignment-free statistics based on D2, D2* and D2S by comparing local sequence pairs and then summing over all the local sequence pairs of a certain length. We show that the new statistics are much more powerful than the corresponding original statistics and that their power tends to 1 as the sequence length tends to infinity under the pattern transfer model.
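
To make the idea of "comparing local sequence pairs and then summing" concrete, the following is a minimal Python sketch. It is not the authors' implementation. It uses the standard definition of D2 as the sum, over all k-mers, of the product of that k-mer's counts in the two sequences. The local version is an assumption made for illustration: the function names (`kmer_counts`, `d2`, `windows`, `local_d2_sum`), the window length `m`, and the default non-overlapping `step` are all hypothetical choices, and the abstract does not specify how the paper forms its local pairs. The centered variants D2* and D2S are omitted because they require a background model that the abstract does not describe.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(a, b, k):
    """D2: sum over k-mers w of (count of w in a) * (count of w in b)."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers absent from b.
    return sum(n * cb[w] for w, n in ca.items())


def windows(seq, m, step):
    """Split seq into windows of length m, starting every `step` positions."""
    return [seq[i:i + m] for i in range(0, len(seq) - m + 1, step)]


def local_d2_sum(a, b, k, m, step=None):
    """Sum D2 over all pairs of length-m windows (illustrative assumption).

    With step=None the windows do not overlap; this choice is hypothetical
    and may differ from the construction used in the paper.
    """
    step = m if step is None else step
    return sum(
        d2(u, v, k)
        for u in windows(a, m, step)
        for v in windows(b, m, step)
    )


if __name__ == "__main__":
    x = "ACGTACGT"
    y = "ACGTACGT"
    print(d2(x, y, k=2))                 # global D2 on the full sequences
    print(local_d2_sum(x, y, k=2, m=4))  # sum of D2 over local window pairs
```

The global statistic pools k-mer counts over the whole sequence, whereas the local sum scores each window pair separately before adding the scores. This sketch only illustrates that structural difference and makes no claim about statistical power.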