Algorithms for phylogenetic footprinting

Authors: Blanchette M*; Schwikowski B; Tompa M
Source: Journal of Computational Biology, 2002, 9(2): 211-223.
DOI: 10.1089/10665270252935421

Abstract

Phylogenetic footprinting is a technique that identifies regulatory elements by finding unusually well-conserved regions in a set of orthologous noncoding DNA sequences from multiple species. We introduce a new motif-finding problem, the Substring Parsimony Problem, which is a formalization of the ideas behind phylogenetic footprinting, and we present an exact dynamic programming algorithm to solve it. We then present a number of algorithmic optimizations that allow our program to run quickly on most biologically interesting datasets. We show how to handle datasets in which only an unknown subset of the sequences contains the regulatory element. Finally, we describe how to empirically assess the statistical significance of the motifs found. Each technique is implemented and successfully identifies a number of known binding sites, as well as several highly conserved but uncharacterized regions. The program is available at http://bio.cs.washington.edu/Software.html.
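
The abstract does not spell out the dynamic program, so the following is only a minimal sketch of how a parsimony computation over substrings can be organized on a tree. It assumes Hamming distance as the substitution cost, a rooted tree given as nested tuples whose leaves are the DNA sequences, and brute-force enumeration of all 4^k candidate k-mers; it includes none of the optimizations mentioned above. The names `hamming` and `substring_parsimony` are hypothetical and are not taken from the authors' program.

```python
from itertools import product

INF = float("inf")

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def substring_parsimony(tree, k):
    """Return (best_score, root_kmer) for a tree of sequences.

    tree: a DNA string (leaf) or a tuple of subtrees (internal node).
    k:    motif length; keep it small, the sketch enumerates 4**k k-mers.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]

    def table(node):
        # table(node)[s] = cheapest cost of the subtree below `node`
        # when `node` itself is labeled with k-mer s.
        if isinstance(node, str):
            # Leaf: a label is free if it occurs as a substring, else forbidden.
            present = {node[i:i + k] for i in range(len(node) - k + 1)}
            return {s: (0 if s in present else INF) for s in kmers}
        child_tables = [table(child) for child in node]
        # Internal node: each child independently picks its cheapest label t,
        # paying the child's own cost plus the edge cost hamming(s, t).
        return {
            s: sum(min(ct[t] + hamming(s, t) for t in kmers) for ct in child_tables)
            for s in kmers
        }

    root = table(tree)
    best = min(root, key=root.get)
    return root[best], best

# Toy input (made up for illustration): three short sequences, motif length 3.
score, motif = substring_parsimony((("ACGTAC", "TTACGA"), "GGACGT"), 3)
```

In this toy input the 3-mer `ACG` occurs in all three sequences, so the minimum score is 0. A threshold on the score, as in a search for all sufficiently conserved substring sets, could be applied to the root table rather than taking only its minimum.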

  • Publication date: 2002