A Clique-Based Method Using Dynamic Programming for Computing Edit Distance Between Unordered Trees

作者:Mori Tomoya; Tamura Takeyuki; Fukagawa Daiji; Takasu Atsuhiro; Tomita Etsuji; Akutsu Tatsuya*
来源:Journal of Computational Biology, 2012, 19(10): 1089-1104.
DOI:10.1089/cmb.2012.0133

摘要

Many kinds of tree-structured data, such as RNA secondary structures, have become available due to the progress of techniques in the field of molecular biology. To analyze the tree-structured data, various measures for computing the similarity between them have been developed and applied. Among them, tree edit distance is one of the most widely used measures. However, the tree edit distance problem for unordered trees is NP-hard. Therefore, it is required to develop efficient algorithms for the problem. Recently, a practical method called clique-based algorithm has been proposed, but it is not fast for large trees.
This article presents an improved clique-based method for the tree edit distance problem for unordered trees. The improved method is obtained by introducing a dynamic programming scheme and heuristic techniques to the previous clique-based method. To evaluate the efficiency of the improved method, we applied the method to comparison of real tree structured data such as glycan structures. For large tree-structures, the improved method is much faster than the previous method. In particular, for hard instances, the improved method achieved more than 100 times speed-up.

  • 出版日期2012-10

全文