摘要

Given a text T=T[1 ... n] and a circular pattern P=P[1 ... m], the circular pattern matching (CPM) problem is to find all occurrences of P in T. We present the first algorithm that exploits suffix links to solve the exact CPM (ECPM) problem in O(n log vertical bar Sigma vertical bar) time and O(n) space, where Sigma is the symbol alphabet. Then, we present a q-gram-based algorithm for the approximate CPM (ACPM) problem using the idea of bidirectional edit distance. Our algorithm finds all k-approximate occurrences of P in T. We then extend each algorithm to solve the all-against-all variant of the CPM problem for both exact and k-approximate matches. Although the CPM problem has been studied since the 1980s, this is the first attempt on the all-against-all variant, without using a trivial application of standard CPM algorithms. Given a database S=S-1$S-1(2)$(2) ... S-Z$(Z) of Z sequences, our algorithms solve the all-against-all ACPM problem in O(km(a)N) time on average and O(km(m)N(2)) worst case, where k is the error parameter, N=Sigmai=1Z(vertical bar S-i vertical bar+1), m(a)=N/Z and m(m)=max(i=1,2,... Z){vertical bar S-i vertical bar}. Space complexity is O(N). These can be compared with the O(N(2)m(a)log m(a)) average and O(N(3)log m(m)) worst case time required by the best available ACPM algorithm.