摘要
Given a set D of d patterns, the dictionary matching problem is to index D such that for any query text T, we can locate the occurrences of any pattern within T efficiently. When D contains a total of n characters drawn from an alphabet of size a, Hon et al. (2008) [12] gave an nH(k)(D)+o(n log sigma)-bit index which supports a query in O(vertical bar T vertical bar (log(epsilon) n+log d)+occ) time, where epsilon %26gt; 0 and H-k(D) denotes the kth-order entropy of D. Very recently, Belazzougui (2010) [3] has proposed an elegant scheme, which takes n log sigma + O(n) bits of index space and supports a query in optimal O(vertical bar T vertical bar + occ) time. In this paper, we provide connections between Belazzougui%26apos;s index and the XBW compression of Ferragina and Manzini (2005) [8], and show that Belazzougui%26apos;s index can be slightly modified to be stored in nH(k)(D) O(n) bits, while query time remains optimal; this improves the compressed index by Hon et al. (2008) [12] in both space and time.
- 出版日期2013-3-4
- 单位清华大学