A Bloom filter based semi-index on q-grams

作者:Grabowski Szymon*; Susik Robert; Raniszewski Marcin
来源:Software: Practice and Experience , 2017, 47(6): 799-811.
DOI:10.1002/spe.2431

摘要

We present a simple q-gram based semi-index, which allows to look for a pattern typically only in a small fraction of text blocks. Several space-time tradeoffs are presented. Experiments on Pizza & Chili datasets show that our solution is up to three orders of magnitude faster than the Claude et al. (Journal of Discrete Algorithms 2012; 11: 37) semi-index at a comparable space usage. Moreover, the construction of our data structure is fast and easily parallelizable.

  • 出版日期2017-6

全文