A Method of Mining Query Facets Based on Term Graph Analysis

Authors: Dou, Zhi Cheng; Jiang, Zheng Bao; Li, Jin Xiu; Zhang, Yi Chun; Wen, Ji Rong
Source: Chinese Journal of Computers, 2017, 40(3): 556-569.
DOI: 10.11897/SP.J.1016.2017.00556

Abstract

A query facet is a list of homogeneous words or phrases that describes an underlying aspect of a query. Existing algorithms use predefined patterns to extract frequent lists from the top search results of the query, then cluster these lists with unsupervised or supervised learning methods to generate the final query facets. Because these methods use only a small number of search results, the coverage of the mined facets and their items may be limited. To address this problem, we propose mining query facets with a term graph constructed from a large number of web pages. The nodes in this graph represent terms and the edges represent the similarity between terms. We first mine initial query facets from the top search results of the query, then retrieve similar terms from the term graph as candidates and extract a set of features for each candidate. Finally, we use a support vector machine to classify the candidates into a positive set and a negative set. All positive terms are used to expand the initial query facets. These steps are repeated until no more facet items are found. Experimental results show that the proposed method significantly improves the quality of the mined query facets, and in particular the coverage of facet items.
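
The iterative expansion described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy graph `TOY_GRAPH`, the function names `expand_facet` and `mean_similarity_classifier`, and the `max_rounds` cap are all hypothetical, and a simple mean-similarity threshold stands in for the feature-based support vector machine.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical term graph: term -> {neighbor: similarity}.
# Toy values for illustration only; they are not taken from the paper.
TermGraph = Dict[str, Dict[str, float]]

TOY_GRAPH: TermGraph = {
    "red": {"blue": 0.9, "green": 0.8, "apple": 0.3},
    "blue": {"red": 0.9, "green": 0.85, "sky": 0.2},
    "green": {"red": 0.8, "blue": 0.85, "yellow": 0.9},
    "yellow": {"green": 0.9},
}

Classifier = Callable[[str, Set[str], TermGraph], bool]


def mean_similarity_classifier(threshold: float) -> Classifier:
    """Stand-in for the SVM: accept a candidate whose mean similarity
    to the current facet items reaches the threshold (an assumption)."""

    def classify(candidate: str, facet: Set[str], graph: TermGraph) -> bool:
        sims = [graph.get(item, {}).get(candidate, 0.0) for item in facet]
        return sum(sims) / len(sims) >= threshold

    return classify


def expand_facet(
    seed_items: Iterable[str],
    graph: TermGraph,
    is_positive: Classifier,
    max_rounds: int = 10,
) -> Set[str]:
    """Grow an initial facet with graph neighbors until no new item is accepted."""
    facet = set(seed_items)
    for _ in range(max_rounds):
        # Candidates are neighbors of current facet items not yet in the facet.
        candidates = {
            neighbor
            for item in facet
            for neighbor in graph.get(item, {})
            if neighbor not in facet
        }
        # Keep only the candidates the classifier labels as positive.
        accepted = {c for c in candidates if is_positive(c, facet, graph)}
        if not accepted:
            break  # no more facet items found, so stop iterating
        facet |= accepted
    return facet


if __name__ == "__main__":
    result = expand_facet({"red", "blue"}, TOY_GRAPH, mean_similarity_classifier(0.25))
    print(sorted(result))
```

In a fuller version, the threshold rule would presumably be replaced by a trained classifier over per-candidate features, as the abstract describes; the loop structure and the stopping condition would stay the same.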

Full text