摘要

Access methods are a fundamental tool on Information Retrieval. However, most of these methods suffer the problem known as the curse of dimensionality when they are applied to objects with very high dimensionality representation spaces, such as text documents. In this paper we introduce a new parallel access method that uses several graphs as distributed index structure and a kNN search algorithm. Two parallel versions of the search method are presented, one based on master-slave scheme and the other based on a pipeline. A thorough experimental analysis on different datasets shows that our method can process efficiently large flows of queries, compete with other parallel algorithms and obtain at the same time very high quality results.

  • 出版日期2017-1

全文