Abstract

Searching for similar documents is an essential task in document management. Most previous research on similar-document search focuses on classifying documents by their contents or on improving the performance of existing algorithms. This paper proposes a multiple-concept mechanism to address the document similarity problem. It also considers the distribution of contents in conjunction with multiple concepts to improve the quality of similar-document search. The empirical evaluation shows that the proposed technique is more effective than traditional approaches.

  • Publication date: 2003-10