A document clustering spectral algorithm that uses evidence accumulation

Authors: Xu, Sen*; Lu, Zhi Mao; Zhang, Chun Xiang; Gu, Guo Chang; Zhang, Qi
Source: Journal of Harbin Engineering University, 2010, 31(8): 1043-1047.
DOI: 10.3969/j.issn.1006-7043.2010.08.010

Abstract

A weakness of spectral clustering is the difficulty of choosing a suitable similarity measure. To resolve this, a spectral document clustering algorithm based on evidence accumulation was proposed. The algorithm first ran spherical K-means on the document set multiple times. The partition produced by each run was treated as evidence for whether two documents should be placed in the same cluster. From this accumulated evidence, the similarity matrix and the normalized Laplacian matrix of the documents were constructed. Experiments on the Text REtrieval Conference (TREC) and Reuters document sets demonstrated the effectiveness of the proposed algorithm. It outperformed hierarchical clustering algorithms as well as the K-means algorithm provided in the CLUTO general-purpose clustering toolkit.
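
The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the number of runs, the iteration count, the random initialization, and the final step (clustering the row-normalized eigenvectors of the k smallest eigenvalues, as in common spectral clustering practice) are all assumptions made for illustration. The abstract only states that repeated spherical K-means partitions are accumulated into a similarity matrix from which a normalized Laplacian is built.

```python
import numpy as np

def spherical_kmeans(X, k, rng, n_iter=20):
    """Cluster unit-length rows of X by cosine similarity; returns labels."""
    # Assumed initialization: k distinct rows picked at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # For unit vectors the dot product equals the cosine similarity.
        labels = np.argmax(X @ centers.T, axis=1)
        for j in range(k):
            c = X[labels == j].sum(axis=0)
            norm = np.linalg.norm(c)
            if norm > 0:  # keep the old center if the cluster is empty
                centers[j] = c / norm
    return labels

def co_association(X, k, n_runs=30, seed=0):
    """Evidence accumulation: fraction of runs in which two rows co-cluster."""
    X = np.asarray(X, dtype=float)
    # Assumes no all-zero (empty) document vectors.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    rng = np.random.default_rng(seed)
    S = np.zeros((len(X), len(X)))
    for _ in range(n_runs):
        labels = spherical_kmeans(X, k, rng)
        S += labels[:, None] == labels[None, :]
    return S / n_runs

def normalized_laplacian(S):
    """L = I - D^(-1/2) S D^(-1/2), with D the diagonal degree matrix of S."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))  # degrees are >= 1 (diagonal of S is 1)
    return np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]

def spectral_labels(S, k, seed=0):
    """Assumed final step: cluster the rows of the spectral embedding."""
    _, vecs = np.linalg.eigh(normalized_laplacian(S))  # eigenvalues ascending
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return spherical_kmeans(U, k, np.random.default_rng(seed))
```

Given a document-term matrix `X` (for example TF-IDF rows), `spectral_labels(co_association(X, k), k)` would return one cluster label per document. The similarity between two documents here is simply the fraction of runs that placed them together, so no hand-picked similarity function or kernel width is needed.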

Full text