Abstract

Bagging and boosting are two well-known methods for building classifier ensembles. It is generally agreed that clusterer ensemble methods based on the boosting concept can produce clusterings of higher quality and robustness. In this paper, we introduce a new boosting-based hierarchical clusterer ensemble method called Bob-HiC. The method creates a consensus hierarchical clustering (h-clustering) on a dataset, which helps improve clustering accuracy. Bob-HiC runs several boosting iterations. In each iteration, a weighted random sampling is first performed on the original dataset, and an individual h-clustering is then created on the selected samples. At the end of the iterations, the individual clusterings are combined into a final consensus h-clustering. The intermediate structures used in the combination are distance descriptor matrices, each corresponding to one individual h-clustering result. This final integration is done through an information-theoretic approach. Experiments on popular synthetic and real datasets confirm that the proposed method improves on the results of simple clustering algorithms. In addition, our experimental results confirm that the method provides better consensus clustering quality than other available ensemble techniques.
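
The iteration loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the function names `cophenetic_single_linkage` and `boosted_consensus` are hypothetical, single linkage stands in for the unspecified base h-clustering algorithm, the cophenetic distance matrix stands in for the distance descriptor matrix, the weight update is a placeholder (the abstract does not state the rule), and plain averaging replaces the information-theoretic integration.

```python
import random


def cophenetic_single_linkage(points):
    """Cophenetic distance matrix of a naive single-linkage dendrogram."""
    n = len(points)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Find the two clusters whose closest members are nearest.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Points joined at this merge get the merge height as their distance.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = d
        clusters[a] += clusters[b]
        del clusters[b]
    return coph


def boosted_consensus(data, n_iter=10, seed=0):
    """Average the per-iteration matrices over weighted random samples."""
    rng = random.Random(seed)
    n = len(data)
    weights = [1.0 / n] * n
    sums = [[0.0] * n for _ in range(n)]
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_iter):
        # Weighted random sampling with replacement; duplicates are dropped.
        idx = sorted(set(rng.choices(range(n), weights=weights, k=n)))
        coph = cophenetic_single_linkage([data[i] for i in idx])
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sums[i][j] += coph[a][b]
                counts[i][j] += 1
        # Placeholder update (assumption, not the paper's rule):
        # double the weight of points left out of this sample.
        chosen = set(idx)
        weights = [w if i in chosen else 2.0 * w
                   for i, w in enumerate(weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    # Consensus by simple averaging (assumption; the paper uses an
    # information-theoretic combination). Unseen pairs stay None.
    return [[sums[i][j] / counts[i][j] if counts[i][j] else None
             for j in range(n)] for i in range(n)]
```

Calling `boosted_consensus(data)` on a list of numeric tuples returns an n-by-n matrix of averaged dendrogram distances. Pairs that were never sampled together are left as `None`, and a final h-clustering could then be built from this matrix.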

  • Publication date: 2013-6