A comparison of clustering quality indices using outliers and noise

Authors: Guerra L*; Robles V; Bielza C; Larranaga P
Source: Intelligent Data Analysis, 2012, 16(4): 703-715.
DOI: 10.3233/IDA-2012-0545

Abstract

Quality indices in clustering are used not only to assess the quality of the partitions but also to determine the number of clusters in the final result. When these indices are evaluated in a case study, real data conditions or different clustering algorithms are seldom taken into account. Here, some of the standard indices used in the literature are compared using more realistic databases that include outliers or noisy dimensions, which is more like a real problem-solving approach. Besides, three different clustering methods are used in an attempt to identify different behaviours. Also, the performance of the quality index-clustering algorithm tandem is compared to random grouping, with the aim of running an additional check. The indices are ranked, and index-based conclusions are drawn for all the scenarios.

  • Publication date: 2012