Abstract

This empirical study extends the results of Hüllermeier, Rifqi, Henzgen, and Senge (2012). It examines how well a generalization of the Rand index and four related similarity measures recover the cluster structure of data in the framework of fuzzy c-means clustering. The index range is also used as a criterion statistic. A Monte Carlo simulation is conducted both for the null case and for the case where the data have a well-defined cluster structure. The fuzzy extensions of the related measures are less effective for imbalanced data. By contrast, whether the index is Dice, Fowlkes and Mallows, Hubert and Arabie, or Jaccard, it provides reliable results for noise data or for data containing fairly balanced clusters. The criticisms of the Rand index in the context of crisp clustering also extend to its fuzzy version.
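To make the idea of a fuzzy Rand index concrete, the following is a minimal sketch, not the study's own implementation. It assumes that each object is described by a membership vector summing to 1, and that the degree to which two objects are "paired" is one minus half the L1 distance between their membership vectors. The function names `fuzzy_equivalence` and `fuzzy_rand` are hypothetical, and the definitions used in the paper may differ in detail.

```python
from itertools import combinations


def fuzzy_equivalence(u, v):
    """Degree in [0, 1] to which two objects belong together.

    Assumption: u and v are membership vectors that each sum to 1, so half
    the L1 distance between them lies in [0, 1].
    """
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(u, v))


def fuzzy_rand(P, Q):
    """Rand-type agreement between two fuzzy partitions of the same objects.

    P and Q are lists of membership vectors, one per object. The two
    partitions may use different numbers of clusters.
    """
    pairs = list(combinations(range(len(P)), 2))
    # Discordance of a pair: how much its pairing degree differs between P and Q.
    discordance = sum(
        abs(fuzzy_equivalence(P[i], P[j]) - fuzzy_equivalence(Q[i], Q[j]))
        for i, j in pairs
    )
    return 1.0 - discordance / len(pairs)


if __name__ == "__main__":
    # Crisp partitions written as 0/1 membership vectors.
    crisp_a = [[1, 0], [1, 0], [0, 1], [0, 1]]
    crisp_b = [[1, 0], [0, 1], [0, 1], [0, 1]]
    print(fuzzy_rand(crisp_a, crisp_b))

    # A fuzzy partition compared with a crisp one.
    fuzzy_c = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.1, 0.9]]
    print(fuzzy_rand(crisp_a, fuzzy_c))
```

For crisp partitions the pairing degree is either 0 or 1, so under these assumptions the sketch reduces to the classical Rand index (the proportion of object pairs on which the two partitions agree).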

  • Publication date: 2017-2