Abstract

In many applications, data objects are described by both numeric and categorical features. Klawonn and Hoppner recently proposed a fuzzy c-means algorithm that addresses the problem that every data object tends to influence every cluster. However, their method is designed only for numeric data. In this paper, we extend it to mixed data. We first combine the mean and the fuzzy centroid to represent a cluster center, and use a new measure to evaluate the dissimilarity between data objects and cluster centers. This measure takes into account the significance of each attribute to the clustering process. We then present our algorithm for clustering mixed data. The performance of the proposed method is demonstrated by a series of experiments on three real datasets, in comparison with traditional clustering algorithms.
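
To make the idea of a mixed cluster center concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes a standard fuzzifier exponent m, a squared difference for numeric attributes, and, for categorical attributes, the weight that the fuzzy centroid places on categories other than the object's own. The function names, the attr_weights parameter, and the way attribute significance enters the sum are all hypothetical; the abstract does not specify them.

```python
from collections import defaultdict


def mixed_center(objects, memberships, numeric_idx, categorical_idx, m=2.0):
    """Build one cluster center for mixed data (illustrative assumption).

    Numeric attributes get a membership-weighted mean; categorical
    attributes get a fuzzy centroid, i.e. a category -> weight mapping
    whose weights sum to 1.
    """
    weights = [u ** m for u in memberships]  # assumed standard fuzzifier u^m
    total = sum(weights)
    center = {}
    for j in numeric_idx:
        center[j] = sum(w * obj[j] for w, obj in zip(weights, objects)) / total
    for j in categorical_idx:
        dist = defaultdict(float)
        for w, obj in zip(weights, objects):
            dist[obj[j]] += w / total
        center[j] = dict(dist)
    return center


def dissimilarity(obj, center, numeric_idx, categorical_idx, attr_weights):
    """Attribute-weighted dissimilarity between an object and a center.

    attr_weights is a hypothetical stand-in for attribute significance.
    """
    d = 0.0
    for j in numeric_idx:
        d += attr_weights[j] * (obj[j] - center[j]) ** 2
    for j in categorical_idx:
        # Weight the centroid assigns to categories other than obj[j].
        d += attr_weights[j] * (1.0 - center[j].get(obj[j], 0.0))
    return d


# Two objects with one numeric and one categorical attribute each.
objects = [(1.0, "a"), (3.0, "b")]
center = mixed_center(objects, [1.0, 1.0], numeric_idx=[0], categorical_idx=[1])
score = dissimilarity(objects[0], center, [0], [1], {0: 1.0, 1: 1.0})
```

In this toy case the numeric part of the center is the mean 2.0 and the categorical part splits its weight evenly between "a" and "b", so the first object is penalized on both attributes.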

  • Publication date: 2011
