Abstract

This study presents a new fuzzy centroids clustering algorithm for categorical data. The objective function of the fuzzy k-modes algorithm is modified by adding between-cluster information, so that the within-cluster dispersion is minimized and the between-cluster separation is enhanced simultaneously. Because hard centroids can lead to misclassification, the between-cluster information is combined with fuzzy centroids rather than hard ones. Furthermore, the dissimilarity between an object and a centroid at the feature level is defined as 1 minus the frequency, within the centroid, of the object's value for that feature. Experiments on several real data sets from the UCI repository show that the proposed algorithm is effective and outperforms its counterpart with hard centroids.
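
The feature-level dissimilarity described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `fuzzy_centroid` and `dissimilarity`, the fuzziness exponent `m`, and the choice to weight frequencies by memberships raised to the power `m` are assumptions made for illustration only, and the between-cluster term of the objective function is not modeled.

```python
from collections import defaultdict


def fuzzy_centroid(objects, memberships, m=1.5):
    """Build a fuzzy centroid for one cluster (illustrative assumption).

    For each feature, the centroid stores the membership-weighted relative
    frequency of every category value, so the weights of one feature sum to 1.
    Assumes at least one membership is greater than zero.
    """
    n_features = len(objects[0])
    total = sum(u ** m for u in memberships)
    centroid = []
    for j in range(n_features):
        freq = defaultdict(float)
        for obj, u in zip(objects, memberships):
            freq[obj[j]] += (u ** m) / total
        centroid.append(dict(freq))
    return centroid


def dissimilarity(obj, centroid):
    """Sum over features of 1 minus the frequency of the object's value.

    A value that never occurs in the centroid has frequency 0 and therefore
    contributes the maximum per-feature dissimilarity of 1.
    """
    return sum(1.0 - centroid[j].get(value, 0.0) for j, value in enumerate(obj))


if __name__ == "__main__":
    # Hypothetical toy data: three objects with two categorical features.
    data = [("red", "small"), ("red", "large"), ("blue", "small")]
    centroid = fuzzy_centroid(data, memberships=[1.0, 1.0, 1.0])
    print(dissimilarity(("red", "small"), centroid))
```

In this sketch, an object whose values are common in the cluster receives a small dissimilarity, while an object whose values are absent from the cluster receives the maximum of 1 per feature.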

Full text