Abstract

Traditional c-means clustering partitions a set of objects into a number of non-overlapping clusters. For a given dataset, rough sets provide a more flexible and objective representation than classical sets, which impose a hard partition, or fuzzy sets, which rely on a subjective membership function. Rough c-means clustering and its extensions have been introduced and successfully applied in many real-life applications in recent years. In these methods, each cluster is represented by a pair of lower and upper approximations. However, most available algorithms pay no attention to the influence of the imbalanced spatial distribution within a cluster. This paper analyzes the limitation of the iterative mean calculation, which assigns the same weight to all data objects in a lower or upper approximation. A hybrid imbalanced measure of distance and density for rough c-means clustering is defined, and a modified rough c-means clustering algorithm is presented. To evaluate the proposed algorithm, it is applied to several real-world data sets from the UCI repository. Comparative experiments demonstrate the validity of the algorithm.
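
To illustrate the equal-weight limitation mentioned above, the following is a minimal sketch of the conventional rough c-means centroid update, not the modified algorithm proposed in this paper. It assumes the commonly used form in which the centroid is a weighted combination of the mean of the lower approximation and the mean of the boundary region (upper minus lower approximation). The function names, the parameter `w_lower`, and its default value 0.7 are hypothetical choices made for illustration only.

```python
def mean(points):
    """Component-wise arithmetic mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def rough_centroid(lower, boundary, w_lower=0.7):
    """Conventional rough c-means centroid update (illustrative assumption).

    lower    -- objects that certainly belong to the cluster
    boundary -- objects in the upper approximation but not in the lower one
    w_lower  -- assumed weight of the lower approximation;
                the boundary region receives 1 - w_lower
    """
    if lower and boundary:
        m_low, m_bnd = mean(lower), mean(boundary)
        return tuple(w_lower * a + (1.0 - w_lower) * b
                     for a, b in zip(m_low, m_bnd))
    if lower:
        return mean(lower)
    if boundary:
        return mean(boundary)
    raise ValueError("cluster has no objects")


# Three objects crowded together and one far away, all in the lower approximation.
# Every object contributes equally, so the single distant object pulls the
# centroid away from the dense region.
lower = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0)]
centroid = rough_centroid(lower, boundary=[])
```

In this toy example the centroid lies at (2.5, 2.5), outside the dense group of three objects, which is the kind of effect an imbalance-aware weighting of distance and density is meant to counteract.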