Abstract

In real-world applications, data objects are usually described by a mix of numeric and categorical attributes. The k-prototypes algorithm is one of the most important algorithms for clustering this type of data. However, it performs a hard partition, which may lead to misclassification of data objects that lie in the boundary regions between clusters. In this paper, we first present a new representation for the center of a cluster and a new measure of the dissimilarity between data objects and cluster centers. We then present our algorithm for clustering mixed data. Finally, the performance of the proposed method is demonstrated by a series of experiments on two real datasets, in comparison with traditional clustering algorithms.

  • Publication date: 2012