AN INTELLIGENT INITIALIZATION METHOD FOR THE K-MEANS CLUSTERING ALGORITHM

Authors: Sheu Jyh Jian*; Chen Wei Ming; Tsai Wen Bin; Chu Ko Tsung
Source: International Journal of Innovative Computing Information and Control, 2010, 6(6): 2551-2566.

Abstract

The K-Means algorithm has several advantages, such as conceptual simplicity and stable efficiency on very large data sets, but it also has several shortcomings. Selecting the initial clusters, deciding the number of clusters, and eliminating the interference of outliers are three important problems in improving K-Means. However, most methods proposed in the literature address only one of these three problems. In this paper, we propose a two-phase clustering method that modifies the initialization of the K-Means algorithm and accomplishes the following tasks simultaneously: (1) determining a proper number of clusters automatically, (2) choosing better initial clusters, and (3) reducing the influence of outliers on the clustering result.