An efficient K-means clustering algorithm based on influence factors

Authors: Leng, Mingwei*; Tang, Haitao; Chen, Xiaoyun
Source: 8th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing / 3rd ACIS International Workshop on Self-Assembling Wireless Networks, Qingdao, People's Republic of China, 2007-07-30 to 2007-08-01.
DOI:10.1109/SNPD.2007.279

Abstract

Clustering is one of the most widely studied topics in data mining and pattern recognition, and k-means is one of the most popular, simple, and fast clustering algorithms. However, the correct value of k is usually unknown, and selecting effective initial points is also difficult. Consequently, much work has been done on variants of k-means that refine the initial points and detect the number of clusters. In this paper, we present a new algorithm, an efficient k-means clustering algorithm based on influence factors, which is divided into two stages and can automatically determine the actual value of k and select suitable initial points based on the characteristics of the dataset. We propose the influence factor to measure the similarity of two clusters and use it to decide whether the two clusters should be merged into one. To obtain a faster algorithm, we state and prove a theorem and use it to accelerate the computation. Experimental results on Gaussian datasets generated as in Pelleg and Moore (2000) [11] show that the algorithm achieves high clustering quality and satisfactory results.
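The two-stage idea in the abstract (over-segment first, then merge similar clusters to arrive at k) can be illustrated with a minimal Python sketch. The abstract does not define the influence factor or the accelerating theorem, so everything specific here is an assumption made for illustration: the merge test (centroid gap compared with the clusters' mean member-to-centroid distances), the parameter `alpha`, the evenly spaced seeding, and the function names `lloyd`, `spread`, and `two_stage_kmeans`. It is a stand-in for the general technique, not the authors' method.

```python
import numpy as np

def lloyd(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = np.asarray(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            if np.any(labels == j):  # an empty cluster keeps its old center
                new_centers[j] = X[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dist.argmin(axis=1)

def spread(X, labels, centers, j):
    """Mean distance from the members of cluster j to its centroid."""
    return np.linalg.norm(X[labels == j] - centers[j], axis=1).mean()

def two_stage_kmeans(X, k_init, alpha=2.0):
    X = np.asarray(X, dtype=float)
    # Stage 1: deliberately over-segment with k_init larger than the expected k.
    # Seeds are evenly spaced rows of X (an arbitrary, deterministic choice).
    seeds = X[np.linspace(0, len(X) - 1, k_init).astype(int)]
    centers, labels = lloyd(X, seeds)
    ids = [int(j) for j in np.unique(labels)]  # non-empty clusters only
    r = {j: spread(X, labels, centers, j) for j in ids}

    # Stage 2: merge clusters that pass the placeholder similarity test.
    # NOTE: this test is hypothetical, not the paper's influence factor.
    parent = {j: j for j in ids}
    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a
    for a, i in enumerate(ids):
        for j in ids[a + 1:]:
            gap = np.linalg.norm(centers[i] - centers[j])
            if gap < alpha * (r[i] + r[j]):
                parent[find(j)] = find(i)

    # Relabel points by merged group, then refine with ordinary k-means.
    roots = sorted({find(j) for j in ids})
    group = np.array([roots.index(find(int(l))) for l in labels])
    merged = np.array([X[group == g].mean(axis=0) for g in range(len(roots))])
    return lloyd(X, merged)

# Toy data: two tight groups of four points each, started with k_init = 4.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]])
centers, labels = two_stage_kmeans(X, k_init=4)
print(len(centers), labels.tolist())
```

On this toy input the four initial clusters collapse into two, one per group. The placeholder test is scale-free but fragile for singleton clusters (their spread is zero), which is one reason a purpose-built similarity measure such as the paper's influence factor would be preferable in practice.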

Full text