A new text clustering algorithm based on improved k_means

Xinwu Li*

doi:10.4304/jsw.7.1.95-101

Abstract

Text clustering is a difficult and actively studied problem in internet search engine research. This paper presents a new text clustering algorithm based on K-means and the Self-Organizing Model (SOM). First, texts are preprocessed to meet the requirements of the subsequent processing steps. Second, the paper improves how K-means selects its initial cluster centers and cluster seeds, addressing the algorithm's sensitivity to the initial cluster centers and to isolated-point texts. Third, the advantages of K-means and SOM are combined into a new model for clustering text. Finally, experimental results indicate that the improved algorithm achieves higher accuracy and better stability than the original algorithm.
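The abstract does not spell out the seeding rule, so the following is only a minimal sketch of the general idea of making K-means less sensitive to initial centers and isolated-point texts. It assumes a neighbor-count filter to exclude isolated texts and a max-min distance rule to spread the seeds apart; both are illustrative choices, not the paper's method. The names `vectorize`, `pick_seeds`, `radius`, and `min_neighbors` are hypothetical, and the SOM stage is not covered.

```python
import math
from collections import Counter

def vectorize(texts):
    """Preprocess: turn each text into a length-normalised term-frequency vector."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pick_seeds(vectors, k, radius=1.0, min_neighbors=1):
    """Assumed seeding rule: drop isolated points, then spread seeds apart."""
    # A point is a seed candidate only if enough other points lie within `radius`.
    candidates = [
        i for i, v in enumerate(vectors)
        if sum(1 for j, u in enumerate(vectors)
               if j != i and dist(v, u) <= radius) >= min_neighbors
    ]
    if len(candidates) < k:
        candidates = list(range(len(vectors)))  # fall back to all points
    seeds = [candidates[0]]
    while len(seeds) < k:
        # Max-min rule: take the candidate farthest from its nearest chosen seed.
        nxt = max(
            (i for i in candidates if i not in seeds),
            key=lambda i: min(dist(vectors[i], vectors[s]) for s in seeds),
        )
        seeds.append(nxt)
    return [list(vectors[i]) for i in seeds]

def kmeans(vectors, centers, max_iter=50):
    """Standard K-means iterations starting from the given centers."""
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(v, centers[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy corpus: two topics plus one isolated text ("zebra").
texts = [
    "apple banana fruit", "banana fruit juice", "apple fruit juice",
    "engine wheel car", "car engine road", "wheel road car",
    "zebra",
]
vectors = vectorize(texts)
centers = pick_seeds(vectors, k=2)
labels = kmeans(vectors, [list(c) for c in centers])
print(labels)
```

In this toy corpus the isolated text shares no words with any other text, so the neighbor filter keeps it from being chosen as an initial center. It is still assigned to a cluster during the K-means iterations.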

Publication date: 2012
