A hybrid clustering technique combining a novel genetic algorithm with K-Means

Rahman Md Anisur; Islam Md Zahidul<sup>*</sup>

doi:10.1016/j.knosys.2014.08.011

Abstract

Many existing clustering techniques, including K-Means, require the user to specify the number of clusters, yet it is often extremely difficult for a user to estimate that number accurately for a given data set. Genetic algorithms (GAs) generally determine the number of clusters automatically. However, they typically choose the genes and the number of genes randomly. If the right genes can be identified in the initial population, a GA is more likely to produce a high-quality clustering result than when the genes are chosen randomly. We propose a novel GA-based clustering technique that automatically finds the right number of clusters and identifies the right genes through a novel initial population selection approach. With the help of our novel fitness function and gene rearrangement operation, it produces high-quality cluster centers. The centers are then fed into K-Means as initial seeds to produce an even higher-quality clustering solution by allowing the seeds to readjust as needed. Our experimental results on twenty natural data sets indicate a statistically significant superiority (according to sign test analysis) of our technique over five recent techniques based on six evaluation criteria.
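The final stage described in the abstract, in which centers found by the GA are handed to K-Means as initial seeds, can be illustrated with a minimal sketch. It is not the authors' implementation: the GA stage is not shown, the function name `kmeans_from_seeds` is hypothetical, and the seeds in the usage example are hand-picked stand-ins for GA output. It assumes numeric attributes and Euclidean distance, which the abstract does not specify.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_from_seeds(points, seeds, max_iter=100):
    """Standard Lloyd-style K-Means started from caller-supplied seeds.

    Hypothetical helper for illustration; the seeds stand in for the
    cluster centers that a GA stage would produce.
    """
    centers = [tuple(s) for s in seeds]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # empty cluster: leave center in place
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers, labels


# Usage: two seeds (assumed to come from an earlier search stage)
# readjust toward the means of the two groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = [(1, 1), (9, 9)]
centers, labels = kmeans_from_seeds(points, seeds)
```

Because the number of seeds fixes the number of clusters, K-Means here only refines the centers' positions; choosing how many centers there are is left to the earlier stage.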

Publication date: 2014-11
