摘要

Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering method to microarray gene expression data is the choice of parameters with cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies desired cluster number and obtains more steady results for gene expression data. First of all, to optimize characteristic differences and estimate optimal cluster number, Gaussian kernel function is introduced to improve spectrum analysis method (SAM). By combining subtractive clustering with max-min distance mean, maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of improved SAM (ISAM) and MDM are given respectively, whose superiority and stability are illustrated through performing experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and UCI database show that the proposed algorithms are feasible for cluster analysis, and the clustering accuracy is higher than the other related clustering algorithms.