摘要

When dealing with high dimensional data, clustering faces the curse of dimensionality problem. In such data sets, clusters of objects exist in subspaces rather than in whole feature space. Subspace clustering algorithms have already been introduced to tackle this problem. However, noisy data points present in this type of data can have great impact on the clustering results. Therefore, to overcome these problems simultaneously, the fuzzy soft subspace clustering with noise detection (FSSC-ND) is proposed. The presented algorithm is based on the entropy weighting soft subspace clustering and noise clustering. The FSSC-ND algorithm uses a new objective function and update rules to achieve the mentioned goals and present more interpretable clustering results. Several experiments have been conducted on artificial and UCI benchmark datasets to assess the performance of the proposed algorithm. In addition, a number of cancer gene expression datasets are used to evaluate the performance of the proposed algorithm when dealing with high dimensional data. The results of these experiments demonstrate the superiority of the FSSC-ND algorithm in comparison with the state of the art clustering algorithms developed in earlier research.

  • 出版日期2016-11