Abstract

A clustering algorithm usually optimizes a qualification metric during its learning process. In traditional clustering algorithms this metric treats all features of the dataset equally, so every feature contributes equally to the clustering process. Because some features in a dataset carry more information than others (for example, owing to differences in their variance), we propose a fuzzy weighted clustering algorithm, which we name the Fuzzy Weighted Locally Adaptive Clustering (FWLAC) algorithm. The proposed FWLAC algorithm is capable of handling imbalanced clustering. However, it is sensitive to two parameters that must be tuned manually, and its performance depends on how well they are tuned. The paper therefore proposes two solutions to the tuning of these two parameters. In the first solution, we propose a simple clustering ensemble framework that exposes the sensitivity of the FWLAC algorithm to manual tuning. Rather than a trial-and-error procedure, it resembles a grid search over pairs of values for the parameters h(1) and h(2). For each pair of values, the algorithm produces a partitioning, so the grid search yields a large number of partitionings. We break each partitioning into its clusters, which together form an ensemble of clusters. Finally, a consensus function extracts the consensus partitioning from this ensemble. The procedure is not data dependent at all: for all datasets we use the same grid search and the same set of values for h(1) and h(2). In the second solution, as an alternative approach to parameter selection, we add a selection phase that selects or removes clusters from the ensemble to obtain an elite ensemble of clusters. To do this, a stability measure based on normalized mutual information (NMI) is used to validate each cluster. The paper shows the effectiveness of the proposed clustering frameworks both theoretically and experimentally.
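
The ensemble pipeline described above (grid of parameter pairs, pooling of clusters, NMI-based selection, consensus) can be illustrated with a minimal Python sketch. It is not the paper's implementation: FWLAC itself is not reproduced, so `base_clusterer` is a hypothetical stand-in that takes the data and one pair of parameter values. The stability score (mean NMI between a cluster's membership indicator and every partitioning), the threshold values 0.6 and 0.5, and the co-association consensus with linking of strongly co-occurring points are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import product


def build_ensemble(data, grid, base_clusterer):
    """Run a (hypothetical) base clusterer once per pair of parameter values."""
    return [base_clusterer(data, h1, h2) for h1, h2 in product(grid, repeat=2)]


def nmi(a, b):
    """NMI of two labelings, normalized by the geometric mean of the entropies."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    ha = -sum(c / n * math.log(c / n) for c in ca.values())
    hb = -sum(c / n * math.log(c / n) for c in cb.values())
    if ha == 0.0 or hb == 0.0:
        return 0.0  # convention used here when a labeling has a single cluster
    mi = sum(c / n * math.log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    return mi / math.sqrt(ha * hb)


def clusters_of(labels):
    """Break one partitioning into its clusters (sets of point indices)."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(i)
    return [frozenset(g) for g in groups.values()]


def stability(cluster, partitionings):
    """Assumed stability: mean NMI of the cluster's indicator vs. each partitioning."""
    n = len(partitionings[0])
    indicator = [i in cluster for i in range(n)]
    return sum(nmi(indicator, p) for p in partitionings) / len(partitionings)


def consensus(partitionings, keep_threshold=0.6, link_threshold=0.5):
    """Select stable clusters, then link points that co-occur often enough."""
    n = len(partitionings[0])
    pool = [c for p in partitionings for c in clusters_of(p)]
    elite = [c for c in pool if stability(c, partitionings) >= keep_threshold]

    parent = list(range(n))  # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    count = [sum(i in c for c in elite) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            both = sum(i in c and j in c for c in elite)
            denom = max(count[i], count[j])
            if denom and both / denom >= link_threshold:
                parent[find(i)] = find(j)

    relabel = {}  # map union-find roots to 0, 1, 2, ... in order of appearance
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Toy ensemble standing in for the partitionings produced by the grid search.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
print(consensus(ensemble))  # [0, 0, 0, 1, 1, 1]
```

In this toy ensemble the three clusters of the last partitioning score below the assumed 0.6 stability threshold and are dropped, so the consensus is determined by the two clusters on which the other partitionings agree.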

  • Publication date: 2015-2