Approximating a similarity matrix by a latent class model: A reappraisal of additive fuzzy clustering

Authors: ter Braak Cajo J F*; Kourmpetis Yiannis; Kiers Henk A L; Bink Marco C A M
Source: Computational Statistics & Data Analysis, 2009, 53(8): 3183-3193.
DOI: 10.1016/j.csda.2008.10.004

Abstract

Let Q be a given n x n square symmetric matrix of nonnegative elements between 0 and 1, e.g. similarities. Fuzzy clustering results in fuzzy assignment of individuals to K clusters. In additive fuzzy clustering, the n x K fuzzy memberships matrix P is found by least-squares approximation of the off-diagonal elements of Q by inner products of rows of P. By contrast, kernelized fuzzy c-means is not least-squares and requires an additional fuzziness parameter. The aim is to popularize additive fuzzy clustering by interpreting it as a latent class model, whereby the elements of Q are modeled as the probability that two individuals share the same class on the basis of the assignment probability matrix P. Two new algorithms are provided, a brute force genetic algorithm (differential evolution) and an iterative row-wise quadratic programming algorithm of which the latter is the more effective. Simulations showed that (1) the method usually has a unique solution, except in special cases, (2) both algorithms reached this solution from random restarts and (3) the number of clusters can be well estimated by AIC. Additive fuzzy clustering is computationally efficient and combines attractive features of both the vector model and the cluster model.
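
To make the least-squares criterion concrete, the following is a minimal sketch in Python (NumPy), not the authors' implementation. It assumes that each row of P holds class-assignment probabilities summing to 1, so that the inner product of rows i and j can be read as the probability that individuals i and j share a class. The function names (shared_class_prob, offdiag_loss) and the toy data are hypothetical, introduced only for illustration. The sketch evaluates the criterion for a given P; it does not perform the differential evolution or row-wise quadratic programming fits mentioned in the abstract.

```python
import numpy as np


def shared_class_prob(P):
    """Model matrix: entry (i, j) is the inner product of rows i and j of P."""
    P = np.asarray(P, dtype=float)
    return P @ P.T


def offdiag_loss(Q, P):
    """Sum of squared residuals over the off-diagonal pairs i < j only."""
    Q = np.asarray(Q, dtype=float)
    R = Q - shared_class_prob(P)
    iu = np.triu_indices(Q.shape[0], k=1)  # strictly upper triangle
    return float(np.sum(R[iu] ** 2))


# Hypothetical toy data: n = 4 individuals, K = 2 classes, rows sum to 1.
P_true = np.array([[1.0, 0.0],
                   [0.8, 0.2],
                   [0.1, 0.9],
                   [0.0, 1.0]])

# Build Q exactly from the model, so P_true fits the off-diagonal perfectly.
Q = shared_class_prob(P_true)
loss_true = offdiag_loss(Q, P_true)

# A maximally fuzzy P (every individual split evenly) fits worse.
P_uniform = np.full((4, 2), 0.5)
loss_uniform = offdiag_loss(Q, P_uniform)

print(loss_true, loss_uniform)
```

Only the pairs i < j enter the loss, which mirrors the abstract's restriction to off-diagonal elements of Q: the diagonal of P Pᵀ is generally below 1 for fuzzy rows and so is not forced to match a similarity of 1.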

  • Publication date: 2009-6-15