Abstract

Cluster ensemble is becoming an important research hotspot, and many researchers work in this field. However, the problem of distributed cluster ensemble has not yet been stated in the literature. In this paper, the authors state this problem for the first time and introduce the distributed latent Dirichlet allocation (D-LDA) model for distributed cluster ensemble, which is the most important contribution of this paper. First, the latent variables in D-LDA and some terminology are defined for distributed cluster ensemble. Second, Markov chain Monte Carlo (MCMC) approximate inference for D-LDA is described in detail. Third, several datasets from UCI are chosen for experiments. Compared with the cluster-based similarity partitioning algorithm (CSPA), the hyper-graph partitioning algorithm, and the meta-clustering algorithm (MCLA), the results show that D-LDA performs better. Furthermore, the outputs of D-LDA, as a soft clustering model, can not only cluster the data points but also show the structure of the data points.