Unsupervised learning of Dirichlet process mixture models with missing data

Authors: Zhang, Xunan; Song, Shiji*; Zhu, Lei; You, Keyou; Wu, Cheng
Source: Science China Information Sciences, 2016, 59(1): 012201.
DOI: 10.1007/s11432-015-5429-0

Abstract

This study presents a novel approach to unsupervised learning for clustering with missing data. We first extend a finite mixture model to the infinite case by considering Dirichlet process mixtures, which can automatically determine the number of mixture components, or clusters. We then treat the missing features as latent variables and compute their posterior distributions using the variational Bayesian expectation maximization algorithm, which optimizes the evidence lower bound on the complete-data log marginal likelihood. We evaluate the method on several artificial data sets with missing values. The experimental results indicate that the proposed method outperforms some classic imputation methods. Finally, we present an application to the analysis of seabed hydrothermal sulfide color images.
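
As a rough illustration of why a Dirichlet process mixture does not need a preset number of clusters, the sketch below draws mixture weights from a truncated stick-breaking construction, one common representation of the Dirichlet process. It is not the inference procedure of the paper, which uses variational Bayesian expectation maximization. The function name `stick_breaking_weights`, the concentration `alpha`, the truncation level of 20, and the 0.01 threshold are all assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking construction.

    alpha and truncation are illustrative choices, not values from the paper.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # the last component absorbs the leftover mass
    return weights


rng = random.Random(0)  # fixed seed so the draw is repeatable
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)

# Count components that carry non-negligible mass (threshold is arbitrary).
effective = sum(1 for w in weights if w > 0.01)
print(f"total mass: {sum(weights):.6f}")
print(f"components with weight > 0.01: {effective}")
```

Most of the mass typically falls on a few components while the rest receive weights close to zero, so the number of clusters in use is governed by the data and the concentration parameter rather than fixed in advance. Larger values of `alpha` tend to spread the mass over more components.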