Abstract

Nonparametric Bayesian models use a Bayesian framework to learn model complexity automatically from the data, eliminating the need for a complex model selection process. A Hierarchical Dirichlet Process Hidden Markov Model (HDPHMM) is the nonparametric Bayesian equivalent of a hidden Markov model (HMM), but is restricted to an ergodic topology and uses a Dirichlet process model to achieve a mixture distribution-like model. For applications involving ordered sequences (e.g., speech recognition), it is desirable to impose a left-to-right structure on the model. In this paper, we introduce a model based on HDPHMM that: 1) shares data points between states, 2) models non-ergodic structures, and 3) models non-emitting states. The first point is particularly important because Gaussian mixture models, which support such sharing, have been very effective at modeling modalities in a signal (e.g., speaker variability). Further, sharing data points allows models to be estimated more accurately, an important consideration for applications such as speech recognition in which some mixture components occur infrequently. We demonstrate that this new model produces a 20% relative reduction in error rate on a phoneme classification task and an 18% relative reduction on a speech recognition task on the TIMIT Corpus, compared to a baseline system consisting of a parametric HMM.
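To make two of the abstract's ideas concrete, the minimal Python sketch below illustrates (a) a truncated stick-breaking construction of Dirichlet process weights, which is how a nonparametric Bayesian prior lets the data determine how many mixture components matter, and (b) a left-to-right (non-ergodic) transition matrix of the kind used for ordered sequences such as speech. This is an illustrative sketch only, not the authors' implementation; the helper names `stick_breaking` and `left_to_right_transitions` and the parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, num_sticks):
    """Truncated stick-breaking construction of Dirichlet process weights.

    Draws Beta(1, alpha) stick fractions and converts them into mixture
    weights. The concentration parameter alpha controls how many components
    receive appreciable mass, so model complexity is effectively learned
    from the data rather than fixed in advance.
    """
    betas = rng.beta(1.0, alpha, size=num_sticks)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

def left_to_right_transitions(num_states, self_loop=0.9):
    """HMM transition matrix with a left-to-right (non-ergodic) topology.

    Each state may only remain in place or advance to the next state,
    imposing the ordering constraint appropriate for sequential data
    such as speech; the last state is absorbing.
    """
    A = np.zeros((num_states, num_states))
    for i in range(num_states - 1):
        A[i, i] = self_loop          # self-transition
        A[i, i + 1] = 1.0 - self_loop  # advance to the next state
    A[-1, -1] = 1.0  # final (absorbing) state
    return A

# Example: DP weights concentrate on a few components; the transition
# matrix is upper-bidiagonal, i.e., strictly left-to-right.
print("DP mixture weights (truncated):", np.round(stick_breaking(1.0, 20), 3))
print("Left-to-right transitions:\n", left_to_right_transitions(4))
```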

  • Publication date: 2016-01