A Compact Representation of Visual Speech Data Using Latent Variables

Authors: Zhou Ziheng*; Hong Xiaopeng; Zhao Guoying; Pietikainen Matti
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(1): 181-187.
DOI: 10.1109/TPAMI.2013.173

Abstract

The problem of visual speech recognition involves the decoding of the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model to provide a compact representation of visual speech data. The model uses latent variables to separately represent the interspeaker variations of visual appearances and those caused by uttering within images, and incorporates the structural information of the visual data through placing priors of the latent variables along a curve embedded within a path graph.
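The abstract does not say how the curve is obtained from the path graph. As a rough illustration only, and not the authors' model, the sketch below assumes one standard construction: the Laplacian eigenvectors of a path graph are sampled cosines, so mapping each node (for example, each video frame) to its first few eigenvector values traces a smooth curve. The function names `path_graph_laplacian` and `path_graph_curve` are hypothetical.

```python
import numpy as np

def path_graph_laplacian(n):
    """Laplacian L = D - A of a path graph with nodes 0-1-...-(n-1)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0   # edge i -> i+1
    A[idx + 1, idx] = 1.0   # edge i+1 -> i (undirected graph)
    return np.diag(A.sum(axis=1)) - A

def path_graph_curve(n, dims):
    """Map node i to (cos(pi*k*(i+0.5)/n)) for k = 1..dims.

    These are the closed-form Laplacian eigenvectors of the path graph
    (unnormalized); this use as a curve is an assumption for illustration.
    """
    i = np.arange(n)[:, None]
    k = np.arange(1, dims + 1)[None, :]
    return np.cos(np.pi * k * (i + 0.5) / n)

n = 20                                   # number of nodes, e.g. frames
L = path_graph_laplacian(n)
curve = path_graph_curve(n, dims=3)      # one 3-D point per node

# Check the eigenvector property: L v_k = (2 - 2 cos(pi k / n)) v_k
eigvals = 2.0 - 2.0 * np.cos(np.pi * np.arange(1, 4) / n)
residual = np.abs(L @ curve - curve * eigvals).max()
print(curve.shape, residual)
```

In a generative model of the kind the abstract describes, points on such a curve could serve as locations for priors on the per-frame latent variables; how the paper actually does this is not recoverable from the abstract alone.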

  • Publication date: 2014-1