An Unsupervised Adaptation Approach to Leveraging Feedback Loop Data by Using i-Vector for Data Clustering and Selection

Xu Jian<sup>*</sup>; Yan Zhi Jie; Huo Qiang

doi:10.1109/TASLP.2014.2341911

Abstract

We present a study of unsupervised adaptation approaches that improve the speech recognition accuracy of a deployed speech service by leveraging large-scale untranscribed speech data collected from a feedback loop (FBL). For a regular user with many adaptation utterances, conventional CMLLR-based adaptation can be used for personalization directly. For a casual user with only a few adaptation utterances, we propose to use CMLLR-based adaptation after augmenting his or her adaptation utterances with utterances that are acoustically close to the user, which are selected from the FBL data by an i-vector based approach. For a new user, we propose to recognize an unknown utterance using a set of CMLLR transforms selected from the most similar cluster; these transforms are pre-trained on the utterances of the corresponding cluster, and the clusters are generated from the FBL data by an i-vector based utterance clustering method. The effectiveness of the above approaches is confirmed by our experiments on a short message dictation task on smartphones.
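The two i-vector based selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the function names (`select_similar_utterances`, `nearest_cluster`), the use of a mean i-vector to represent the user, and the toy two-dimensional vectors are all hypothetical. The CMLLR estimation itself is out of scope.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (assumed measure, not from the paper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean; used here as a hypothetical summary of a user's i-vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def select_similar_utterances(
    user_ivectors: Sequence[Vector],
    fbl_ivectors: Dict[str, Vector],
    top_k: int,
) -> List[str]:
    """Casual-user case: pick the top_k FBL utterances closest to the user."""
    target = mean_vector(user_ivectors)
    ranked = sorted(
        fbl_ivectors,
        key=lambda utt_id: cosine(target, fbl_ivectors[utt_id]),
        reverse=True,
    )
    return ranked[:top_k]


def nearest_cluster(utt_ivector: Vector, centroids: Dict[str, Vector]) -> str:
    """New-user case: pick the cluster whose centroid is most similar to the utterance."""
    return max(centroids, key=lambda cid: cosine(utt_ivector, centroids[cid]))


# Toy data (hypothetical); real i-vectors typically have hundreds of dimensions.
fbl = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [0.5, 0.5]}
user = [[1.0, 0.1], [0.8, 0.0]]
selected = select_similar_utterances(user, fbl, top_k=2)
cluster = nearest_cluster([0.1, 0.9], {"c0": [1.0, 0.0], "c1": [0.0, 1.0]})
```

In this sketch, the selected utterances would be pooled with the casual user's own data before estimating CMLLR transforms, while the chosen cluster identifier would index into a table of transforms pre-trained per cluster.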

Publication date: 2014-11
Affiliations: University of Science and Technology of China; Microsoft
