摘要

We present a study of using unsupervised adaptation approaches to improve speech recognition accuracy of a deployed speech service by leveraging large-scale untranscribed speech data collected from a feedback loop (FBL). For a regular user with lots of adaptation utterances, conventional CMLLR-based adaptation can be used for personalization directly. For a casual user with a few adaptation utterances, we propose to use CMLLR-based adaptation by augmenting his / her adaptation utterances with utterances acoustically close to the user, which are selected from the FBL data by an i-vector based approach. For a new user, we propose to perform a CMLLR-based recognition of an unknown utterance by selecting a set of CMLLR transforms from the most similar cluster, which are pre-trained by using the utterances from the corresponding cluster generated by an i-vector based utterance clustering method from the FBL data. The effectiveness of the above approaches are confirmed by our experiments on a short message dictation task on smart phones.

全文