Discriminative Training Using Non-Uniform Criteria for Keyword Spotting on Spontaneous Speech

Weng Chao<sup>*</sup>; Juang Biing Hwang

doi:10.1109/TASLP.2014.2381931

Abstract

In this work, we formulate keyword spotting as a non-uniform error automatic speech recognition (ASR) problem and propose a model training methodology based on the non-uniform minimum classification error (MCE) approach. The main idea is to adapt the fundamental MCE criterion to reflect cost sensitivity: in an ASR task, errors on keywords are much more significant than errors on non-keywords. This notion of cost sensitivity leads to an emphasis on keyword models in parameter optimization. We then present a system that uses the weighted finite-state transducer (WFST) framework to implement the non-uniform MCE efficiently. To enhance non-uniform error cost minimization for keyword spotting, we further formulate a technique called "adaptive boosted non-uniform MCE", which incorporates the idea of boosting. We validate the proposed framework on two challenging large-scale spontaneous conversational telephone speech (CTS) datasets in two different languages (English and Mandarin). Experimental results show that our framework achieves consistent and significant spotting performance gains over both the maximum likelihood estimation (MLE) baseline and conventional discriminatively trained systems with uniform error cost.

Publication date: 2015-02
