Abstract

Keyword Spotting (KWS) systems fall into two main groups: Hidden Markov Model (HMM)-based systems and Discriminative KWS (DKWS) systems. In this paper, we propose an approach that improves a DKWS system by exploiting the advantages of HMM-based systems. The proposed DKWS system consists of a feature extraction part and a classification part, the latter comprising a classifier and a search algorithm. This paper focuses on the feature extraction part and the search algorithm. First, we propose a method that uses the advantages of a triphone-based HMM system to upgrade the monophone-based feature extraction proposed in our previous work to a triphone-based one. Second, we propose an N-best search algorithm to replace the one-best algorithm. Results on the TIMIT database indicate that, at false alarm rates above two per keyword per hour, the true detection rate of the triphone-based Evolutionary DKWS (EDKWS) system with N-best search (Tph-EDKWS-N-Best) is 4.6% higher than that of the monophone-based EDKWS system with one-best search (Mph-EDKWS-1-Best). This improvement comes at the cost of a degradation of about 0.4 in Real Time Factor (a common metric for the speed of an automatic speech recognition system). Additionally, the Figure of Merit (the average true detection rate over false alarm rates from 1 to 10 per keyword per hour) of the Tph-EDKWS-N-Best system is noticeably higher than that of HMM-based KWS systems. However, the computational complexity of the Tph-EDKWS-N-Best system is considerably higher than that of the HMM-based KWS systems.
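The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the Figure of Merit is the plain arithmetic mean of the true detection rate at the ten operating points 1, 2, ..., 10 false alarms per keyword per hour, as the abstract's definition suggests, and that Real Time Factor is the usual ratio of processing time to audio duration. The function names and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def figure_of_merit(detection_rates: Sequence[float]) -> float:
    """Mean true detection rate over the operating points 1..10 FA/kw/h.

    detection_rates[i] is assumed to be the true detection rate measured
    at (i + 1) false alarms per keyword per hour.
    """
    if len(detection_rates) != 10:
        raise ValueError("expected one detection rate per operating point (1..10)")
    return sum(detection_rates) / len(detection_rates)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration (lower means faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical detection rates, for illustration only (not results from the paper).
    rates = [0.50, 0.55, 0.60, 0.63, 0.66, 0.68, 0.70, 0.71, 0.72, 0.73]
    print(f"FOM: {figure_of_merit(rates):.3f}")
    # Hypothetical timing: 30 s of computation for 60 s of audio.
    print(f"RTF: {real_time_factor(30.0, 60.0):.2f}")
```

Under this reading, a system with a higher true detection rate at every operating point also has a higher Figure of Merit, while a higher Real Time Factor means slower processing.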

  • Publication date: 2018-1