Abstract

This study proposes an effective feature compensation scheme for speech recognition in severely adverse environments where background noise and channel distortion occur simultaneously. In the proposed scheme, an iterative channel estimation method is integrated into the framework of our previously proposed Parallel Combined Gaussian Mixture Model (PCGMM) based feature compensation algorithm. A new speech corpus is developed that reflects both additive and convolutional noise corruption. The channel distortion effects are obtained from the NTIMIT and CTIMIT corpora. Evaluation based on objective measures, including STNR, PESQ, and speech recognition performance, shows that the generated speech corpus includes highly challenging acoustic conditions for speech recognition. The proposed feature compensation method is evaluated on the developed speech corpus. The experimental results demonstrate that, by employing the iterative channel estimation method, the proposed scheme is effective at improving speech recognition performance in the presence of both background noise and channel distortion. The proposed PCGMM-based feature compensation scheme with channel estimation shows +3.58% and +11.61% relative improvements in averaged WER compared to the ETSI AFE algorithm on the developed speech corpora that include NTIMIT and CTIMIT channel effects, respectively. For real-life application, a voice activity detection technique is employed to estimate the noise model for the PCGMM-based method without a priori knowledge of the non-speech locations in the input speech. The proposed method is also evaluated on the CU-Move corpus, which represents actual in-vehicle conditions, showing a +12.99% relative improvement compared to the ETSI AFE. This study confirms that the proposed PCGMM-based feature compensation method integrated with channel estimation is effective at increasing speech recognition accuracy in real-life, severely adverse conditions.

  • Publication date: 2015-10