Abstract

Data imputation approaches for robust automatic speech recognition reconstruct noise-corrupted spectral information by exploiting prior knowledge of the relationship between target speech and background through the use of spectrographic masks. Most of these approaches are model-based techniques that provide accurate estimates of the underlying clean speech only when the characteristics of the noise-corrupted features do not deviate from those of the model. Discrete wavelet transform (DWT) based de-noising methods can also be used to re-estimate the underlying clean speech from a noise-corrupted signal, but they often require that the background noise be stationary and modeled by a Gaussian distribution. A novel approach is presented here for incorporating the information derived from spectrographic masks into a DWT-based de-noising method. The spectrographic masks are used to derive thresholds for de-noising wavelet-domain coefficients, making DWT-based de-noising more suitable for non-stationary noise conditions. The results of an experimental study are presented to demonstrate the performance of DWT-based data imputation relative to other established techniques on the Aurora 2 noisy speech recognition task. It will be shown that the proposed approach reduces the impact of the model mismatch associated with parametric approaches and exploits the robustness of the non-parametric wavelet de-noising approach.
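
The general idea of letting a mask set wavelet thresholds can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the abstract does not state how thresholds are derived from the masks, so the single-level Haar transform, the soft-thresholding rule, the linear mapping from reliability to threshold, and every name below (`mask_informed_denoise`, `reliability`, `max_threshold`) are assumptions made purely for illustration.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # orthonormal Haar normalisation


def haar_dwt(signal):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * SCALE)
        out.append((a - d) * SCALE)
    return out


def soft_threshold(value, threshold):
    """Shrink a coefficient towards zero by `threshold` (standard soft thresholding)."""
    magnitude = abs(value) - threshold
    return math.copysign(magnitude, value) if magnitude > 0 else 0.0


def mask_informed_denoise(signal, reliability, max_threshold):
    """De-noise with one threshold per detail coefficient.

    `reliability` holds one value in [0, 1] per detail coefficient
    (1 = speech-dominated, 0 = noise-dominated). The linear mapping
    threshold = (1 - reliability) * max_threshold is a hypothetical
    stand-in for the paper's mask-derived thresholds.
    """
    approx, detail = haar_dwt(signal)
    thresholds = [(1.0 - r) * max_threshold for r in reliability]
    shrunk = [soft_threshold(d, t) for d, t in zip(detail, thresholds)]
    return haar_idwt(approx, shrunk)


# The first pair is marked unreliable and is smoothed; the second is left untouched.
denoised = mask_informed_denoise([1.0, 3.0, 5.0, 5.0], [0.0, 1.0], max_threshold=10.0)
```

Because the threshold varies per coefficient rather than being a single global value, regions flagged as noise-dominated are shrunk heavily while regions flagged as reliable pass through unchanged, which is the property that makes thresholding usable when the noise is not stationary.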

  • Publication date: 2015-3
  • Affiliation: McGill