Average framing linear prediction coding with wavelet transform for text-independent speaker identification system

Authors: Daqrouq Khaled*; Al Azzawi Khalooq Y
Source: Computers & Electrical Engineering, 2012, 38(6): 1467-1479.
DOI: 10.1016/j.compeleceng.2012.04.014

Abstract

This work presents an average framing linear prediction coding (AFLPC) technique for text-independent speaker identification systems. Conventionally, linear prediction coding (LPC) has been applied in speech recognition applications. In this study, however, a combination of modified LPC with the wavelet transform (WT), termed AFLPC, is proposed for speaker identification. The investigation procedure is based on feature extraction and voice classification. In the feature extraction phase, the distinguishing characteristics of the speaker's vocal tract are extracted using the AFLPC technique. The size of a speaker's feature vector can be optimized in terms of an acceptable recognition rate by means of a genetic algorithm (GA); on this basis, an LPC order of 30 is found to give the best system performance. In the classification phase, a probabilistic neural network (PNN) is applied because of its rapid response and ease of implementation. In the practical investigation, the performance of different wavelet transforms in conjunction with AFLPC is compared. In addition, the capability of the proposed system is examined by comparing it with other systems proposed in the literature. The PNN classifier achieves a better recognition rate (97.36%) with the feature extraction method that combines the wavelet packet (WP) with AFLPC, termed WPLPCF. The proposed system is also analyzed in additive white Gaussian noise (AWGN) and real noise environments, with recognition rates of 58.56% at 0 dB and 70.52% at 5 dB. The recognition rates of the Gaussian mixture model (GMM) over the whole database reached their lowest value when the number of training samples was small.
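The abstract does not define AFLPC in detail, so the following is only a minimal sketch of one plausible reading: compute LPC coefficients per frame with the autocorrelation method (Levinson-Durbin recursion) and average them across frames into a single fixed-length feature vector. The function names, the frame length, the hop size, and the Hamming window are all assumptions made for illustration. The wavelet decomposition, the GA-based order selection, and the PNN classifier described in the abstract are omitted.

```python
import math


def lpc_levinson(frame, order):
    """LPC coefficients a[1..order] for x[n] ~ sum_k a[k] * x[n-k].

    Uses the autocorrelation method with a Hamming window (an assumption;
    the abstract does not name a window).
    """
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    # Autocorrelation at lags 0..order
    r = [
        sum(windowed[i] * windowed[i + k] for i in range(n - k))
        for k in range(order + 1)
    ]
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame
            break
        # Reflection coefficient for step i
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]


def average_framing_lpc(signal, order=30, frame_len=512, hop=256):
    """Average the per-frame LPC vectors into one feature vector.

    order=30 follows the abstract; frame_len and hop are hypothetical.
    """
    starts = range(0, len(signal) - frame_len + 1, hop)
    per_frame = [lpc_levinson(signal[s:s + frame_len], order) for s in starts]
    if not per_frame:
        raise ValueError("signal is shorter than one frame")
    return [sum(c[k] for c in per_frame) / len(per_frame) for k in range(order)]
```

In a pipeline like the one the abstract outlines, a function of this kind would presumably be applied to each wavelet or wavelet packet sub-signal, with the resulting vectors concatenated before classification. That arrangement is likewise an assumption.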

  • Publication date: 2012-11
