Abstract

This work develops an efficient voice disorder classification system based on the discrete wavelet packet transform (DWPT), multi-class linear discriminant analysis (MC-LDA), and a multilayer neural network (ML-NN). The characteristics of normal and pathological voices are described by the energy and Shannon entropy extracted from the coefficients at the output nodes of the best wavelet packet tree with eight decomposition levels. The two wavelet packet-based feature sets, energy and Shannon entropy, are extracted separately, and each is reduced by MC-LDA to a two-dimensional feature vector. The experiments use 258 data samples comprising normal voices and speech signals impaired by three kinds of disorder: A-P squeezing, gastric reflux, and hyperfunction. On the Kay Elemetrics database, developed by the Massachusetts Eye and Ear Infirmary (MEEI), the system achieves average classification accuracies of 96.67% and 97.33% for the structures built on wavelet packet-based energy and entropy features, respectively. In these structures, the feature vectors are optimized by MC-LDA and then classified by the ML-NN. The results of confusion matrix and cross-validation tests show that the proposed voice pathology classification system improves classification significantly while keeping complexity low. The proposed tool could therefore be used for early detection of laryngeal pathology and for assessing vocal improvement following voice therapy in clinical settings.
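The feature extraction step described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the Haar wavelet stands in for whatever mother wavelet the authors used, the full wavelet packet tree is decomposed rather than a best tree chosen by a cost function, and the Shannon entropy is taken in its common non-normalized form, -Σ c² log(c²). The function names are hypothetical and chosen only for illustration. If the signal length is odd at any level, the trailing sample is dropped.

```python
import math


def haar_step(x):
    """One Haar analysis step: split x into approximation and detail halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(signal, levels):
    """Decompose every node at every level (full tree, an assumption here)
    and return the coefficient lists of the 2**levels terminal nodes."""
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def node_energy(coeffs):
    """Energy of a node: sum of squared coefficients."""
    return sum(v * v for v in coeffs)


def node_shannon_entropy(coeffs):
    """Non-normalized Shannon entropy, -sum(c^2 * log(c^2)), with 0*log(0) = 0."""
    total = 0.0
    for v in coeffs:
        p = v * v
        if p > 0.0:
            total -= p * math.log(p)
    return total


def extract_features(signal, levels):
    """Return one energy vector and one entropy vector, one value per leaf."""
    leaves = wavelet_packet_leaves(signal, levels)
    energies = [node_energy(c) for c in leaves]
    entropies = [node_shannon_entropy(c) for c in leaves]
    return energies, entropies
```

With eight decomposition levels, a full tree yields 256 terminal nodes, so each of the two feature vectors would have 256 entries before the MC-LDA reduction to two dimensions. The reduction and neural network stages are not sketched here.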

  • Publication date: 2014-3