Bispectral features and mean shift clustering for stress and emotion recognition from natural speech

Authors: Yogesh C K*; Hariharan M; Yuvaraj R; Ngadiran Ruzelita; Adom A H; Yaacob Sazali; Polat Kemal
Source: Computers & Electrical Engineering, 2017, 62: 676-691.
DOI: 10.1016/j.compeleceng.2017.01.024

Abstract

A new set of features and feature enhancement techniques is proposed to recognize emotion and stress from speech signals. The speech waveforms and the glottal waveforms (derived from the recorded emotional/stressed speech waveforms) were processed using a third-order statistic, the bispectrum, and 28 bispectrum-based features (14 from the speech waveforms and 14 from the glottal waveforms) were extracted. Mean shift clustering was used to enhance the discriminative ability of the extracted bispectral features (BSFs). Four classifiers were used to distinguish different emotional and stressed states. The performance of the proposed method was tested on three databases. Across the experiments, recognition rates ranged from 93.44% to 100% for the Berlin emotional speech database (BES), from 73.81% to 97.23% for the Surrey audio-visual expressed emotion database (SAVEE), and from 93.8% to 100% for the simulated domain of the speech under simulated and actual stress database (SUSAS) (recognition of multi-style speech under stress: neutral, loud, Lombard and anger), and reached 100% for the SUSAS actual domain (recognition of three levels of stress: high, medium and low). The results indicate that the proposed bispectral features and mean shift clustering are promising for recognizing emotion and stress from speech signals.
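
The two techniques named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The frame length, the single `mean_magnitude` feature, the flat kernel, the bandwidth and the toy data are all assumptions chosen for illustration. The abstract does not list the 28 features or say how the clusters are used to enhance them, so the sketch only estimates a bispectrum and locates cluster modes.

```python
import cmath
import math


def dft(frame):
    """Plain O(n^2) discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def bispectrum(signal, frame_len=16):
    """Direct estimate: average X(f1) * X(f2) * conj(X(f1 + f2)) over frames."""
    frames = [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    half = frame_len // 2
    acc = [[0j] * half for _ in range(half)]
    for frame in frames:
        mean = sum(frame) / frame_len
        spec = dft([x - mean for x in frame])  # remove the DC offset first
        for f1 in range(half):
            for f2 in range(half):
                acc[f1][f2] += spec[f1] * spec[f2] * spec[f1 + f2].conjugate()
    return [[v / len(frames) for v in row] for row in acc]


def mean_magnitude(bis):
    """Illustrative feature (an assumption, not one of the paper's 28 BSFs)."""
    values = [abs(v) for row in bis for v in row]
    return sum(values) / len(values)


def mean_shift(points, bandwidth, iters=50):
    """Flat-kernel mean shift: move each estimate to the mean of its neighbours."""
    modes = [list(p) for p in points]
    for _ in range(iters):
        for i, m in enumerate(modes):
            near = [p for p in points if math.dist(p, m) <= bandwidth]
            modes[i] = [sum(c) / len(near) for c in zip(*near)]
    return modes


# Toy signal with phase-locked components at bins 2, 3 and 5 (= 2 + 3).
signal = [
    math.cos(2 * math.pi * 2 * t / 16)
    + math.cos(2 * math.pi * 3 * t / 16)
    + math.cos(2 * math.pi * 5 * t / 16)
    for t in range(64)
]
bis = bispectrum(signal)
feature = mean_magnitude(bis)

# Hypothetical 2-D feature vectors forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
modes = mean_shift(points, bandwidth=1.0)
```

In this toy case the bispectrum magnitude peaks at the bin pair (2, 3), because the component at bin 5 is phase-coupled to the other two, and the five points collapse onto two modes. For real speech frames an FFT-based routine and a library mean shift implementation would be the more practical choice.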

  • Publication date: 2017-8