Artificial neural networks as speech recognisers for dysarthric speech: Identifying the best-performing set of MFCC parameters and studying a speaker-independent approach

Shahamiri Seyed Reza<sup>*</sup>; Salim Siti Salwah Binti

doi:10.1016/j.aei.2014.01.001

摘要

Dysarthria is a neurological impairment of controlling the motor speech articulators that compromises the speech signal. Automatic Speech Recognition (ASR) can be very helpful for speakers with dysarthria because the disabled persons are often physically incapacitated. Mel-Frequency Cepstral Coefficients (MFCCs) have been proven to be an appropriate representation of dysarthric speech, but the question of which MFCC-based feature set represents dysarthric acoustic features most effectively has not been answered. Moreover, most of the current dysarthric speech recognisers are either speaker-dependent (SD) or speaker-adaptive (SA), and they perform poorly in terms of generalisability as a speaker-independent (SI) model. First, by comparing the results of 28 dysarthric SD speech recognisers, this study identifies the best-performing set of MFCC parameters, which can represent dysarthric acoustic features to be used in Artificial Neural Network (ANN)-based ASR. Next, this paper studies the application of ANNs as a fixed-length isolated-word SI ASR for individuals who suffer from dysarthria. The results show that the speech recognisers trained by the conventional 12 coefficients MFCC features without the use of delta and acceleration features provided the best accuracy, and the proposed SI ASR recognised the speech of the unforeseen dysarthric evaluation subjects with word recognition rate of 68.38%.

出版日期2014-1

全文

访问全文

收藏分享被引(54) 浏览

更新时间：2024-04-14 04:41

Artificial neural networks as speech recognisers for dysarthric speech: Identifying the best-performing set of MFCC parameters and studying a speaker-independent approach

摘要

全文

产品服务

站内浏览

服务支持

联系方式

科研之友