Development of a Low-Cost, Noninvasive, Portable Visual Speech Recognition Program

Kohlberg Gavriel D; Gal Ya'akov; Lalwani Anil K*

doi:10.1177/0003489416650689

Abstract

Objectives: Loss of speech following tracheostomy and laryngectomy severely limits communication to simple gestures and facial expressions that are largely ineffective. To facilitate communication in these patients, we seek to develop a low-cost, noninvasive, portable, and simple visual speech recognition program (VSRP) to convert articulatory facial movements into speech.

Methods: A Microsoft Kinect-based VSRP was developed to capture spatial coordinates of lip movements and translate them into speech. The articulatory speech movements associated with 12 sentences were used to train an artificial neural network classifier. The accuracy of the classifier was then evaluated on a separate, previously unseen set of articulatory speech movements.

Results: The VSRP was successfully implemented and tested in 5 subjects. It achieved an accuracy rate of 77.2% (65.0%-87.6% for the 5 speakers) on a 12-sentence data set. The mean time to classify an individual sentence was 2.03 milliseconds (1.91-2.16).

Conclusion: We have demonstrated the feasibility of a low-cost, noninvasive, portable VSRP based on Kinect to accurately predict speech from articulation movements in clinically trivial time. This VSRP could be used as a novel communication device for aphonic patients.
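
The train-then-evaluate procedure summarized under Methods can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the recordings are synthetic, the four coordinates per frame and the resampling to 16 frames are hypothetical choices, and a single-layer softmax network stands in for the artificial neural network, whose architecture the abstract does not describe.

```python
import math
import random


def resample(frames, n_frames):
    """Linearly resample per-frame coordinate lists to n_frames (n_frames >= 2)."""
    if len(frames) == 1:
        return [list(frames[0]) for _ in range(n_frames)]
    out = []
    for i in range(n_frames):
        pos = i * (len(frames) - 1) / (n_frames - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(frames) - 1)
        t = pos - lo
        out.append([(1 - t) * a + t * b for a, b in zip(frames[lo], frames[hi])])
    return out


def to_feature_vector(frames, n_frames=16):
    """Flatten a resampled recording into one fixed-length vector."""
    return [v for frame in resample(frames, n_frames) for v in frame]


def synthetic_recording(label, rng, n_coords=4):
    """Hypothetical stand-in for captured lip coordinates (not real Kinect data)."""
    n = rng.randint(30, 60)  # recordings differ in length, as utterances would
    freq = 0.5 * (label + 1) * math.pi
    return [[math.sin(freq * t / (n - 1) + c) + rng.gauss(0, 0.05)
             for c in range(n_coords)] for t in range(n)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class SoftmaxNet:
    """Single-layer network with a softmax output (assumed architecture)."""

    def __init__(self, n_in, n_classes, seed=0):
        self.rng = random.Random(seed)
        self.w = [[self.rng.uniform(-0.01, 0.01) for _ in range(n_in)]
                  for _ in range(n_classes)]
        self.b = [0.0] * n_classes

    def predict_proba(self, x):
        return softmax([sum(wi * xi for wi, xi in zip(row, x)) + b
                        for row, b in zip(self.w, self.b)])

    def predict(self, x):
        p = self.predict_proba(x)
        return p.index(max(p))

    def fit(self, xs, ys, epochs=30, lr=0.01):
        order = list(range(len(xs)))
        for _ in range(epochs):
            self.rng.shuffle(order)  # stochastic gradient descent, one sample at a time
            for i in order:
                p = self.predict_proba(xs[i])
                for k, row in enumerate(self.w):
                    g = p[k] - (1.0 if k == ys[i] else 0.0)  # cross-entropy gradient
                    for j, xj in enumerate(xs[i]):
                        row[j] -= lr * g * xj
                    self.b[k] -= lr * g


def run_demo(n_sentences=12, train_per=8, test_per=4, seed=0):
    """Train on one set of recordings, then score a separate, unseen set."""
    rng = random.Random(seed)

    def dataset(per):
        xs, ys = [], []
        for label in range(n_sentences):
            for _ in range(per):
                xs.append(to_feature_vector(synthetic_recording(label, rng)))
                ys.append(label)
        return xs, ys

    train_x, train_y = dataset(train_per)
    test_x, test_y = dataset(test_per)
    net = SoftmaxNet(len(train_x[0]), n_sentences)
    net.fit(train_x, train_y)
    hits = sum(net.predict(x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic data: {run_demo():.3f}")
```

The accuracy this sketch prints describes only the synthetic data and is not comparable to the 77.2% reported above. The one point it shares with the Methods summary is the evaluation design: the classifier is scored on recordings it never saw during training.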

Publication date: 2016-09
