Audio-visual speech recognition techniques in augmented reality environments

Mirzaei Mohammad Reza<sup>*</sup>; Ghorshi Seyed; Mortazavi Mohammad

doi:10.1007/s00371-013-0841-1

Abstract

Many recent studies show that Augmented Reality (AR) and Automatic Speech Recognition (ASR) technologies can help people with disabilities. However, most of these studies have been confined to their own specialized fields. Audio-Visual Speech Recognition (AVSR) is one of the advances in ASR technology; it combines audio, video, and facial expressions to capture a narrator's voice. In this paper, we combine AR and AVSR technologies to build a new system that helps deaf and hard-of-hearing people. The proposed system captures a narrator's speech instantly, converts it into readable text, and shows the text directly on an AR display, so deaf people can easily read what the narrator says. In addition, people do not need to learn sign language to communicate with deaf people. The evaluation results show that this system has a lower word error rate than ASR and Visual Speech Recognition (VSR) under different noisy conditions. Furthermore, the results show that using AVSR techniques improves the recognition accuracy of the system in noisy places. Finally, a survey of 100 deaf people shows that more than 80% of them are very interested in using our system as an assistant on portable devices to communicate with people.

Publication date: 2014-03

