Abstract

Recently, researchers have paid close attention to human emotions in order to develop intelligent human-machine interfaces. Emotion extraction that considers only a single modality, however, has limitations, because each modality has its own weaknesses with respect to different emotions. Multi-modal sensing that mimics human perception is therefore very important for improving extraction performance. We propose a multi-modal emotion extraction system based on fuzzy inference whose rules are understandable to human beings. The system has a module-based hierarchical architecture in which lower modules handle emotion extraction independently for each modality, while an upper module integrates the results from all modalities. Parallel processing is necessary for real-time multi-modal emotion extraction. We have implemented the proposed system on the "Cell Broadband Engine (TM)", a multi-core processor with cores specialized for stream data processing. In this paper, we demonstrate the multi-modal emotion extraction system with two modalities, namely facial expressions and voice. Furthermore, to improve system performance, we propose a method to optimize the fuzzy rules for facial expressions and voice using SOM clustering and statistical methods, respectively. The performance and validity of the proposed system are discussed based on experimental results for the extraction of six basic emotions.
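As a rough illustration of the two-layer architecture summarized above (not the authors' implementation), the sketch below has lower-layer modules that map per-modality features to fuzzy scores for the six basic emotions, and an upper-layer module that fuses them. The feature names, membership parameters, rules, and fusion weights are all hypothetical placeholders.

```python
# Minimal sketch of a hierarchical multi-modal fuzzy emotion extractor.
# Assumptions: triangular membership functions, min for rule AND,
# weighted averaging for modality fusion. All rules are illustrative.

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def tri(x, a, b, c):
    """Triangular fuzzy membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


def face_module(f):
    """Lower layer: fuzzy rules on hypothetical facial-expression features."""
    return {
        "happiness": tri(f["mouth_corner_lift"], 0.2, 0.8, 1.0),
        "surprise":  tri(f["eye_opening"], 0.5, 0.9, 1.0),
        "anger":     tri(f["brow_lowering"], 0.3, 0.8, 1.0),
        "sadness":   tri(f["mouth_corner_drop"], 0.3, 0.7, 1.0),
        # AND of two antecedents realized as min (Mamdani-style inference).
        "fear":      min(tri(f["eye_opening"], 0.4, 0.8, 1.0),
                         tri(f["brow_lowering"], 0.2, 0.6, 1.0)),
        "disgust":   tri(f["nose_wrinkle"], 0.3, 0.7, 1.0),
    }


def voice_module(f):
    """Lower layer: fuzzy rules on hypothetical prosodic voice features."""
    return {
        "anger":     min(tri(f["energy"], 0.5, 0.9, 1.0), tri(f["pitch"], 0.5, 0.8, 1.0)),
        "happiness": min(tri(f["pitch"], 0.4, 0.7, 1.0), tri(f["rate"], 0.4, 0.7, 1.0)),
        "sadness":   min(tri(f["energy"], 0.0, 0.2, 0.5), tri(f["rate"], 0.0, 0.3, 0.6)),
        "surprise":  tri(f["pitch"], 0.6, 0.9, 1.0),
        "fear":      tri(f["rate"], 0.6, 0.9, 1.0),
        "disgust":   tri(f["energy"], 0.2, 0.4, 0.6),
    }


def integrate(per_modality, weights):
    """Upper layer: weighted fusion of per-modality fuzzy emotion scores."""
    return {
        emo: sum(weights[m] * scores.get(emo, 0.0)
                 for m, scores in per_modality.items())
        for emo in EMOTIONS
    }


if __name__ == "__main__":
    face = face_module({"mouth_corner_lift": 0.7, "mouth_corner_drop": 0.0,
                        "eye_opening": 0.5, "brow_lowering": 0.1,
                        "nose_wrinkle": 0.0})
    voice = voice_module({"pitch": 0.6, "energy": 0.4, "rate": 0.5})
    fused = integrate({"face": face, "voice": voice},
                      {"face": 0.6, "voice": 0.4})
    print(max(fused, key=fused.get), fused)
```

In a parallel implementation such as the Cell-based one mentioned above, each lower-layer module would run on its own core, with the upper-layer fusion collecting their streamed outputs.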

  • Publication date: 2011-8