Abstract

In this paper, we propose novel hybrid approaches to annotate videos in valence and arousal spaces by using users' electroencephalogram (EEG) signals and video content. Firstly, several audio and visual features are extracted from video clips, and five frequency features are extracted from each channel of the EEG signals. Secondly, statistical analyses are conducted to explore the relationships among emotional tags, EEG features, and video features. Thirdly, three Bayesian Networks are constructed to annotate videos by combining the video and EEG features at independent feature-level fusion, decision-level fusion, and dependent feature-level fusion. To evaluate the effectiveness of our approaches, we designed and conducted a psychophysiological experiment to collect data, including emotion-inducing video clips, users' EEG responses while watching the selected video clips, and emotional video tags collected through participants' self-reports after watching each clip. The experimental results show that the proposed fusion methods outperform conventional emotional tagging methods that use either video or EEG features alone in both valence and arousal spaces. Moreover, the EEG features help narrow the semantic gap between the low-level video features and the users' high-level emotional tags.
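As an illustration of the EEG feature-extraction step, the following is a minimal sketch (not the authors' code) of computing five frequency-band power features per EEG channel. It assumes the standard delta/theta/alpha/beta/gamma bands and a Welch power-spectral-density estimator; the abstract does not specify the exact bands or estimator used, so the band edges and function names here are illustrative assumptions.

```python
import numpy as np
from scipy.signal import welch

# Assumed band edges (Hz); the paper only states "five frequency features per channel".
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def eeg_band_powers(eeg, fs):
    """eeg: (n_channels, n_samples) array; returns (n_channels, 5) band powers."""
    # Welch PSD estimate per channel (2-second segments).
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2, axis=-1)
    feats = []
    for lo, hi in BANDS.values():
        idx = (freqs >= lo) & (freqs < hi)
        # Integrate the PSD over the band to get the band power.
        feats.append(np.trapz(psd[:, idx], freqs[idx], axis=-1))
    return np.column_stack(feats)

# Usage example: 32 channels of 10 s EEG sampled at 256 Hz (synthetic data).
rng = np.random.default_rng(0)
features = eeg_band_powers(rng.standard_normal((32, 2560)), fs=256)
print(features.shape)  # (32, 5)
```

These per-channel band powers would then be concatenated with the audio-visual features (feature-level fusion) or fed into separate classifiers whose outputs are combined (decision-level fusion), as described in the abstract.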

  • Publication date: 2014-9