A novel recurrent hybrid network for feature fusion in action recognition

Yu, Sheng; Cheng, Yun; Xie, Li; Luo, Zhiming; Huang, Min; Li, Shaozi<sup>*</sup>

doi:10.1016/j.jvcir.2017.09.007

Abstract

Action recognition in video is one of the most important and challenging tasks in computer vision. How efficiently spatial-temporal information is combined to represent a video plays a crucial role in action recognition. In this paper, a recurrent hybrid network architecture is designed for action recognition by fusing multi-source features: two-stream CNNs for learning semantic features, a two-stream single-layer LSTM for learning long-term temporal features, and an Improved Dense Trajectories (IDT) stream for learning short-term temporal motion features. To mitigate overfitting on small-scale datasets, a video data augmentation method is used to increase the amount of training data, and a two-step training strategy is adopted to train the recurrent hybrid network. Experimental results on two challenging datasets, UCF-101 and HMDB-51, demonstrate that the proposed method reaches state-of-the-art performance.
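The abstract does not state how the streams are combined, so the following is only a minimal sketch of one common option, weighted late fusion of per-stream class scores. The stream names, the equal weights, the toy scores, and the helper `fuse_streams` are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_scores, weights):
    """Weighted average of per-stream class probabilities (late fusion).

    stream_scores: dict mapping a stream name to its list of raw class scores.
    weights: dict mapping a stream name to a non-negative fusion weight
             (assumed values for illustration, not from the paper).
    """
    total_w = sum(weights[name] for name in stream_scores)
    num_classes = len(next(iter(stream_scores.values())))
    fused = [0.0] * num_classes
    for name, scores in stream_scores.items():
        probs = softmax(scores)
        for c, p in enumerate(probs):
            fused[c] += weights[name] * p / total_w
    return fused


# Toy scores for 3 classes, one entry per kind of stream named in the abstract.
scores = {
    "cnn": [2.0, 1.0, 0.1],
    "lstm": [1.5, 1.8, 0.2],
    "idt": [0.3, 2.5, 0.4],
}
weights = {"cnn": 1.0, "lstm": 1.0, "idt": 1.0}  # assumed equal weighting

fused = fuse_streams(scores, weights)
predicted = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted)
```

Averaging probabilities rather than raw scores keeps streams with different score ranges comparable; in practice the weights would likely be tuned on a validation split.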

Publication date: 2017-11
Affiliations: Hunan University of Humanities, Science and Technology; Xiamen University

