A review of Convolutional-Neural-Network-based action recognition

Yao, Guangle; Lei, Tao<sup>*</sup>; Zhong, Jiandan

doi:10.1016/j.patrec.2018.05.018

Abstract

Video action recognition is widely applied in video indexing, intelligent surveillance, multimedia understanding, and other fields. Recently, it has been greatly improved by incorporating the learning of deep information using Convolutional Neural Networks (CNNs). This motivated us to review the notable CNN-based action recognition works. Because CNNs are primarily designed to extract 2D spatial features from still images, whereas videos are naturally viewed as 3D spatiotemporal signals, the core issue in extending CNNs from images to video is the exploitation of temporal information. We divide the solutions for exploiting temporal information into three strategies: 1) 3D CNN; 2) taking motion-related information as the CNN input; and 3) fusion. In this paper, we present a comprehensive review of CNN-based action recognition methods according to these strategies. We also discuss action recognition performance on recent large-scale benchmarks, as well as the limitations and future research directions of CNN-based action recognition. This paper offers an objective and clear review of CNN-based action recognition and provides a guide for future research.

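Publication date: 2019-02-01
Affiliations: Institute of Optics and Electronics, Chinese Academy of Sciences; University of Electronic Science and Technology of China; University of Chinese Academy of Sciences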

