Action Parsing-Driven Video Summarization Based on Reinforcement Learning

Lei, Jie; Luan, Qiao; Song, Xinhui; Liu, Xiao; Tao, Dapeng; Song, Mingli*

doi:10.1109/TCSVT.2018.2860797

Abstract

Managing, storing, and indexing large numbers of videos is an urgent problem. Although many video summarization models achieve good results, models based on low-level features cannot capture important semantic information, and models based on semantic analysis require accompanying text descriptions that do not exist for most videos. Consequently, mining the semantic information contained in the video itself is a more feasible approach. In this paper, we propose an action parsing-driven video summarization model based on reinforcement learning. The model consists of two parts: video cutting by action parsing, and video summarization based on reinforcement learning. In the first part, a sequential multiple instance learning model is trained with weakly annotated data, which addresses both the time cost of full annotation and the ambiguity of weak annotation. In the second part, we design a deep recurrent neural network-based video summarization model that selects the frames that best distinguish an action from other actions. The quality of the extracted key frames can then be evaluated by categorization accuracy. Experiments and comparisons with state-of-the-art methods demonstrate the advantage of the proposed approach.
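The second part of the abstract, selecting key frames and scoring them by categorization accuracy, can be illustrated with a toy sketch. This is not the authors' model: the abstract describes a deep recurrent network, whereas the sketch below assumes independent per-frame selection logits trained with a plain REINFORCE update, and a hypothetical nearest-centroid classifier standing in for the action categorizer. All function names, the feature vectors, and the hyperparameters are illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def classify(summary_feature, centroids):
    """Hypothetical classifier: index of the nearest action centroid."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(summary_feature, c))
    return min(range(len(centroids)), key=lambda k: dist(centroids[k]))


def reward(frames, picks, label, centroids):
    """1.0 if the mean feature of the picked frames is classified as `label`."""
    chosen = [f for f, p in zip(frames, picks) if p]
    if not chosen:
        return 0.0  # an empty summary cannot be categorized
    mean = [sum(col) / len(chosen) for col in zip(*chosen)]
    return 1.0 if classify(mean, centroids) == label else 0.0


def train_selector(frames, label, centroids, steps=500, lr=0.5, seed=0):
    """REINFORCE over per-frame Bernoulli logits (an assumed stand-in for the RNN)."""
    rng = random.Random(seed)
    logits = [0.0] * len(frames)
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(steps):
        probs = [sigmoid(z) for z in logits]
        picks = [1 if rng.random() < p else 0 for p in probs]
        r = reward(frames, picks, label, centroids)
        # For a Bernoulli sample, d log-prob / d logit = (pick - prob).
        for i, (a, p) in enumerate(zip(picks, probs)):
            logits[i] += lr * (r - baseline) * (a - p)
        baseline = 0.9 * baseline + 0.1 * r
    return [sigmoid(z) for z in logits]


# Toy clip of action 0: two distinctive frames and two frames that resemble action 1.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
centroids = [[1.0, 0.0], [0.0, 1.0]]
probs = train_selector(frames, label=0, centroids=centroids)
```

In this toy setting the selection probabilities of the two distinctive frames should rise and those of the confusable frames should fall, because only summaries dominated by distinctive frames are categorized correctly and rewarded.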

Publication date: 2019-07
Affiliations: Yunnan University; Zhejiang University

Cited by: 51
