Action snapshot with single pose and viewpoint

Authors: Wang, Meili; Guo, Shihui*; Liao, Minghong; He, Dongjian; Chang, Jian; Zhang, Jianjun
Source: The Visual Computer, 2019, 35(4): 507-520.
DOI: 10.1007/s00371-018-1479-9

Abstract

Many art forms present visual content as a single image captured from a particular viewpoint. Selecting a meaningful, representative moment from an action performance is difficult, even for an experienced artist, yet a well-chosen image can often tell a story on its own. This matters for a range of narrative scenarios, such as journalists reporting breaking news, scholars presenting their research, or artists crafting artworks. We address the underlying structures and mechanisms of pictorial narrative with a new concept, the action snapshot, which automates the generation of a meaningful snapshot (a single still image) from an input sequence of scenes. The input dynamic scenes may include several fully animated, interacting characters. We propose a novel method based on information theory to quantitatively evaluate the information contained in a pose. Taking the selected top postures as input, a convolutional neural network is constructed and trained with deep reinforcement learning to select the single viewpoint that maximally conveys the information of the sequence. User studies were conducted to compare the computer-selected poses and viewpoints with those chosen by human participants. The results show that the proposed method can effectively assist the selection of the most informative snapshot from animation-intensive scenarios.
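
As a rough illustration of the information-theoretic pose scoring described in the abstract, the sketch below scores each frame of a joint-angle sequence by the Shannon entropy of its angle distribution and keeps the top-scoring candidates. The abstract does not give the authors' actual formulation, so the joint-angle representation, the histogram binning, and the entropy-based scoring rule here are all illustrative assumptions rather than the paper's method.

```python
# Minimal sketch of an entropy-style pose-informativeness score.
# NOTE: the representation (per-frame joint angles), the binning, and the
# scoring rule are assumptions for illustration; they are not taken from
# the paper, whose exact formulation is not stated in the abstract.
import numpy as np

def pose_information(pose_angles, num_bins=16):
    """Score one pose by the Shannon entropy of its joint-angle histogram.

    pose_angles: 1-D array of joint angles (radians) for a single frame.
    In this sketch, a wider spread of angles gives higher entropy, which
    is treated as 'more informative'.
    """
    hist, _ = np.histogram(pose_angles, bins=num_bins,
                           range=(-np.pi, np.pi), density=True)
    total = hist.sum()
    if total == 0:
        return 0.0
    p = hist / total
    p = p[p > 0]                       # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def top_k_poses(sequence, k=5):
    """Return indices of the k highest-scoring frames in a (T, J) angle sequence."""
    scores = np.array([pose_information(frame) for frame in sequence])
    return np.argsort(scores)[::-1][:k]

# Example: a synthetic 200-frame sequence with 20 joint angles per frame.
rng = np.random.default_rng(0)
sequence = rng.uniform(-np.pi, np.pi, size=(200, 20))
print(top_k_poses(sequence, k=3))
```

In the paper's pipeline, frames selected by such a scoring step would then be passed to the CNN viewpoint-selection stage trained with deep reinforcement learning; that stage is not sketched here.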