A bag-of-regions representation for video classification

Authors: Choi Min Kook; Wang Ziyu; Lee Hyun Gyu; Lee Sang Chul*
Source: Multimedia Tools and Applications, 2016, 75(5): 2453-2472.
DOI: 10.1007/s11042-015-2876-y

Abstract

A bag-of-regions (BoR) representation of a video sequence is a spatio-temporal tessellation for use in high-level applications such as video classification and action recognition. We obtain a BoR representation of a video sequence by extracting regions that exist in the majority of its frames and largely correspond to a single object. First, the significant regions are obtained using unsupervised frame segmentation based on the JSEG method. A tracking algorithm for splitting and merging the regions is then used to generate a relational graph of all regions in the segmented sequence. Finally, we perform a connectivity analysis on this graph to select the most significant regions, which are then used to create a high-level representation of the video sequence. We evaluated our representation using an SVM classifier for video classification and achieved about 85% average precision on the UCF50 dataset.
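
The connectivity-analysis step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the graph format or the selection rule, so the code assumes that each node is a hypothetical `(frame_index, region_id)` pair, that each edge is a tracking link between regions in neighbouring frames (a split or merge simply gives a node several links), and that a connected component counts as "significant" when it touches more than a chosen fraction of the frames. The function name `significant_components` and the parameter `min_fraction` are likewise illustrative.

```python
from collections import defaultdict


def significant_components(edges, num_frames, min_fraction=0.5):
    """Return connected components covering more than min_fraction of frames.

    Assumed input format (not taken from the paper): each edge links two
    nodes of the form (frame_index, region_id). Regions with no tracking
    link at all never appear in `edges` and are therefore ignored.
    """
    # Build an undirected adjacency list from the tracking links.
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    kept = []
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        # Count the distinct frames this component touches.
        frames_covered = {frame for frame, _ in component}
        if len(frames_covered) / num_frames > min_fraction:
            kept.append(component)
    return kept


# Toy sequence of 4 frames: region "a" splits in frame 2 and merges again
# in frame 3; region "b" is tracked only across frames 0 and 1.
edges = [
    ((0, "a"), (1, "a")),
    ((1, "a"), (2, "a1")),
    ((1, "a"), (2, "a2")),
    ((2, "a1"), (3, "a")),
    ((2, "a2"), (3, "a")),
    ((0, "b"), (1, "b")),
]
kept = significant_components(edges, num_frames=4)
```

In this toy input the component built from region "a" spans all four frames and is retained, while the component for region "b" covers only half of the frames and is dropped under the strict majority threshold assumed here.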

  • Publication date: 2016-3