Abstract

From wearable devices to depth cameras, researchers have exploited various multimodal data to recognize human actions for applications such as video gaming, education, and healthcare. Although many successful techniques have been presented in the literature, most current approaches focus on statistical or local spatiotemporal features and do not explicitly explore the temporal dynamics of the sensor data. However, human action data contain rich temporal structure information that can characterize the unique underlying patterns of different action categories. From this perspective, we propose a novel temporal order modeling approach to human action recognition. Specifically, we explore subspace projections to extract the latent temporal patterns from different human action sequences. The temporal order of these patterns is compared, and the index of the pattern that appears first is used to encode the entire sequence. This process is repeated multiple times and produces a compact feature vector representing the temporal dynamics of the sequence. Human action recognition can then be solved efficiently by nearest neighbor search based on the Hamming distance between these compact feature vectors. We further introduce a sequential optimization algorithm to learn the optimized projections that preserve the pairwise label similarity of the action sequences. Experimental results on two public human action datasets demonstrate the superior performance of the proposed technique in both accuracy and efficiency.
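
The encoding and matching steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a pattern "appears" at the time step where its projection response peaks, and it takes the projections as given rather than learning them with the sequential optimization mentioned in the abstract. The names `encode_sequence`, `hamming_distance`, and `nearest_neighbor_label` are hypothetical.

```python
import numpy as np


def encode_sequence(seq, projections):
    """Encode one sequence as a vector of "first pattern" indices.

    seq:         array of shape (T, d), T time steps of d-dimensional sensor data.
    projections: array of shape (L, K, d), L groups of K projection vectors.
                 (Hypothetical layout; learned projections would go here.)
    Returns an integer array of shape (L,) with entries in 0..K-1.
    """
    # responses[l, t, k]: response of pattern k in group l at time step t.
    responses = np.einsum("td,lkd->ltk", seq, projections)
    # ASSUMPTION: a pattern "appears" at the time step of its peak response.
    peak_times = responses.argmax(axis=1)  # shape (L, K)
    # The code for each group is the index of the pattern that peaks earliest.
    return peak_times.argmin(axis=1)  # shape (L,)


def hamming_distance(code_a, code_b):
    """Number of positions at which two code vectors differ."""
    return int(np.count_nonzero(code_a != code_b))


def nearest_neighbor_label(query_code, train_codes, train_labels):
    """Label of the training code closest to the query in Hamming distance."""
    dists = np.count_nonzero(train_codes != query_code, axis=1)
    return train_labels[int(dists.argmin())]
```

With `L` groups of `K` patterns each, every sequence becomes a length-`L` vector of small integers regardless of its duration `T`, which is what makes the Hamming-distance search cheap.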

  • Publication date: 2017-5