Abstract

Multi-shot person re-identification in non-overlapping camera networks has become an important research area. Tackling this problem requires a person model that is robust and adaptive against occlusion and uncontrolled changes. In this paper, a new Multi-Scale Video Covariance (MS-VC) unsupervised approach is proposed that efficiently describes humans in motion and requires no labeled training data. The MS-VC approach computes features extracted from a new structured representation of a video sequence, called the Video Tree Structure (VIDTREST), and efficiently describes the behavioral biometrics and appearance of each person by combining spatio-temporal information into a fixed-size vector. The VIDTREST model captures moving regions of interest. In addition, it decreases the weight of color, which discards background noise and resolves cases of clothing similarity in the appearance models, among other changes. Furthermore, a fast algorithm is suggested that decomposes each sequence under VIDTREST, extracts its multi-scale features, and computes its covariance matrices in a single pass. The proposed method was evaluated on the CAVIAR and PRID datasets. Our experimental results outperform the recognition rates of existing unsupervised approaches in the state of the art.
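The abstract describes fixed-size covariance descriptors computed from image features. A minimal sketch of a generic region-covariance descriptor is shown below; the per-pixel feature set (position, intensity, gradient magnitudes) is a common choice in the covariance-descriptor literature and is an assumption here, not necessarily the exact feature set used by MS-VC:

```python
import numpy as np

def region_covariance(patch):
    """Covariance descriptor of a grayscale image patch (2-D array).

    Assumed per-pixel features: x, y, intensity, |Ix|, |Iy|.
    The paper's actual multi-scale feature set may differ.
    """
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # np.gradient on a 2-D array returns [d/drow, d/dcol]
    iy, ix = np.gradient(patch.astype(float))
    feats = np.stack([
        xs.ravel().astype(float),
        ys.ravel().astype(float),
        patch.ravel().astype(float),
        np.abs(ix).ravel(),
        np.abs(iy).ravel(),
    ])
    return np.cov(feats)  # 5x5 symmetric matrix, independent of patch size

def cov_vector(patch):
    """Flatten the symmetric covariance matrix to a fixed-size vector."""
    c = region_covariance(patch)
    return c[np.triu_indices_from(c)]  # 15 values for 5 features
```

Whatever the patch dimensions, the descriptor has a fixed size determined only by the number of per-pixel features, which is what makes covariance-based representations convenient for matching sequences of varying length.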

  • Publication date: 2017-10