Abstract

Moving human localization is the first prerequisite step in human activity analysis for video surveillance, and identifying human targets accurately and efficiently remains in high demand in computer vision research. Learning is often indispensable in contemporary moving human localization, because the unknown parameters of a proposed method must be properly tuned to guarantee the final localization performance. Deep learning techniques can facilitate this task, especially now that large volumes of surveillance video are commonly available. This study focuses on the metric learning problem in moving human localization and introduces, for the first time, a deep metric learning method based on multi-channel residual networks. Specifically, the deep metric learning problem in this method is solved within a ranking procedure using both the conventional stochastic gradient descent algorithm and a more efficient proximal gradient descent algorithm. Comprehensive experiments compare the new method with several other popular deep learning-based approaches. Qualitative and quantitative analyses are conducted from a statistical perspective to evaluate the localization outcomes of all compared methods on two specific measurements. The analysis suggests that the localization performance of the new method is promising.