Abstract

Automatic object classification in traffic scene videos is an important problem in intelligent visual surveillance, with potential uses across a wide range of security applications. The problem is challenging for three reasons. First, regions of interest in surveillance video are small and of low resolution because of the limited capability of conventional surveillance cameras. Second, intra-class variation is large because of changes in viewing angle, lighting conditions, and environment. Third, practical applications require algorithms that run in real time. In this paper, we evaluate the performance of local feature descriptors for automatic object classification in traffic scenes. Image intensity or gradient information is used directly to construct feature vectors from regions of interest extracted by motion detection. This strategy is considerably more efficient than more complicated texture features. Beyond analyzing and evaluating the individual feature descriptors, we also fuse different scales and features to achieve better performance. Extensive experiments demonstrate the efficiency and effectiveness of this strategy, as well as its robustness to noise and to variations in viewing angle, lighting conditions, and environment.
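As a rough illustration of the descriptor construction summarized above, the following Python sketch builds a feature vector directly from the intensity and gradient magnitude of a region of interest and fuses two scales by concatenation. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the 8 and 16 pixel scales, nearest-neighbour resizing, and L2 normalization are all hypothetical choices, and the region of interest is assumed to have already been extracted by motion detection.

```python
import numpy as np


def resize_nearest(roi, size):
    """Resize a 2-D grayscale ROI to size x size by nearest-neighbour sampling."""
    h, w = roi.shape
    rows = np.arange(size) * h // size  # source row index for each output row
    cols = np.arange(size) * w // size  # source column index for each output column
    return roi[np.ix_(rows, cols)].astype(np.float64)


def l2_normalize(v, eps=1e-8):
    """Scale a vector to unit length; eps avoids division by zero (assumed choice)."""
    return v / (np.linalg.norm(v) + eps)


def intensity_descriptor(patch):
    """Raw pixel intensities, flattened and normalized."""
    return l2_normalize(patch.ravel())


def gradient_descriptor(patch):
    """Per-pixel gradient magnitude, flattened and normalized."""
    gy, gx = np.gradient(patch)  # derivatives along rows, then columns
    return l2_normalize(np.hypot(gx, gy).ravel())


def fused_descriptor(roi, scales=(8, 16)):
    """Concatenate intensity and gradient descriptors over several scales.

    The scales are illustrative assumptions, not values from the paper.
    """
    parts = []
    for s in scales:
        patch = resize_nearest(roi, s)
        parts.append(intensity_descriptor(patch))
        parts.append(gradient_descriptor(patch))
    return np.concatenate(parts)


if __name__ == "__main__":
    # Synthetic stand-in for a low-resolution ROI produced by motion detection.
    rng = np.random.default_rng(0)
    roi = rng.integers(0, 256, size=(37, 52)).astype(np.uint8)
    print(fused_descriptor(roi).shape)
```

The resulting vector could then be passed to any standard classifier. Concatenation is only one simple way to fuse scales and features, and the abstract does not specify which fusion scheme is used.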

Full text