Abstract

This paper addresses the problem of image classification using an improved bag of visual words model. In the bag of visual words model, a visual word corresponds to a set of features that describe local image information. These features are grouped into clusters. Each visual word is represented by the feature vector of a cluster centroid, and the set of clusters is called a codebook (also known as a visual dictionary). An image can therefore be regarded as a visual word frequency vector. The traditional bag of visual words model does not consider spatial information, which greatly limits its performance. We therefore propose a modified bag of visual words model that incorporates spatial information. We define a pairwise spatial histogram of similar patches by discretizing the spatial neighborhood into several bins within an angle range, so that every possible angle is assigned to one of the bins. Based on the improved bag of visual words model, an image is represented as a visual word frequency vector, and image classification is then performed by an SVM classifier. Finally, we conduct a series of experiments on two different datasets to verify the effectiveness of our algorithm. Experimental results show that the proposed method classifies images more accurately than the other methods compared.
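
As a rough illustration of the representation summarized above, the following Python sketch builds a visual word frequency vector and appends a pairwise angle histogram to it. It is not the authors' implementation. The function names, the rule that "similar patches" are patches assigned to the same visual word, the folding of angles into [0, pi), and the default of four bins are all assumptions made for this example. Codebook construction (clustering local features) is assumed to have been done already, so each patch arrives with a word index.

```python
import math
from collections import Counter
from itertools import combinations


def word_histogram(words, vocab_size):
    """Normalized visual word frequency vector for one image."""
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero for empty images
    return [counts.get(w, 0) / total for w in range(vocab_size)]


def pairwise_angle_histogram(positions, words, n_bins=4):
    """Normalized histogram of angles between pairs of similar patches.

    Assumption: two patches are "similar" when they share a visual word.
    Assumption: a pair has no direction, so angles are folded into [0, pi).
    """
    hist = [0] * n_bins
    for i, j in combinations(range(len(words)), 2):
        if words[i] != words[j]:
            continue
        dx = positions[j][0] - positions[i][0]
        dy = positions[j][1] - positions[i][1]
        angle = math.atan2(dy, dx) % math.pi
        # Clamp guards against rounding pushing the index to n_bins.
        bin_index = min(int(angle / math.pi * n_bins), n_bins - 1)
        hist[bin_index] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def image_descriptor(positions, words, vocab_size, n_bins=4):
    """Concatenate the word frequencies with the pairwise angle histogram."""
    return (word_histogram(words, vocab_size)
            + pairwise_angle_histogram(positions, words, n_bins))


# Toy input: four patches given as (x, y) centers with assigned word indices.
positions = [(0, 0), (10, 1), (2, 10), (5, 5)]
words = [0, 0, 0, 1]
descriptor = image_descriptor(positions, words, vocab_size=2)
```

The resulting vector has `vocab_size + n_bins` entries per image and could be passed to any SVM implementation for classification, for example `sklearn.svm.SVC` if scikit-learn is available.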