Abstract

This paper proposes a biologically inspired spatiotemporal saliency attention model based on entropy. The model comprises a dynamic attention phase and a static attention phase. In the dynamic attention phase, low-level visual features are extracted from the current frame and several previous frames, and every feature map is resized to several scales. For each feature and each scale, the feature maps from all frames are used to compute an entropy map. The entropy maps are normalized and fused into a dynamic saliency map. In the static attention phase, the same features are extracted from the current frame and multi-scale feature maps are formed by center-surround differences; these feature maps are then transformed into conspicuity maps, which are linearly combined into a static saliency map. The model determines salient regions from a spatiotemporal saliency map generated by integrating the dynamic and static saliency maps. Experimental results indicate that our model outperforms Shi's model and Marat's model when the frames contain noise or illumination changes, and outperforms Ban's model when the moving objects do not belong to the static salient regions.
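
The dynamic phase can be illustrated with a minimal sketch for a single feature at a single scale. The abstract does not specify how the entropy value is computed, so the per-pixel histogram over time, the bin count `bins`, the min-max normalization, and the fusion weight `weight` are all assumptions made for illustration; the function names are hypothetical and the static saliency map is taken as a given input rather than derived from center-surround differences.

```python
import math
from collections import Counter


def temporal_entropy_map(frames, bins=8, max_value=256):
    """Per-pixel Shannon entropy of quantized feature values across frames.

    `frames` is a list of equally sized 2-D lists of integer feature values in
    [0, max_value). Quantizing into `bins` levels is an assumption of this
    sketch, not a detail taken from the paper.
    """
    height, width = len(frames[0]), len(frames[0][0])
    n = len(frames)
    entropy = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Histogram of the quantized values this pixel takes over time.
            counts = Counter(f[y][x] * bins // max_value for f in frames)
            entropy[y][x] = -sum(
                (c / n) * math.log2(c / n) for c in counts.values()
            )
    return entropy


def normalize(saliency):
    """Min-max normalize a map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in saliency)
    hi = max(max(row) for row in saliency)
    if hi == lo:
        return [[0.0] * len(row) for row in saliency]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency]


def fuse(dynamic, static, weight=0.5):
    """Linearly combine dynamic and static maps (assumed fusion rule)."""
    return [
        [weight * d + (1 - weight) * s for d, s in zip(d_row, s_row)]
        for d_row, s_row in zip(dynamic, static)
    ]
```

In this sketch, a pixel whose quantized value changes from frame to frame receives high entropy and therefore high dynamic saliency, while a pixel that stays constant receives zero.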