Abstract

In this paper, we propose a novel deep spatial transformer convolutional neural network (Spatial Net) framework for detecting salient and abnormal regions in images. The proposed method is general and consists of three main parts: (1) contextual information in the image is captured by convolutional neural networks (CNNs) that automatically learn high-level features; (2) to better adapt the CNN model to the saliency task, we redesign the feature sub-network to output a 6-dimensional transformation matrix for affine transformation, following the spatial transformer network; in addition, several local features that effectively capture edge pixels of the salient area are extracted and embedded into the above model to reduce the tendency to highlight background regions; (3) finally, areas of interest are detected through a linear combination of global and local feature information. Experimental results demonstrate that Spatial Net achieves superior detection performance over state-of-the-art algorithms on two popular datasets, while requiring less memory and computation.
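To make the second part concrete, the sketch below shows how a spatial-transformer-style localization head can regress the 6 affine parameters from a feature map and warp that feature map accordingly. This is a minimal illustration assuming a PyTorch implementation; the channel sizes, the small localization sub-network, and the class name AffineSpatialTransformer are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineSpatialTransformer(nn.Module):
    """Hypothetical sketch: regress a 2x3 affine matrix (6 parameters)
    from CNN features and resample the features with it, in the spirit
    of the spatial transformer network described in the abstract."""

    def __init__(self, in_channels=64):
        super().__init__()
        # Localization sub-network: maps the feature map to 6 affine parameters.
        self.loc = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, 6),
        )
        # Initialize the regression layer to the identity transform
        # so training starts from an unwarped feature map.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, feats):
        theta = self.loc(feats).view(-1, 2, 3)   # the 6-dimensional transformation matrix
        grid = F.affine_grid(theta, feats.size(), align_corners=False)
        return F.grid_sample(feats, grid, align_corners=False)
```

Initializing the final layer to the identity transform is a common design choice for spatial transformers, since it lets the network learn deviations from "no warping" rather than recovering from an arbitrary initial distortion.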

Full text