Abstract

Aerial scene classification is one of the challenging problems in understanding high-resolution remote sensing images. A well-designed feature extractor and classifier can improve classification accuracy. In this letter, we construct three convolutional neural networks, each with a different receptive field size. More importantly, we propose a multilevel fusion method that makes its decision by incorporating information from different levels. The aerial image and two patches extracted from it are fed to the three networks, and a probability fusion model is then established for the final classification. The effectiveness of the proposed method is tested on AID, a more challenging data set that contains 10 000 high-resolution remote sensing images in 30 categories. Experimental results show that our multilevel fusion model achieves a significant improvement in classification accuracy over all state-of-the-art references.
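
As a rough illustration of the probability fusion step, the sketch below combines the class-probability vectors produced by three networks (one for the full image, two for the patches) and picks the class with the highest fused probability. The abstract does not state the fusion rule, so the weighted average used here is an assumption, and the names `fuse_probabilities` and `predict` as well as the example probabilities are hypothetical.

```python
from typing import List, Optional, Sequence


def fuse_probabilities(
    prob_vectors: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of per-network class probabilities.

    Assumption: the letter's actual fusion rule is not given in the
    abstract; a weighted average is used here only as a stand-in.
    """
    if not prob_vectors:
        raise ValueError("at least one probability vector is required")
    n_classes = len(prob_vectors[0])
    if any(len(p) != n_classes for p in prob_vectors):
        raise ValueError("all probability vectors must have the same length")
    if weights is None:
        # Default: every network contributes equally.
        weights = [1.0] * len(prob_vectors)
    if len(weights) != len(prob_vectors):
        raise ValueError("one weight per probability vector is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise by the weight sum so the fused vector still sums to 1.
    return [
        sum(w * p[c] for w, p in zip(weights, prob_vectors)) / total
        for c in range(n_classes)
    ]


def predict(
    prob_vectors: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = fuse_probabilities(prob_vectors, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical softmax outputs for a 3-class toy problem.
image_probs = [0.5, 0.3, 0.2]    # network fed the full aerial image
patch_a_probs = [0.2, 0.6, 0.2]  # network fed the first patch
patch_b_probs = [0.1, 0.7, 0.2]  # network fed the second patch

label = predict([image_probs, patch_a_probs, patch_b_probs])
```

In this toy case the full-image network alone would pick class 0, while the fused decision follows the two patch networks and picks class 1, which is the kind of disagreement a multilevel fusion is meant to resolve.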