Abstract

With the continuous development of remote sensing techniques, earth observation now produces an enormous amount of remote sensing data. How to mine relevant knowledge from this remote sensing big data accurately and efficiently remains an open research question. Existing model mining algorithms usually rely on a priori knowledge and therefore cannot meet the practical needs of remote sensing data mining. This paper proposes distributed correlation model mining from remote sensing big data based on gene expression programming (DCMM-GEP), combined with cloud computing, to find a better model for remote sensing big data. DCMM-GEP uses an abnormal value recognition algorithm based on residual sum of residual (AVR-RSR) and a global model generation algorithm for remote sensing big data based on linear least squares (GMGRS-LLS). Comparative experiments show that DCMM-GEP outperforms distributed correlation model mining based on genetic programming and on genetic algorithms in average time consumption, R-square value, and prediction accuracy. The results also show that DCMM-GEP achieves a good speed-up ratio and scale-up ratio as the number of computing nodes increases.