A parallel computing approach to fast geostatistical areal interpolation

作者:Guan Qingfeng*; Kyriakidis Phaedon C; Goodchild Michael F
来源:International Journal of Geographical Information Science, 2011, 25(8): 1241-1267.
DOI:10.1080/13658816.2011.563744

摘要

Areal interpolation is the procedure of using known attribute values at a set of (source) areal units to predict unknown attribute values at another set of (target) units. Geostatistical areal interpolation employs spatial prediction algorithms, that is, variants of Kriging, which explicitly incorporate spatial autocorrelation and scale differences between source and target units in the interpolation endeavor. When all the available source measurements are used for interpolation, that is, when a global search neighborhood is adopted, geostatistical areal interpolation is extremely computationally intensive. Interpolation in this case requires huge memory space and massive computing power, even with the dramatic improvement introduced by the spectral algorithms developed by Kyriakidis et al. (2005. Improving spatial data interoperability using geostatistical support-to-support interpolation. In: Proceedings of geoComputation. Ann Arbor, MI: University of Michigan) and Liu et al. (2006. Calculation of average covariance using fast Fourier transform (FFT). Menlo Park, CA: Stanford Center for Reservoir Forecasting, Petroleum Engineering Department, Stanford University) based on the fast Fourier transform (FFT). In this study, a parallel FFT-based geostatistical areal interpolation algorithm was developed to tackle the computational challenge of such problems. The algorithm includes three parallel processes: (1) the computation of source-to-source and source-to-target covariance matrices by means of FFT; (2) the QR factorization of the source-to-source covariance matrix; and (3) the computation of source-to-target weights via Kriging, and the subsequent computation of predicted attribute values for the target supports. Experiments with real-world datasets (i.e., predicting population densities of watersheds from population densities of counties in the Eastern Time Zone and in the continental United States) showed that the parallel algorithm drastically reduced the computing time to a practical length that is feasible for actual spatial analysis applications, and achieved fairly high speed-ups and efficiencies. Experiments also showed the algorithm scaled reasonably well as the number of processors increased and as the problem size increased.

  • 出版日期2011