Abstract

Modern digital data production methods, such as computer simulation and remote sensing, have vastly increased the size and complexity of data collected over spatial domains. Analysis of these large spatial datasets for scientific inquiry is typically carried out using the Gaussian process. However, nonstationary behavior and computational requirements for large spatial datasets can prohibit efficient implementation of Gaussian process models. To perform computationally feasible inference for large spatial data, we consider partitioning a spatial region into disjoint sets using hierarchical clustering of observations and finite differences as a measure of dissimilarity. Intuitively, directions with large finite differences indicate directions of rapid increase or decrease and are, therefore, appropriate for partitioning the spatial region. Spatial contiguity of the resulting clusters is enforced by only clustering Voronoi neighbors. Following spatial clustering, we propose a nonstationary Gaussian process model across the clusters, which allows the computational burden of model fitting to be distributed across multiple cores and nodes. The methodology is primarily motivated and illustrated by an application to the validation of digital temperature data over the city of Houston as well as simulated datasets. Supplementary materials for this article are available online.
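The partitioning step described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes 2-D locations in general position, takes Delaunay edges from `scipy.spatial.Delaunay` as the Voronoi-neighbor pairs, defines the finite difference between two neighbors as the absolute difference in observed values divided by their distance, and merges clusters by single linkage. The abstract does not specify the linkage rule, so that choice is an assumption, as are the function names `voronoi_neighbor_edges` and `cluster_by_finite_differences`.

```python
import numpy as np
from scipy.spatial import Delaunay


def voronoi_neighbor_edges(coords):
    """Return unique index pairs (i, j) joined by a Delaunay edge.

    For points in general position these pairs are the Voronoi neighbors.
    """
    tri = Delaunay(coords)
    edges = set()
    for simplex in tri.simplices:  # each row holds the 3 vertices of a triangle
        for a in range(3):
            i, j = int(simplex[a]), int(simplex[(a + 1) % 3])
            edges.add((min(i, j), max(i, j)))
    return np.array(sorted(edges))


def cluster_by_finite_differences(coords, values, n_clusters):
    """Spatially contiguous clustering (single linkage is an assumption).

    Only Voronoi neighbors may be merged; the dissimilarity of a neighbor
    pair is the finite difference |y_i - y_j| / ||s_i - s_j||.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    edges = voronoi_neighbor_edges(coords)
    dist = np.linalg.norm(coords[edges[:, 0]] - coords[edges[:, 1]], axis=1)
    fd = np.abs(values[edges[:, 0]] - values[edges[:, 1]]) / dist

    # Union-find over observations; every point starts as its own cluster.
    parent = np.arange(len(values))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    n_groups = len(values)
    # Merge across the smallest finite differences first, so that edges with
    # large finite differences are left as cluster boundaries.
    for k in np.argsort(fd, kind="stable"):
        if n_groups <= n_clusters:
            break
        ri, rj = find(edges[k, 0]), find(edges[k, 1])
        if ri != rj:
            parent[rj] = ri
            n_groups -= 1

    roots = np.array([find(i) for i in range(len(values))])
    _, labels = np.unique(roots, return_inverse=True)
    return labels
```

Calling `cluster_by_finite_differences(coords, values, n_clusters=5)` with an `(n, 2)` array of locations and a length-`n` array of observations returns one integer label per observation. A separate Gaussian process could then be fitted within each label group, which is the part of the workflow that can be distributed across cores.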

  • Publication date: 2017-2