Abstract

Spatio-temporal data quality assessment needs robust solutions that cover accuracy and go beyond it. Quality assessment usually concentrates on individual sensor observations, not on the sensors themselves or on site metadata such as location and timestamps. This paper develops and evaluates a representative, interpolation-based solution for assessing spatio-temporal data quality. We call our method SMART, short for Simple Mappings for the Approximation and Regression of Time series. For each pair of sites, a robust linear mapping is determined between their observations over a representative time period, and a quadratic estimate of error is derived from these linear mappings. The mappings combine to form a robust interpolator that outperforms other popular interpolators at estimating ground truth in the presence of bad data, and that can therefore be used to assess accuracy. The coefficients of the mappings and other derived measures also help to identify problematic sites, including sites with incorrect location or timestamp metadata. Applied to a real-world meteorological data set, the method identifies numerous problematic sites that had not otherwise been flagged as bad, as well as sites whose metadata is incorrect. We believe that real data sets like this one contain many such problems and that, in the absence of an approach like ours, they have largely gone unidentified. Our approach is novel in the simple but effective way that it accounts for spatial and temporal variation, and in addressing more than just accuracy.
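
The pairwise-mapping idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which robust estimator SMART uses or how the mappings are combined, so the sketch assumes a Theil-Sen fit (median of pairwise slopes) for each site pair and a plain median over the per-neighbour predictions. The function names `robust_line` and `estimate_target` are hypothetical, and the quadratic error estimate is omitted because the abstract gives no details of it.

```python
from itertools import combinations
from statistics import median


def robust_line(x, y):
    """Fit y ~ slope * x + intercept with a Theil-Sen estimator.

    Assumption: Theil-Sen stands in for the unspecified robust fit.
    x and y are paired observations from two sites over the same period.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs that would divide by zero
    ]
    if not slopes:
        raise ValueError("need at least two distinct x values")
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


def estimate_target(mappings, neighbour_values):
    """Combine per-neighbour predictions for a target site.

    Assumption: a plain median is used as the combiner. Each entry of
    mappings is a (slope, intercept) pair mapping one neighbour to the target.
    """
    return median(a * v + b for (a, b), v in zip(mappings, neighbour_values))
```

In this sketch a single bad observation in the fitting period leaves the fitted line unchanged, and one badly behaved neighbour does not move the combined estimate, which is the kind of robustness to bad data that the abstract describes.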

  • Publication date: 2018