摘要

Land use regression (LUR) models are widely employed in health studies to characterize chronic exposure to air pollution. The LUR is essentially an interpolation technique that employs the pollutant of interest as the dependent variable with proximate land use, traffic, and physical environmental variables used as independent predictors. Two major limitations with this method have not been addressed: (1) variable selection in the model building process, and (2) dealing with unbalanced repeated measures. In this paper, we address these issues with a modeling framework that implements the deletion/substitution/addition (DSA) machine learning algorithm that uses a generalized linear model to average over unbalanced temporal observations. Models were derived for fine particulate matter with aerodynamic diameter of 2.5 microns or less (PM2.5) and nitrogen dioxide (NO2) using monthly observations. We used 4119 observations at 108 sites and 15,301 observations at 138 sites for PM2.5 and NO2, respectively. We derived models with good predictive capacity (cross-validated-R-2 values were 0.65 and 0.71 for PM2.5 and NO2, respectively). By addressing these two shortcomings in current approaches to LUR modeling, we have developed a framework that minimizes arbitrary decisions during the model selection process. We have also demonstrated how to integrate temporally unbalanced data in a theoretically sound manner. These developments could have widespread applicability for future LUR modeling efforts.

  • 出版日期2013-10