A New Technique for Evaluating Land-use Regression Models and Their Impact on Health Effect Estimates

Authors: Wang Meng*; Brunekreef Bert; Gehring Ulrike; Szpiro Adam; Hoek Gerard; Beelen Rob
Source: Epidemiology, 2016, 27(1): 51-56.
DOI: 10.1097/EDE.0000000000000404

Abstract

Background: Leave-one-out cross-validation that fails to account for variable selection does not properly reflect prediction accuracy when the number of training sites is small. The impact on health effect estimates has rarely been studied. The objective of this study was to develop an improved validation procedure for land-use regression models with variable selection and investigate health effect estimates in relation to land-use regression model performance.

Methods: We randomly generated 10 training and test sets for nitrogen dioxide and particulate matter. For each training set, we developed models and evaluated them using a cross-holdout validation approach. Cross-holdout validation develops new models for each evaluation compared with refitting the model without variable selection, as in standard leave-one-out cross-validation. We also implemented holdout validation, which evaluates model predictions using independent test sets. We evaluated the relationship between cross-holdout validation and holdout validation R² and estimates of the association between air pollution and forced vital capacity in the Dutch birth cohort.

Results: Cross-holdout validation R² values were generally identical to holdout validation R² values, but were notably smaller than leave-one-out cross-validation R² values. Decreases in forced vital capacity in relation to air pollution exposure were larger for land-use regression models that had larger holdout validation and cross-holdout validation R² values rather than leave-one-out cross-validation R².

Conclusion: Cross-holdout validation accurately reflects predictive ability of land-use regression models and is a useful validation approach for small datasets. Land-use regression predictive ability in terms of holdout validation and cross-holdout validation rather than leave-one-out cross-validation was associated with the magnitude of health effect estimates in a case study.
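
The difference between the two validation schemes described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one site is held out at a time (the abstract does not state the fold size), uses a hypothetical selection rule (pick the single candidate predictor most correlated with the outcome), and runs on synthetic data. The function names `select_predictor`, `fit_line`, and `loo_r2` are invented for this illustration.

```python
import random

def mean(v):
    return sum(v) / len(v)

def abs_corr(x, y):
    # Absolute Pearson correlation between two equal-length lists.
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5

def select_predictor(X, y):
    # Hypothetical variable-selection rule (an assumption of this sketch):
    # keep the candidate column most strongly correlated with y.
    n_cols = len(X[0])
    return max(range(n_cols), key=lambda j: abs_corr([row[j] for row in X], y))

def fit_line(x, y):
    # Ordinary least squares for one predictor; returns (intercept, slope).
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def r2(y, pred):
    # R² computed as 1 - SSE / SST against the observed values.
    my = mean(y)
    sse = sum((a - p) ** 2 for a, p in zip(y, pred))
    sst = sum((a - my) ** 2 for a in y)
    return 1 - sse / sst

def loo_r2(X, y, reselect):
    # reselect=False: the predictor is chosen once using all sites and only
    #                 the coefficients are refit (standard LOOCV as described).
    # reselect=True:  the predictor is chosen again without the held-out
    #                 site (the cross-holdout idea).
    fixed = select_predictor(X, y)
    pred = []
    for i in range(len(y)):
        X_tr = X[:i] + X[i + 1:]
        y_tr = y[:i] + y[i + 1:]
        j = select_predictor(X_tr, y_tr) if reselect else fixed
        b0, b1 = fit_line([row[j] for row in X_tr], y_tr)
        pred.append(b0 + b1 * X[i][j])
    return r2(y, pred)

# Synthetic data: few sites, many candidate predictors, one true signal.
random.seed(0)
n_sites, n_candidates = 20, 40
X = [[random.gauss(0, 1) for _ in range(n_candidates)] for _ in range(n_sites)]
y = [row[0] + random.gauss(0, 1) for row in X]

loocv = loo_r2(X, y, reselect=False)
chv = loo_r2(X, y, reselect=True)
print(f"LOOCV R2 = {loocv:.2f}, cross-holdout R2 = {chv:.2f}")
```

With `reselect=False` the held-out site has already influenced which predictor was chosen, which is the reason the abstract gives for leave-one-out cross-validation not reflecting prediction accuracy when training sites are few. Setting `reselect=True` removes that influence by repeating the selection step inside every fold.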

  • Publication date: 2016-1