摘要

Influence diagnostics aim to identify a small number of influential data points that have a disproportionate impact on the model parameters and/or predictions. The key issues with current influence diagnostic techniques are that the regression-theory approaches do not provide hydrologically relevant influence metrics, while the case-deletion approaches are computationally expensive to calculate. The main objective of this study is to introduce a new two-stage hybrid framework that overcomes these challenges, by delivering hydrologically relevant influence metrics in a computationally efficient manner. Stage one uses computationally efficient regression-theory influence diagnostics to identify the most influential points based on Cook's distance. Stage two then uses case-deletion influence diagnostics to quantify the influence of points using hydrologically relevant metrics. To illustrate the application of the hybrid framework, we conducted three experiments on 11 hydro-climatologically diverse Australian catchments using the GR4J hydrological model. The first experiment investigated how many data points from stage one need to be retained in order to reliably identify those points that have the hightest influence on hydrologically relevant metrics. We found that a choice of 30-50 is suitable for hydrological applications similar to those explored in this study (30 points identified the most influential data 98% of the time and reduced the required recalibrations by 99% for a 10 year calibration period). The second experiment found little evidence of a change in the magnitude of influence with increasing calibration period length from 1, 2, 5 to 10 years. Even for 10 years the impact of influential points can still be high (>30% influence on maximum predicted flows). The third experiment compared the standard least squares (SLS) objective function with the weighted least squares (WLS) objective function on a 10 year calibration period. In two out of three flow metrics there was evidence that SLS, with the assumption of homoscedastic residual error, identified data points with higher influence (largest changes of 40%, 10%, and 44% for the maximum, mean, and low flows, respectively) than WLS, with the assumption of heteroscedastic residual errors (largest changes of 26%, 6%, and 6% for the maximum, mean, and low flows, respectively). The hybrid framework complements existing model diagnostic tools and can be applied to a wide range of hydrological modelling scenarios.

  • 出版日期2018-6