Abstract

1. Correlated data are ubiquitous in ecological and evolutionary research, and appropriate statistical analysis requires that these correlations be taken into account. For regressions with correlated, non-normal outcomes, two main approaches are used: conditional and marginal modelling. The former leads to generalized linear mixed models (GLMMs), while the latter leads to models estimated using generalized estimating equations (GEEs) or to marginalized multilevel regression models. The differences, advantages and drawbacks of conditional and marginal models have been discussed extensively in the statistical and applied literature, and there is some agreement that the choice of model must depend on the question under study. Yet there still appears to be considerable confusion and disagreement over when to choose which model.

2. We start with a review of conditional and marginal models and of the differences in the interpretation of the resulting parameter estimates. We highlight that the two types of models imply different linear relations between the covariates and the response. Moreover, while conditional models explicitly account for heterogeneity among clustered observations, marginal models yield averages over such heterogeneities and are therefore often interpreted as population-averaged models.

3. We point out, theoretically and with an example, that when modelling non-normal outcomes no unambiguous definition of a marginal model generally exists. Instead, marginal model parameters are marginal only with respect to unaccounted differences among clusters and thus depend on the fixed effects in the model. Therefore, marginal model parameters should not be loosely interpreted as population-averaged parameters. In addition, we explain how marginal modelling is mathematically analogous to deliberately omitting covariates with explanatory power, and to deliberately introducing a Berkson measurement error into covariates. We also reiterate that marginal modelling is related to a well-known statistical phenomenon, Simpson's paradox.

4. In most cases, therefore, we regard the conditional model as the more powerful choice to explain how covariates are associated with a non-normal response. Still, marginal models can be useful, provided that the scientific question explicitly requires such a model formulation.
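The difference between conditional and marginal parameters can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a logistic model with a normally distributed random intercept, and the values of `beta0`, `beta1` and `sigma` as well as the helper `marginal_prob` are hypothetical choices made for illustration. The code averages the conditional success probability over the random intercept with Gauss-Hermite quadrature and compares the resulting marginal log odds ratio with the conditional slope.

```python
import numpy as np

def expit(z):
    """Inverse logit link."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    """Logit link."""
    return np.log(p / (1.0 - p))

def marginal_prob(x, beta0, beta1, sigma, n_nodes=40):
    """Average the conditional probability expit(beta0 + beta1*x + b)
    over a random intercept b ~ N(0, sigma^2) (assumed distribution),
    using Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * nodes  # change of variables for N(0, sigma^2)
    return np.sum(weights * expit(beta0 + beta1 * x + b)) / np.sqrt(np.pi)

# Hypothetical conditional (cluster-specific) parameters.
beta0, beta1, sigma = 0.0, 1.0, 2.0

# Marginal (averaged over clusters) probabilities at x = 0 and x = 1.
p0 = marginal_prob(0.0, beta0, beta1, sigma)
p1 = marginal_prob(1.0, beta0, beta1, sigma)

# Marginal log odds ratio for a one-unit change in x.
marginal_slope = logit(p1) - logit(p0)

# Without between-cluster heterogeneity the two quantities coincide.
no_heterogeneity_slope = (
    logit(marginal_prob(1.0, beta0, beta1, 0.0))
    - logit(marginal_prob(0.0, beta0, beta1, 0.0))
)

print(f"conditional slope: {beta1:.3f}")
print(f"marginal log odds ratio (sigma={sigma}): {marginal_slope:.3f}")
print(f"marginal log odds ratio (sigma=0): {no_heterogeneity_slope:.3f}")
```

Under these assumptions the marginal log odds ratio is smaller in magnitude than the conditional slope, and the gap shrinks to zero as `sigma` approaches zero. The size of the gap depends on how much between-cluster heterogeneity is left unexplained, which is consistent with the point that marginal parameters depend on what else is in the model.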

  • Publication date: 2016-12