Abstract

Grouping observations generated from multivariate time series is a common task in many applications, where the aim is to distinguish not only differences in the means of individual variables but also changes in their covariances and in the temporal dependence of observations. In this analysis, we compare ten model-based clustering methods in terms of their ability to identify such features under four scenarios in which data are simulated with varying levels of variable and temporal dependence. To consider these methods in a realistic environment, we focus our analysis on wind data, where observations are often strongly correlated in time, and the dependence among variables is known to vary across different regional weather patterns. In particular, we assess each method's performance when applied to wind data simulated under a realistic two-regime Markov-switching vector autoregressive (VAR) model with a diurnally varying mean. A Gaussian mixture model and a basic Markov-switching model outperform the other methods considered in terms of misclassification rates and number of clusters identified. These two methods and an additional Markov-switching VAR model are then applied to one year of averaged hourly wind data from twenty meteorological stations, and we find that the methods can identify very different features in the data. Supplementary materials accompanying this paper appear online.

  • Publication date: 2015-6