Fast classification of univariate and multivariate time series through shapelet discovery

Grabocka, Josif*; Wistuba, Martin; Schmidt-Thieme, Lars

doi:10.1007/s10115-015-0905-9

Abstract

Time-series classification is an important problem for the data mining community due to the wide range of application domains involving time-series data. A recent paradigm, called shapelets, represents patterns that are highly predictive of the target variable. Shapelets are discovered by measuring the prediction accuracy of a set of potential (shapelet) candidates. The candidates typically consist of all the segments of a dataset; therefore, the discovery of shapelets is computationally expensive. This paper proposes a novel method that avoids measuring the prediction accuracy of similar candidates in Euclidean distance space, through an online clustering/pruning technique. In addition, our algorithm incorporates a supervised shapelet selection that keeps only those candidates that improve classification accuracy. Empirical evidence on 45 univariate datasets from the UCR collection demonstrates that our method is 3-4 orders of magnitude faster than the fastest existing shapelet discovery method, while providing better prediction accuracy. In addition, we extended our method to multivariate time-series data. Runtime results over four real-life multivariate datasets indicate that our method can classify MB-scale data in a matter of seconds and GB-scale data in a matter of minutes. These speedups do not compromise quality; on the contrary, our method is even superior to the multivariate baseline in terms of classification accuracy.
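
The idea of skipping candidates that lie close to an already considered one in Euclidean distance space can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `znorm` and `prune_similar_candidates`, the fixed `threshold`, the z-normalization step, and the random visiting order are all assumptions made for illustration, and the supervised selection step is left out entirely.

```python
import numpy as np

def znorm(segment):
    """Z-normalize a segment; constant segments map to all zeros."""
    std = segment.std()
    if std == 0:
        return np.zeros_like(segment, dtype=float)
    return (segment - segment.mean()) / std

def prune_similar_candidates(series_list, length, threshold, rng):
    """Visit segments in random order and keep one only if it is farther
    than `threshold` (Euclidean) from every candidate kept so far.
    Hypothetical helper, not the authors' implementation."""
    # Enumerate every (series index, start offset) segment of the given length.
    positions = [(i, s) for i, ts in enumerate(series_list)
                 for s in range(len(ts) - length + 1)]
    kept = []
    for idx in rng.permutation(len(positions)):
        i, s = positions[idx]
        cand = znorm(np.asarray(series_list[i][s:s + length], dtype=float))
        # A candidate similar to any kept one is pruned and never evaluated.
        if all(np.linalg.norm(cand - k) > threshold for k in kept):
            kept.append(cand)
    return kept

# Toy usage: two identical sine series and one cosine series.
t = np.linspace(0, 2 * np.pi, 30)
toy_data = [np.sin(t), np.sin(t), np.cos(t)]
survivors = prune_similar_candidates(toy_data, length=10, threshold=1.0,
                                     rng=np.random.default_rng(0))
print(len(survivors), "candidates kept out of", 3 * (30 - 10 + 1))
```

In a sketch like this, only the surviving candidates would go on to the expensive accuracy evaluation, which is where the savings would come from. A larger `threshold` prunes more aggressively at the risk of discarding a useful pattern.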

Publication date: 2016-11
