A Sketch-based Clustering Algorithm for Uncertain Data Streams

Authors: Xian'gang Sheng; Ping Chen; Jingyu Chen
Source: Journal of Networks, 2013, 8(7): 1536-1542.
DOI: 10.4304/jnw.8.7.1536-1542

Abstract

Because measurements are inaccurate and noisy, uncertainty is inherent in time series streams and increases the complexity of stream clustering. Since the data arrive continuously and in massive volume, efficient storage is crucial for clustering uncertain data streams. An extended uncertain sketch built on a hash-compressed structure, together with an update strategy, is proposed to store uncertain data streams. Based on divergence and the sketch metric, a sketch-based similarity is defined to measure the distances between objects. Then, using core-sets and the max-min cluster distance measure, an algorithm for selecting initial cluster centers is proposed to improve the quality of clustering uncertain time series streams. Finally, experimental results illustrate the effectiveness of the proposed clustering algorithm.
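
The abstract does not give the details of the initial center selection, so the following is only a minimal sketch of the generic max-min (farthest-first) idea it refers to, not the authors' algorithm. It assumes plain numeric points and Euclidean distance in place of the paper's sketch-based similarity, omits the core-set step, and seeds with the first point; the function name `max_min_centers` is hypothetical.

```python
import math


def max_min_centers(points, k):
    """Pick k initial centers with the generic max-min (farthest-first) rule.

    Assumptions (not from the paper): points are numeric tuples, distance is
    Euclidean, and the first point is used as the seed center.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    centers = [points[0]]
    # nearest[i] holds the distance from points[i] to its closest chosen center.
    nearest = [math.dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # The next center is the point whose closest center is farthest away.
        idx = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[idx])
        # Refresh each point's distance to its closest center.
        nearest = [
            min(d, math.dist(p, points[idx])) for d, p in zip(nearest, points)
        ]
    return centers


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (10, 0), (10, 1), (0, 10)]
    print(max_min_centers(data, 3))
```

Spreading the initial centers apart in this way tends to avoid starting two centers inside the same dense region, which is the usual motivation for a max-min criterion.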