摘要

In recent years, data stream mining has become an important research topic. With the emergence of new applications, the data we process are not again static, but the continuous dynamic data stream. Examples include network traffic analysis, Web click stream mining, network intrusion detection, and on-line transaction analysis. In this paper, we propose a new framework for data stream mining, called the weighted sliding window model. The proposed model allows the user to specify the number of windows for mining, the size of a window, and the weight for each Thus users can specify a higher weight to a more significant data section, which will make the mining result closer to user's requirements. Based on the weighted sliding window model, we propose a single pass algorithm, called WSW, to efficiently discover all the frequent itemsets from data streams. By analyzing data characteristics, an improved algorithm, called WSW-Imp, is developed to further reduce the time of deciding whether a candidate itemset is frequent or not. Empirical results show that WSW-Imp outperforms WSW under the weighted sliding window model.

  • 出版日期2009-11