Fast mining frequent itemsets using Nodesets

作者:Deng Zhi Hong*; Lv Sheng Long
来源:Expert Systems with Applications, 2014, 41(10): 4505-4512.
DOI:10.1016/j.eswa.2014.01.025

摘要

Node-list and N-list, two novel data structure proposed in recent years, have been proven to be very efficient for mining frequent itemsets. The main problem of these structures is that they both need to encode each node of a PPC-tree with pre-order and post-order code. This causes that they are memory-consuming and inconvenient to mine frequent itemsets. In this paper, we propose Nodeset, a more efficient data structure, for mining frequent itemsets. Nodesets require only the pre-order (or post-order code) of each node, which makes it saves half of memory compared with N-lists and Node-lists. Based on Nodesets, we present an efficient algorithm called FIN to mining frequent itemsets. For evaluating the performance of FIN, we have conduct experiments to compare it with PrePost and FP-growth*, two state-of-the-art algorithms, on a variety of real and synthetic datasets. The experimental results show that FIN is high performance on both running time and memory usage.