摘要

A novel algorithm for efficiently mining high average-utility itemsets is presented in this paper. The algorithm utilizes list structures, which compactly capture all information needed to calculate the actual average-utilities of itemsets so as to mine all high average-utility itemsets without the generation of candidate itemsets. The algorithm thus does not require any scanning of a given transactional database after initial two scans for constructing list structures of itemsets with 1-lengths. The algorithm can generate all high average-utility itemsets through its depth-first search based mining process, which is conducted by recursively constructing list structures of itemsets with (k + 1)-lengths from list structures of itemsets with k-lengths. In order to avoid the expansion of unpromising itemsets that cannot be expanded to high average-utility itemsets, a novel pruning technique using tight upper-bounds of itemsets' average-utilities is designed and applied to the algorithm. Therefore, the runtime and memory efficiencies of the algorithm are able to be enhanced significantly because the search space of its mining process can be considerably reduced. Various experiments were performed by using four real datasets and two groups of synthetic datasets. Experimental results support that the proposed algorithm has runtime, memory, and scalability performances superior to those of existing algorithm.

  • 出版日期2017-3