An efficient algorithm for mining the top-k high utility itemsets, using novel threshold raising and pruning strategies

Quang-Huy Duong<sup>*</sup>; Liao, Bo; Fournier-Viger, Philippe; Thu-Lan Dam

doi:10.1016/j.knosys.2016.04.016

摘要

Top-k high utility itemset mining is the process of discovering the k itemsets having the highest utilities in a transactional database. In recent years, several algorithms have been proposed for this task. However, it remains very expensive both in terms of runtime and memory consumption. The reason is that current algorithms often generate a huge amount of candidate itemsets and are unable to prune the search space effectively. In this paper, we address this issue by proposing a novel algorithm named kHMC to discover the top-k high utility itemsets more efficiently. Unlike several algorithms for top-k high utility itemset mining, kHMC discovers high utility itemsets using a single phase. Furthermore, it employs three strategies named RIU, CUD, and COV to raise its internal minimum utility threshold effectively, and thus reduce the search space. The COV strategy introduces a novel concept of coverage. The concept of coverage can be employed to prune the search space in high utility itemset mining, or to raise the threshold in top-k high utility itemset mining, as proposed in this paper. Furthermore, kHMC relies on a novel co-occurrence pruning technique named EUCPT to avoid performing costly join operations for calculating the utilities of itemsets. Moreover, a novel pruning strategy named TEP is proposed for reducing the search space. To evaluate the performance of the proposed algorithm, extensive experiments have been conducted on six datasets having various characteristics. Results show that the proposed algorithm outperforms the state-of-the-art TKO and REPT algorithms for top-k high utility itemset mining both in terms of memory consumption and runtime.

出版日期2016-7-15
单位哈尔滨工业大学; 湖南大学

全文

访问全文

收藏分享被引(76) 浏览

更新时间：2024-04-14 11:43

An efficient algorithm for mining the top-k high utility itemsets, using novel threshold raising and pruning strategies

摘要

全文

产品服务

站内浏览

服务支持

联系方式

科研之友