A method for mining top-rank-k frequent closed itemsets

作者:Nguyen Loan T T*; Trinh Truc; Ngoc Thanh Nguyen; Vo Bay
来源:Journal of Intelligent and Fuzzy Systems, 2017, 32(2): 1297-1305.
DOI:10.3233/JIFS-169128

摘要

Mining frequent closed itemsets (FCIs) is important in mining non-redundant (minimal) association rules. Therefore, many algorithms have been developed for mining FCIs with reduced mining time and memory usage. For mining FCIs, algorithms use the minimum support threshold, minSup, to prune itemsets. However, using a fixed minSup is not suitable for mining top-rank-k FCIs. A large threshold will lead to a small number of generated FCIs, leading to insufficient FCIs to query when k is large. On the other hand, a small minSup will generate a huge number of generated FCIs, leading to large runtimes and high memory usage. In this paper, we propose a method for mining top-rank-k FCIs without using a fixed minimum support threshold. A strategy is first used to eliminate 1-items that cannot generate FCIs belonging to top-rank-k FCIs. Next, based on the set of candidate 1-items, we propose TRK-FCI, a DCI-Plus-based algorithm, for mining top-rank-k FCIs. In the process of mining top-rank-k FCIs, TRK-FCI automatically increases minSup according to the mined FCIs, efficiently pruning itemsets that cannot belong to top-rank-k FCIs. We also modify the dynamic bit vector (DBV) structure and apply it to reduce memory usage and runtime in the TRK-FCI-DBV algorithm. Experimental results show that TRK-FCI-DBV is more efficient than TRK-FCI for various databases.

  • 出版日期2017