An effective method for approximate representation of frequent itemsets

Huang, Jheng Nan; Hong, Tzung Pei*; Chiang, Ming Chao

doi:10.3233/IDA-150488

Abstract

In data mining, finding frequent itemsets is a critical step in discovering association rules. The number of frequent itemsets may, however, be huge if the minimum support threshold is set low or the transaction database to be mined contains many items. Several approaches have therefore been proposed to keep frequent itemsets in a compact representation. For example, the maximal itemsets approach keeps a border composed of the maximal itemsets, which separates frequent itemsets from non-frequent ones. It can recover all the frequent itemsets but not their actual frequencies. In contrast, the closed itemsets approach can correctly recover each frequent itemset and its frequency. Another approach, called reference itemsets, can recover each frequent itemset and approximately estimate its frequency. In this paper, we propose an efficient algorithm to recover each frequent itemset and its approximate frequency from the kept maximal itemsets, the frequent 1-itemsets, their supports, and some key information. The maximal frequent itemsets are used to recover all frequent itemsets, which are then organized into a simple flow network with levels. Next, the kept key information is used to derive approximate supports of the frequent itemsets in the flow network through the flow process. Finally, a series of experiments is conducted to show the compression effects of the proposed algorithm.
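
The recovery step described in the abstract relies on the fact that every non-empty subset of a maximal frequent itemset is itself frequent. The following is a minimal Python sketch of that step only, grouping the recovered itemsets by size as a stand-in for the levels of the flow network. The function name `recover_frequent_itemsets` and the sample data are hypothetical, and the paper's flow process for estimating supports is not reproduced here.

```python
from itertools import combinations


def recover_frequent_itemsets(maximal_itemsets):
    """Enumerate all non-empty subsets of the given maximal itemsets.

    Returns a dict mapping level k (itemset size) to the set of
    frozensets of that size. Hypothetical helper for illustration;
    it recovers itemsets only, not their supports.
    """
    levels = {}
    for maximal in maximal_itemsets:
        items = sorted(maximal)
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                # A set per level removes duplicates shared by
                # overlapping maximal itemsets.
                levels.setdefault(k, set()).add(frozenset(subset))
    return levels


# Hypothetical maximal frequent itemsets over items a, b, c, d.
maximal = [{"a", "b", "c"}, {"c", "d"}]
levels = recover_frequent_itemsets(maximal)
for k in sorted(levels):
    print(k, sorted("".join(sorted(s)) for s in levels[k]))
```

Because the two sample maximal itemsets share item `c`, the 1-itemset `{c}` is generated twice but stored once. Estimating the support of each recovered itemset would additionally need the kept 1-itemset supports and key information mentioned in the abstract.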

Publication date: 2017
Affiliation: Sun Yat-sen University (中山大学)


Last updated: 2021-01-21 17:43
