Itemset generalization with cardinality-based constraints

Cagliero Luca*; Garza Paolo

doi:10.1016/j.ins.2013.05.008

Abstract

Generalized itemset mining is an established data mining technique that focuses on discovering high-level correlations in large databases. By exploiting a taxonomy built over the data items, items are aggregated into higher-level concepts, so that data correlations at different abstraction levels can be discovered. However, since a large number of patterns can be extracted, the result of the mining process is often not easily manageable by domain experts. We propose a novel approach to discovering a compact subset of generalized itemsets from structured data. To guarantee model conciseness and readability, a set of itemsets that has a common generalization is generated only when its cardinality is so small that its manual inspection is practically feasible. Furthermore, generalizations are generated only when their knowledge is covered by a large number of low-level descendant itemsets, since only in these cases are they worth considering in place of their many low-level descendants. Experiments performed on synthetic, benchmark, and real data taken from a mobile application scenario demonstrate the effectiveness and efficiency of the proposed approach.

Publication date: 2013-09-20
