Abstract

Associative classification (AC) is a branch of data mining that uses association rules (ARs) for classification. ARs that satisfy statistical criteria, such as minimal support, are extracted from databases. However, in some practical applications, useful ARs may be found among infrequent but closely related itemsets that a high minimal support filters out. In this study, a new measure, named condenseness, is presented for evaluating whether infrequent ruleitems that are filtered out by minimal support can form strong ARs for classification. For an infrequent ruleitem, the condenseness is the average lift of all ARs that can be generated from the ruleitem. A ruleitem with high condenseness has closely related elements and can serve for AC even if it does not have high support. Based on the concept of condenseness, a new associative classifier is developed and presented: condensed association rules for classification (CARC). CARC generates ARs using a modified Apriori algorithm and develops new strategies for rule inference. With the condenseness measure and the rule inference strategies, more useful ARs can be produced, which improves the effectiveness of associative classification. Empirical evidence shows that CARC mitigates the problems caused by setting minimal support too high or too low and achieves better classification performance.
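The condenseness measure described above can be illustrated with a minimal Python sketch. It is not the implementation used in CARC: the abstract does not spell out which ARs "can be generated from the ruleitem", so the sketch assumes every split of the ruleitem's elements (items plus class label) into a non-empty antecedent and a non-empty consequent. The function names, the encoding of class labels as ordinary items, and the toy transactions are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every element of itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def lift(antecedent, consequent, transactions):
    """lift(A -> B) = supp(A and B together) / (supp(A) * supp(B))."""
    joint = support(antecedent | consequent, transactions)
    return joint / (support(antecedent, transactions)
                    * support(consequent, transactions))


def condenseness(ruleitem, transactions):
    """Average lift over all antecedent -> consequent splits of ruleitem.

    Assumption (not confirmed by the abstract): the rules generated from a
    ruleitem are all splits of its elements into two non-empty parts.
    """
    elements = frozenset(ruleitem)
    if len(elements) < 2:
        raise ValueError("a ruleitem needs at least two elements")
    if support(elements, transactions) == 0:
        return 0.0  # the ruleitem never occurs, so every lift is zero
    lifts = []
    for size in range(1, len(elements)):
        for antecedent in combinations(sorted(elements), size):
            antecedent = frozenset(antecedent)
            lifts.append(lift(antecedent, elements - antecedent, transactions))
    return sum(lifts) / len(lifts)


# Hypothetical toy data: the ruleitem {a, b, class=yes} is infrequent
# (support 0.2) but its elements almost always occur together.
transactions = [
    {"a", "b", "class=yes"},
    {"a", "b", "class=yes"},
    {"a", "class=no"},
    {"b", "class=no"},
    {"c", "class=no"},
    {"c", "class=no"},
    {"c", "class=no"},
    {"c", "class=no"},
    {"c", "class=no"},
    {"c", "class=no"},
]

score = condenseness({"a", "b", "class=yes"}, transactions)
print(f"condenseness = {score:.3f}")
```

A lift above 1 indicates that antecedent and consequent co-occur more often than they would if independent, so an average well above 1 marks a ruleitem whose elements are closely related even though a minimal support of, say, 0.3 would discard it.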

  • Publication date: 2015-6