Abstract

Prior analysis of the learning data can help uncover hidden relations among features. This knowledge can be used to select the most suitable learning methods and to further improve the performance of classification systems. For the well-known Naive Bayes classifier, several studies have attempted to restructure the set of attributes in order to remove or weaken dependence relations, which can reduce the accuracy of this classifier. These methods are known as semi-naive Bayes classifiers. In this work, we present a semi-naive Bayes classifier that searches for dependent attributes using a filter approach. To prevent the number of cases of the compound attributes from becoming excessively high, a grouping procedure is always applied after two variables are merged. This procedure attempts to group two or more cases of the new variable into a single one, reducing the cardinality of the compound variable. As a result, the proposed model is competitive with state-of-the-art semi-naive Bayes classifiers, particularly in the quality of its class probability estimates, while requiring much less memory.
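
The merge-then-group step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which grouping criterion is used, so everything below is an assumption made for illustration: the function names (`merge_attributes`, `class_distribution`, `group_cases`), the Laplace smoothing constant `alpha`, and the rule that two cases are grouped when their estimated class distributions differ by at most `threshold` are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def merge_attributes(xs, ys):
    """Join two attributes into one compound attribute whose cases are value pairs."""
    return list(zip(xs, ys))


def class_distribution(values, labels, classes, alpha=1.0):
    """Laplace-smoothed estimate of P(class | case) for every case of an attribute."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    dist = {}
    for v, cnt in counts.items():
        total = sum(cnt.values()) + alpha * len(classes)
        dist[v] = tuple((cnt[c] + alpha) / total for c in classes)
    return dist


def group_cases(values, labels, threshold=0.1):
    """Greedily map each case to a group id.

    ASSUMPTION (not from the paper): a case joins the first existing group
    whose representative class distribution differs from its own by at most
    `threshold` in every class; otherwise it starts a new group.
    """
    classes = sorted(set(labels))
    dist = class_distribution(values, labels, classes)
    groups = []   # list of (representative distribution, group id)
    mapping = {}  # case -> group id
    for v in sorted(dist):
        for rep, gid in groups:
            if max(abs(p - q) for p, q in zip(dist[v], rep)) <= threshold:
                mapping[v] = gid
                break
        else:
            gid = len(groups)
            groups.append((dist[v], gid))
            mapping[v] = gid
    return mapping


# Toy data: the class is the XOR of two binary attributes, so the attributes
# are only informative jointly.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
y = [0, 1, 1, 0, 0, 1, 1, 0]

compound = merge_attributes(a, b)     # 4 distinct cases
mapping = group_cases(compound, y)    # cases with similar class behaviour share an id
grouped = [mapping[v] for v in compound]
```

In this toy example the compound attribute has four cases, and the hypothetical grouping rule reduces them to two, which is the kind of cardinality reduction the abstract refers to. A Naive Bayes model built on `grouped` would then need a smaller conditional probability table than one built on `compound`.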

  • Publication date: 2011