A framework for bottom-up induction of oblique decision trees

Authors: Barros Rodrigo C*; Jaskowiak Pablo A; Cerri Ricardo; de Carvalho Andre C P L F
Source: Neurocomputing, 2014, 135: 3-12.
DOI: 10.1016/j.neucom.2013.01.067

Abstract

Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes, whereas univariate trees can only perform axis-parallel splits. The vast majority of oblique and univariate decision-tree induction algorithms employ a top-down strategy for growing the tree, relying on an impurity-based measure for splitting nodes. In this paper, we propose BUTIF, a novel Bottom-Up Oblique Decision-Tree Induction Framework. BUTIF does not rely on an impurity measure for dividing nodes, since the data resulting from each split is known a priori. For generating the initial leaves of the tree and the splitting hyperplanes in its internal nodes, BUTIF allows the adoption of distinct clustering algorithms and binary classifiers, respectively. It is also capable of performing embedded feature selection, which may reduce the number of features in each hyperplane, thus improving model comprehension. Unlike virtually every top-down decision-tree induction algorithm, BUTIF does not require a subsequent pruning procedure to avoid overfitting, because its bottom-up nature does not overgrow the tree. We compare distinct instances of BUTIF to traditional univariate and oblique decision-tree induction algorithms. Empirical results show the effectiveness of the proposed framework.
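
The bottom-up idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the clustering algorithm, the merge order, or the binary classifier. The sketch therefore makes three simplifying assumptions: one centroid per class stands in for the clustering step that creates the initial leaves, the two closest nodes are merged at each step, and the perpendicular bisector of the two child centroids stands in for a trained binary classifier. All names (`Node`, `make_leaves`, `merge`, `build_tree`, `predict`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    centroid: Point                  # mean of the training points under this node
    size: int                        # number of training points under this node
    label: Optional[int] = None      # class label (leaves only)
    w: Optional[Point] = None        # hyperplane normal (internal nodes only)
    b: float = 0.0                   # hyperplane bias (internal nodes only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def mean(points: Sequence[Point]) -> Point:
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sqdist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_leaves(X: Sequence[Point], y: Sequence[int]) -> List[Node]:
    # Assumption: one leaf per class, in place of a real clustering algorithm.
    leaves = []
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        leaves.append(Node(centroid=mean(pts), size=len(pts), label=label))
    return leaves


def merge(a: Node, b: Node) -> Node:
    # Assumption: the split is the perpendicular bisector of the two child
    # centroids, in place of a trained binary classifier. The data on each
    # side of the split is known in advance: a's points versus b's points.
    w = tuple(q - p for p, q in zip(a.centroid, b.centroid))
    mid = tuple((p + q) / 2 for p, q in zip(a.centroid, b.centroid))
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    n = a.size + b.size
    centroid = tuple((a.size * p + b.size * q) / n
                     for p, q in zip(a.centroid, b.centroid))
    return Node(centroid=centroid, size=n, w=w, b=bias, left=a, right=b)


def build_tree(X: Sequence[Point], y: Sequence[int]) -> Node:
    nodes = make_leaves(X, y)
    # Grow upward: repeatedly merge the two closest nodes until one root remains.
    while len(nodes) > 1:
        pairs = ((i, j) for i in range(len(nodes))
                 for j in range(i + 1, len(nodes)))
        i, j = min(pairs, key=lambda p: sqdist(nodes[p[0]].centroid,
                                               nodes[p[1]].centroid))
        merged = merge(nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def predict(node: Node, x: Point) -> int:
    # Each internal node applies a multivariate (oblique) test w . x + b > 0.
    while node.label is None:
        score = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        node = node.right if score > 0 else node.left
    return node.label


# Toy data: three well-separated classes in two dimensions.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
     (5.0, 5.0), (6.0, 5.0), (5.0, 6.0),
     (12.0, 0.0), (13.0, 0.0), (12.0, 1.0)]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
tree = build_tree(X, y)
```

In this sketch the tree has exactly as many leaves as initial clusters, so no pruning step follows, and no impurity measure is evaluated because each split separates two already-known groups. Both hyperplanes learned on the toy data have two non-zero coefficients, so neither split is axis-parallel.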

  • Publication date: 2014-7-5