A hybrid decision tree training method using data streams

Wozniak Michal*

doi:10.1007/s10115-010-0345-5

Abstract

Classical classification methods usually assume that the pattern recognition model does not change over time. This assumption does not hold when new data arrive frequently. Such situations are common in practice, for example in spam filtering or fraud detection, where the dependencies between feature values and class labels change continually. Unfortunately, most classical machine learning methods (such as decision trees) do not account for the possibility that the model changes as a result of so-called concept drift, and so they cannot adapt to a new classification model. This paper focuses on the problem of concept drift, which is particularly important for data mining methods that make decisions using complex structures such as decision trees. We propose an algorithm that co-trains decision trees using a modified NGE (Nested Generalized Exemplar) algorithm. The adaptation potential of the proposed algorithm and the quality of that adaptation are evaluated in computer experiments on benchmark datasets from the UCI Machine Learning Repository.
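To make the notion of concept drift concrete, the following is a minimal Python sketch under stated assumptions. It is not the hybrid NGE-based method described in the abstract: it only contrasts a classifier trained once with one retrained on a sliding window of recent samples. The synthetic stream, the one-feature threshold rule (a depth-1 decision tree), and the names `make_stream`, `fit_stump`, and `run` are all hypothetical and introduced purely for illustration.

```python
from collections import deque


def make_stream(n=400, drift_at=200):
    """Yield a deterministic synthetic stream of (x, y) pairs.

    Assumption for illustration: the class rule is inverted at `drift_at`,
    which simulates an abrupt concept drift.
    """
    for i in range(n):
        x = (i * 7 % 10) / 10.0          # cycles through 0.0, 0.1, ..., 0.9
        y = int(x > 0.45)                # concept before the drift
        if i >= drift_at:
            y = 1 - y                    # concept after the drift
        yield x, y


def fit_stump(samples):
    """Fit a one-feature threshold rule (depth-1 decision tree) to (x, y) pairs."""
    xs = sorted({x for x, _ in samples})
    # Candidate thresholds: one below the minimum, plus midpoints between values.
    thresholds = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best = None
    for t in thresholds:
        for label_above in (0, 1):
            errors = sum(
                (label_above if x > t else 1 - label_above) != y
                for x, y in samples
            )
            if best is None or errors < best[0]:
                best = (errors, t, label_above)
    _, t, label_above = best
    return lambda x: label_above if x > t else 1 - label_above


def run(window_size=50, warmup=50, drift_at=200):
    """Return post-drift accuracy of a static model and a windowed model."""
    stream = list(make_stream(drift_at=drift_at))
    static_model = fit_stump(stream[:warmup])        # trained once, never updated
    window = deque(stream[:warmup], maxlen=window_size)
    adaptive_model = static_model
    static_hits = adaptive_hits = evaluated = 0
    for i in range(warmup, len(stream)):
        x, y = stream[i]
        if i >= drift_at:                            # score only after the drift
            evaluated += 1
            static_hits += static_model(x) == y
            adaptive_hits += adaptive_model(x) == y
        window.append((x, y))                        # oldest sample drops out
        adaptive_model = fit_stump(window)           # retrain on recent data only
    return static_hits / evaluated, adaptive_hits / evaluated


if __name__ == "__main__":
    static_acc, adaptive_acc = run()
    print(f"static model, post-drift accuracy:   {static_acc:.2f}")
    print(f"windowed model, post-drift accuracy: {adaptive_acc:.2f}")
```

In this synthetic setup the static rule keeps applying the old concept after the drift, while the windowed rule recovers once its window is filled with post-drift samples. Windowed retraining is only one simple way to adapt; it stands in here for the adaptation mechanism and makes no claim about how the proposed algorithm works.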

Publication date: 2011-11


