Application of boosting to classification problems in chemometrics

Zhang MH; Xu QS; Daeyaert F; Lewi PJ; Massart DL<sup>*</sup>

doi:10.1016/j.aca.2005.01.075

Abstract

The application of boosting to both two-class and multi-class classification problems is studied. Five real chemical data sets are used. Each data set is randomly divided into two subsets, one for training and the other for prediction. For two-class classification, each data set is separated into a high response level class and a low response level class according to a threshold value. Three data sets (the wheat, cream and HIV data) show that boosting with classification and regression trees (CART) as the base learner may decrease the misclassification rate in prediction relative to a single CART. However, boosting on the green tea data indicates that overfitting may occur when boosting is applied. For the chromatographic retention data, boosting performs worse than a single CART. The cream data and the HIV data are also used for multi-class classification. Both data sets demonstrate that boosting performs better than CART in multi-class classification. Variable importance analysis suggests that the improvement made by boosting may be due to the use of more variables, which provide more information on special types of samples in the training data.
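The two-class procedure described above (threshold the response, split at random, compare a single tree with a boosted ensemble) can be illustrated with a small sketch. It rests on several assumptions that are not taken from the paper: the abstract does not name the boosting variant, so discrete AdaBoost is used here; depth-one trees (decision stumps) stand in for full CART; the data are synthetic rather than any of the five chemical data sets; and all function and variable names are hypothetical.

```python
import math
import random

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best depth-one tree."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] > t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] > t else -sign

def adaboost(X, y, rounds):
    """Discrete AdaBoost (assumed variant) with stumps as the base learner."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

def error_rate(ensemble, X, y):
    return sum(predict(ensemble, r) != yi for r, yi in zip(X, y)) / len(y)

# Synthetic stand-in data: a continuous response cut at its median
# into a high (+1) and a low (-1) response class.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
response = [a + b + rng.gauss(0, 0.1) for a, b in X]
threshold = sorted(response)[len(response) // 2]
y = [1 if r > threshold else -1 for r in response]

# Random division into a training subset and a prediction subset.
idx = list(range(len(X)))
rng.shuffle(idx)
train, test = idx[:140], idx[140:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_test, y_test = [X[i] for i in test], [y[i] for i in test]

single = adaboost(X_train, y_train, rounds=1)    # one stump, no boosting
boosted = adaboost(X_train, y_train, rounds=30)  # boosted ensemble

print("single stump, prediction error:", error_rate(single, X_test, y_test))
print("boosted,      prediction error:", error_rate(boosted, X_test, y_test))
```

Comparing the two printed misclassification rates on the held-out subset mirrors the comparison reported in the abstract. As the green tea and chromatographic retention results indicate, the boosted ensemble is not guaranteed to come out ahead on every data set.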

Publication date: 2005-07-15
Affiliation: Central South University

