Abstract

Ensemble learning has proved effective at stabilizing and improving traditional variable selection methods such as the lasso and genetic algorithms. In this paper, a novel bagging ensemble method called BSSW is developed for variable ranking and selection in linear regression models. Its main idea is to run a stepwise search algorithm on multiple bootstrap samples. In each trial, every variable is assigned a mixed importance measure that reflects both the order in which it enters the final model and the improvement in model fit resulting from its inclusion. The importance measures are averaged across the bootstrap trials, and the candidate variables are then ranked by this average and classified as important or unimportant. To broaden its scope of application, BSSW is also extended to generalized linear models. Experiments on simulated and real data indicate that BSSW outperforms several existing methods in most of the cases studied.
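
The procedure outlined above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the mixed importance measure, the stepwise stopping rule, or the threshold for declaring a variable important, so the choices below are assumptions made only for illustration. Specifically, the sketch uses forward-only stepwise search stopped by BIC, and scores a selected variable by the product of an order weight, (p - order) / p, and the relative reduction in residual sum of squares at the step where it enters. The function names `rss`, `forward_stepwise`, and `bssw_importance` are hypothetical.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return max(float(r @ r), 1e-12)  # guard against log(0) for a perfect fit


def forward_stepwise(X, y):
    """Greedy forward selection; stops when BIC stops improving (assumed criterion)."""
    n, p = X.shape
    selected, gains = [], []
    current = rss(X, y, [])
    bic = n * np.log(current / n) + np.log(n)  # intercept-only model
    while len(selected) < p:
        candidates = [j for j in range(p) if j not in selected]
        scores = {j: rss(X, y, selected + [j]) for j in candidates}
        best = min(scores, key=scores.get)
        new_bic = n * np.log(scores[best] / n) + (len(selected) + 2) * np.log(n)
        if new_bic >= bic:
            break
        gains.append((current - scores[best]) / current)  # relative fit improvement
        selected.append(best)
        current, bic = scores[best], new_bic
    return selected, gains


def bssw_importance(X, y, n_boot=50, seed=0):
    """Average an assumed order-times-gain importance over bootstrap trials."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    total = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap sample, drawn with replacement
        selected, gains = forward_stepwise(X[idx], y[idx])
        for order, (j, gain) in enumerate(zip(selected, gains)):
            # Assumed mixed measure: earlier entry and larger gain both raise the score.
            total[j] += (p - order) / p * gain
    return total / n_boot  # variables never selected keep importance 0


# Synthetic example: only columns 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(size=200)
importance = bssw_importance(X, y)
ranking = np.argsort(-importance)  # most important variable first
```

In this synthetic example, `ranking` orders the eight candidate variables by averaged importance. Deciding which of them count as important would still require a cutoff rule, which the abstract does not specify and the sketch leaves out.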

Full text