Abstract

This paper develops a resampling ensemble algorithm for classification problems on imbalanced datasets. The method oversamples the small classes and undersamples the large classes, with the resampling scale determined by the ratio of the smallest class size to the largest class size. Multiple machine learning methods are then selected to construct the ensemble. Numerical results show that the performance of the algorithm depends strongly on the ratio of the number of minority-class samples to the number of attributes: when this ratio is less than 3, performance degrades substantially. The experiments also show that an ensemble of different types of methods improves performance effectively.
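
As an illustration of the resampling idea only, the following minimal Python sketch brings every class to a common target size by random oversampling (duplication with replacement) or random undersampling. The function names `resample_classes` and `minority_to_attribute_ratio` are hypothetical, and the default target (the geometric mean of the smallest and largest class sizes) is an assumption made for this example; the abstract does not state the exact rule that maps the class-size ratio to the resampling scale.

```python
import numpy as np


def resample_classes(X, y, target=None, seed=0):
    """Resample every class to `target` samples (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    if target is None:
        # Assumed rule, not taken from the paper: geometric mean of the
        # smallest and largest class sizes.
        target = int(round(np.sqrt(counts.min() * counts.max())))
    picked = []
    for cls, n in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Small classes are oversampled with replacement; large classes
        # are undersampled without replacement.
        picked.append(rng.choice(idx, size=target, replace=bool(n < target)))
    picked = np.concatenate(picked)
    return X[picked], y[picked]


def minority_to_attribute_ratio(X, y):
    """Smallest class size divided by the number of attributes (columns)."""
    _, counts = np.unique(y, return_counts=True)
    return counts.min() / X.shape[1]


if __name__ == "__main__":
    # Toy data: 100 majority samples, 10 minority samples, 2 attributes.
    X = np.arange(220).reshape(110, 2)
    y = np.array([0] * 100 + [1] * 10)
    X_res, y_res = resample_classes(X, y)
    print(np.unique(y_res, return_counts=True))
    print(minority_to_attribute_ratio(X, y))
```

The second helper computes the minority-to-attribute ratio mentioned above, which could be checked against the threshold of 3 before training. In an ensemble setting, one might call `resample_classes` with a different `seed` for each base learner so that every learner sees a different balanced sample.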