摘要

In medical diagnosis, e.g. bowel cancer detection, a large number of examples of normal cases exists with a much smaller number of positive cases. Such data imbalance usually complicates the learning process, especially for the classes with fewer representative examples, and results in miss detection. In this article, we introduce a regularized ensemble framework of deep learning to address the imbalanced, multi-class learning problems. Our method employs regularization that accommodates multi-class data sets and automatically determines the error bound. The regularization penalizes the classifier when it misclassifies examples that were correctly classified in the previous learning phase. Experiments are conducted using capsule endoscopy videos of bowel cancer symptoms and synthetic data sets with moderate to high imbalance ratios. The results demonstrate the superior performance of our method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity gain of the minority classes is accompanied by the improvement of the overall accuracy for all classes. With regularization, a diverse group of classifiers is created and the maximum accuracy improvement is at 24.7%. The reduction in computational cost is also noticeable and as the volume of training data increase, the gain of efficiency by our method becomes more significant.