A new nested ensemble technique for automated diagnosis of breast cancer

Authors: Abdar, Moloud; Zomorodi-Moghadam, Mariam; Zhou, Xujuan*; Gururajan, Raj; Tao, Xiaohui; Barua, Prabal D.; Gururajan, Rashmi
Source: Pattern Recognition Letters, 2020, 132: 123-131.
DOI:10.1016/j.patrec.2018.11.004

Abstract

Breast cancer is reported to be one of the most common cancers among women. Early detection is essential for informing subsequent treatment. This study investigates automated breast cancer prediction using machine learning and data mining techniques. We propose a nested ensemble approach that uses Stacking and Vote (Voting) as the classifier combination techniques for distinguishing benign breast tumors from malignant ones. Each nested ensemble classifier contains "Classifiers" and "MetaClassifiers". MetaClassifiers can have more than two different classification algorithms. In this research, we developed two-layer nested ensemble classifiers in which the MetaClassifiers have two or three different classification algorithms. We conducted the experiments on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and used K-fold cross-validation for model evaluation. We compared the proposed two-layer nested ensemble classifiers with single classifiers (i.e., BayesNet and Naive Bayes) in terms of classification accuracy, precision, recall, F1 measure, ROC, and the computational time needed to train the single and nested ensemble classifiers. We also compared our best model with previous works reported in the literature in terms of accuracy. The results demonstrate that the proposed two-layer nested ensemble models outperform the single classifiers and most of the previous works. Both SV-BayesNet-3-MetaClassifier and SV-Naive Bayes-3-MetaClassifier achieved an accuracy of 98.07% (K = 10). However, SV-Naive Bayes-3-MetaClassifier is more efficient, as it needs less time to build the model.
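
To make the two-layer idea concrete, the following is a minimal sketch of one plausible reading of the nested structure, written with scikit-learn rather than the authors' own tooling. The abstract does not specify how the layers are wired, so everything here is an assumption: the outer layer is a Stacking ensemble, its final estimator is a Vote over three "MetaClassifiers", and the specific algorithms (GaussianNB, LogisticRegression, DecisionTreeClassifier) are illustrative stand-ins. BayesNet has no scikit-learn equivalent and is not reproduced, and this sketch is not expected to match the reported 98.07%.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# scikit-learn ships a copy of the WDBC dataset (569 samples, 30 features).
X, y = load_breast_cancer(return_X_y=True)

# Inner layer (assumed): a Vote over three "MetaClassifiers".
# The three algorithms are illustrative choices, not the paper's.
meta_vote = VotingClassifier(
    estimators=[
        ("nb", GaussianNB()),
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ],
    voting="soft",  # average predicted class probabilities
)

# Outer layer (assumed): Stacking whose final estimator is the Vote above.
# Base "Classifiers" feed out-of-fold probabilities to the meta layer.
nested = StackingClassifier(
    estimators=[
        ("base_nb", GaussianNB()),
        ("base_lr", make_pipeline(StandardScaler(),
                                  LogisticRegression(max_iter=1000))),
    ],
    final_estimator=meta_vote,
    cv=5,
)

# K-fold cross-validation with K = 10, as in the abstract.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(nested, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"10-fold mean accuracy: {mean_accuracy:.4f}")
```

Swapping `scoring="accuracy"` for `"precision"`, `"recall"`, `"f1"`, or `"roc_auc"` would give the other metrics the abstract lists; timing the `fit` call separately would approximate the model-building time comparison.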

  • Publication date: 2020-4