Abstract

Discovering biomarkers from high-dimensional data is a challenging task in cancer diagnosis. On the one hand, biomarker discovery is a high-dimensional, small-sample problem: the number of features far exceeds the number of samples. On the other hand, the data are redundant and noisy. In recent years, biomarker discovery from high-throughput biological data has become an increasingly important topic in bioinformatics. In this study, we propose a binary differential evolution algorithm for feature selection. First, we adopt a two-stage approach in which three filter methods, the Fisher score, T-statistics, and information gain, are used to generate the feature pool that serves as input to differential evolution (DE). Second, to improve the performance of DE for feature selection, we propose a new binary DE variant called BDE. Three optimization strategies are incorporated into BDE: a heuristic method in the initialization stage, self-adaptive parameter control, and a minimum change value that improves exploration and thus enhances diversity. Finally, a support vector machine (SVM) is used as the classifier, evaluated with 10-fold cross-validation. Experimental results on several benchmark datasets demonstrate the effectiveness of the proposed algorithm. In addition, the BDE developed in this study has considerable potential for feature selection problems.
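
To make the two-stage idea concrete, the following is a minimal sketch, not the BDE proposed in this study. It shows one filter criterion (the Fisher score) and a generic binary DE loop over a feature mask. The function names, the bit-flip mutation rule, the fixed values of `F` and `CR`, and the caller-supplied `fitness` function are all illustrative assumptions; the heuristic initialization, self-adaptive parameter control, and minimum change value described above are not reproduced here.

```python
import random


def fisher_score(values, labels):
    """Fisher score of a single feature: between-class scatter of the
    class means divided by the pooled within-class variance."""
    overall_mean = sum(values) / len(values)
    between, within = 0.0, 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        mean_c = sum(group) / len(group)
        var_c = sum((v - mean_c) ** 2 for v in group) / len(group)
        between += len(group) * (mean_c - overall_mean) ** 2
        within += len(group) * var_c
    if within == 0.0:
        # Zero within-class variance: perfectly separated or constant feature.
        return float("inf") if between > 0.0 else 0.0
    return between / within


def binary_de(fitness, n_features, pop_size=20, generations=50,
              F=0.5, CR=0.9, seed=0):
    """Generic binary DE over 0/1 feature masks (hypothetical sketch).

    fitness: callable taking a list of 0/1 flags, returning a score to
             maximize (in the paper's setting this would be a
             cross-validated SVM accuracy; here it is left to the caller).
    Requires pop_size >= 4 so three distinct donors can be drawn.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # guarantees one crossed bit
            trial = list(pop[i])
            for j in range(n_features):
                if rng.random() < CR or j == j_rand:
                    # Assumed binary mutation: take the base bit from a and
                    # flip it with probability F where b and c disagree.
                    bit = pop[a][j]
                    if pop[b][j] != pop[c][j] and rng.random() < F:
                        bit = 1 - bit
                    trial[j] = bit
            trial_score = fitness(trial)
            # Greedy selection: keep the trial if it is at least as good.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=lambda k: scores[k])
    return pop[best], scores[best]
```

In use, each feature in the pool would be ranked with a filter such as `fisher_score`, and `binary_de` would then search over masks of the retained features, with `fitness` wrapping whatever classifier and validation scheme is chosen.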