摘要

In this study, a hybrid genetic algorithm is adopted to find a subset of features that are most relevant to the classification task. Two stages of optimization are involved. The outer optimization stage completes the global search for the best subset of features in a wrapper way, in which the mutual information between the predictive labels of a trained classifier and the true classes serves as the fitness function for the genetic algorithm. The inner optimization performs the local search in a filter manner, in which an improved estimation of the conditional mutual information acts as an independent measure for feature ranking taking account of not only the relevance of the candidate feature to the output classes but also the redundancy to the already-selected features. The inner and outer optimizations cooperate with each other and achieve the high global predictive accuracy as well as the high local search efficiency. Experimental results demonstrate both parsimonious feature selection and excellent classification accuracy of the method on a range of benchmark data sets.