Abstract

Feature selection plays an important role in data mining, and feature selection for continuous attributes has attracted considerable attention in recent years. Building on existing research, this paper first introduces the concept of the entropy breakpoint into the discretization of continuous attributes. Second, because information gain is biased toward attributes with many values, the paper replaces information gain with standardized gain as the feature selection measure and proposes an information-entropy-based feature selection algorithm for continuous attributes. Experimental results show that the algorithm achieves better results on high-dimensional data sets. When a C4.5 classifier is applied to the selected features, classification accuracy is clearly improved.
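
The two ideas summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that "standardized gain" means the gain ratio used by C4.5 (information gain divided by split information) and that an "entropy breakpoint" is a cut point on the sorted values of a continuous attribute chosen by an entropy criterion. The function names `entropy` and `best_breakpoint` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_breakpoint(values, labels):
    """Return (cut, gain, gain_ratio) for the best binary split of one
    continuous attribute, or None if all values are identical.

    Assumption: candidate cuts are midpoints between adjacent distinct
    sorted values, and the best cut maximizes the gain ratio.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    sorted_labels = [y for _, y in pairs]
    base = entropy(sorted_labels)
    best = None
    for i in range(1, n):
        v0, v1 = pairs[i - 1][0], pairs[i][0]
        if v0 == v1:
            continue  # no cut can separate equal values
        p = i / n  # fraction of samples on the left side of the cut
        left, right = sorted_labels[:i], sorted_labels[i:]
        gain = base - p * entropy(left) - (1 - p) * entropy(right)
        # Split information penalizes unbalanced partitions.
        split_info = -(p * log2(p) + (1 - p) * log2(1 - p))
        ratio = gain / split_info
        if best is None or ratio > best[2]:
            best = ((v0 + v1) / 2, gain, ratio)
    return best


if __name__ == "__main__":
    print(best_breakpoint([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]))
```

Under these assumptions, attributes could then be ranked by the gain ratio of their best cut, and the top-ranked ones passed to a classifier such as C4.5.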

Full text