Abstract

Feature selection has hit a bottleneck when processing high-dimensional, large-scale data. Over the past decade, research on feature selection has therefore moved beyond traditional algorithms and ideas and has shown a new trend of combining many new mathematical tools. This trend opens new room for applying feature selection in pattern recognition and supports further development in knowledge discovery and data mining. Granular computing, a new approach to intelligent information processing, has begun to take shape and show results, which creates the conditions for applying feature selection to such data. This paper describes a new feature selection algorithm based on granular computing, with rough set approximation as its background. The algorithm generates granules, uses a tolerance function, and distinguishes noisy data from inconsistent data in order to perform feature selection on the information table, and it is effective for large-scale data sets.

Full Text