Abstract

Feature reduction is the problem of removing input features that are less predictive of a given outcome. It arises in many areas, such as pattern recognition, machine learning, and data mining, and it has been applied successfully to tasks involving datasets with very large numbers of features. Rough set theory has been used with much success as a preprocessor for such datasets, but current methods are inadequate for numerical feature reduction. Because the classical rough set model can only evaluate categorical features, we introduce a neighborhood rough set model that handles numerical datasets by defining a neighborhood relation. However, this model alone does not reliably find optimal feature subsets. In this paper, we propose a new feature reduction mechanism based on the fish swarm algorithm (FSA) to address this shortcoming. The method is applied to the problem of finding optimal feature subsets in the neighborhood rough set reduction process. We define three foraging behaviors of fish to search for optimal subsets and a fitness function to evaluate candidate solutions. We construct a neighborhood feature reduction algorithm based on FSA and design experiments that compare it with a heuristic neighborhood feature reduction method. Experimental results show that the FSA-based neighborhood reduction method is suitable for numerical data and is more likely to find an optimal reduct.
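To make the neighborhood relation concrete, the following minimal sketch shows one common way to evaluate a numerical feature subset under a neighborhood rough set model. It assumes a Euclidean distance, a fixed radius `delta`, and the usual dependency measure (the fraction of samples whose entire neighborhood shares their decision label). The function names, the toy data, and the value of `delta` are illustrative assumptions, not the definitions used in this paper.

```python
import math

def neighborhood(data, i, features, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the selected feature indices (assumed metric)."""
    xi = data[i]
    return [
        j for j, xj in enumerate(data)
        if math.sqrt(sum((xi[f] - xj[f]) ** 2 for f in features)) <= delta
    ]

def dependency(data, labels, features, delta):
    """Fraction of samples whose whole neighborhood carries the same label,
    i.e. the size of the positive region divided by the number of samples."""
    consistent = sum(
        1 for i in range(len(data))
        if all(labels[j] == labels[i]
               for j in neighborhood(data, i, features, delta))
    )
    return consistent / len(data)

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
data = [
    [0.10, 0.50],
    [0.15, 0.52],
    [0.80, 0.49],
    [0.85, 0.51],
]
labels = [0, 0, 1, 1]

gamma_f0 = dependency(data, labels, [0], delta=0.1)  # evaluates to 1.0
gamma_f1 = dependency(data, labels, [1], delta=0.1)  # evaluates to 0.0
```

In this toy case feature 0 alone yields full dependency, so a subset search could discard feature 1. A swarm-based search such as the one proposed here would score candidate subsets with a fitness function built on a measure of this kind.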