Abstract

Feature selection is a pre-processing step in data mining and machine learning, and it is especially important when analyzing high-dimensional data. Attribute clustering has been proposed as one approach to feature selection: if similar attributes are clustered into groups, an attribute whose values are missing can easily be replaced by another attribute from the same group. Hong et al. proposed a genetic algorithm (GA) to find appropriate attribute clusters. However, in their approach, multiple chromosomes can represent the same attribute clustering result (feasible solution) because of the combinatorial property of the representation, so the search space is larger than necessary. This study improves the performance of the GA-based attribute clustering process by using the grouping genetic algorithm (GGA). In the proposed approach, the general GGA representation and operators are used to reduce redundancy in the chromosome representation for attribute clustering. Experiments are conducted to compare the efficiency of the proposed approach with that of an existing approach. The results indicate that the proposed approach can derive attribute grouping results effectively.
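
The redundancy described above can be illustrated with a small sketch. It assumes a simple label-per-attribute chromosome (gene i holds the cluster label of attribute i), which is an illustrative assumption and not necessarily the encoding used by Hong et al. or by the proposed approach. The helper names `to_partition` and `count_redundancy` are hypothetical.

```python
from itertools import product


def to_partition(labels):
    """Map a label-per-attribute chromosome to the set of groups it describes.

    Assumed encoding (illustrative only): labels[i] is the cluster label
    of attribute i. The returned value ignores the label names, so it is
    the same for any relabeling of the clusters.
    """
    groups = {}
    for attr, label in enumerate(labels):
        groups.setdefault(label, []).append(attr)
    return frozenset(frozenset(g) for g in groups.values())


def count_redundancy(n_attrs, k):
    """Count chromosomes that use all k labels and the distinct clusterings they encode."""
    encodings = [
        c for c in product(range(k), repeat=n_attrs) if len(set(c)) == k
    ]
    partitions = {to_partition(c) for c in encodings}
    return len(encodings), len(partitions)


# Two different chromosomes that describe the same clustering
# {attributes 0 and 1}, {attribute 2}, {attribute 3}.
a = (0, 0, 1, 2)
b = (1, 1, 2, 0)
same = to_partition(a) == to_partition(b)

# For 4 attributes and 3 clusters, compare the two counts.
n_enc, n_part = count_redundancy(4, 3)
print(same, n_enc, n_part)  # True 36 6
```

Under this assumed encoding, each clustering into k groups is represented by k! chromosomes that differ only in their labels. A group-oriented representation such as the one used by GGA aims to remove this kind of duplication from the search space.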