Abstract

Hybrid genetic algorithm (GA) and artificial neural network (ANN) systems are not new in the machine learning literature. Such hybrid systems have proven very successful in classification and prediction problems. However, little attention has been paid to this architecture as a feature selection method, or to the influence of the ANN activation function and the number of GA fitness evaluations on feature selection performance. The activation function is a core component of the ANN architecture and influences the learning and generalization capability of the network. Meanwhile, the GA searches for an optimal ANN classifier given a set of chromosomes selected from those available. The objective of the GA is to combine the search for optimum chromosome choices with the search for an optimum classifier for each choice. The process operates as a form of co-evolution, with the eventual objective of finding an optimum chromosome selection rather than an optimum classifier. The selection of an optimum chromosome set is referred to in this paper as feature selection. Quantitative comparisons of four of the most commonly used ANN activation functions across ten GA evaluation budgets and three population sizes are presented. These studies employ four data sets of high dimensionality with few significant data instances; that is, each datum has a high attribute count and unusual or abnormal data are sparse within the set. Results suggest that the hyperbolic tangent (tanh) activation function outperforms other common activation functions by extracting a smaller but more significant feature set. Furthermore, fitness evaluation budgets of 20,000 to 40,000, combined with population sizes of 200 to 300, were found to deliver optimum feature selection capability, where optimum again means a smaller but more significant feature set.
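The GA-wrapper feature selection loop described above can be sketched as follows. This is a minimal illustration, assuming a binary chromosome that masks features, a small single-hidden-layer ANN with tanh activation serving as the fitness classifier, and a steady-state GA with tournament selection; every function name, size penalty, and hyperparameter below is illustrative rather than the authors' implementation.

```python
# Minimal sketch of GA-wrapper feature selection with a tanh ANN fitness
# classifier. Names and hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def train_ann(X, y, hidden=8, epochs=200, lr=0.1):
    """Train a tiny one-hidden-layer network (tanh hidden units, sigmoid
    output) by batch gradient descent; return its training accuracy."""
    n, d = X.shape
    W1 = rng.normal(0, 0.5, (d, hidden))
    W2 = rng.normal(0, 0.5, (hidden, 1))
    for _ in range(epochs):
        H = np.tanh(X @ W1)                  # tanh activation under study
        p = 1 / (1 + np.exp(-(H @ W2)))      # sigmoid output unit
        err = p - y.reshape(-1, 1)           # delta for sigmoid + cross-entropy
        W2 -= lr * H.T @ err / n
        W1 -= lr * X.T @ ((err @ W2.T) * (1 - H ** 2)) / n
    return np.mean((p.ravel() > 0.5) == y)

def fitness(chromosome, X, y):
    """Fitness of a chromosome = ANN accuracy on the selected features,
    lightly penalized by subset size to favour smaller feature sets."""
    mask = chromosome.astype(bool)
    if not mask.any():
        return 0.0
    return train_ann(X[:, mask], y) - 0.01 * mask.sum() / len(mask)

def ga_feature_select(X, y, pop_size=200, max_evals=20_000, p_mut=0.02):
    """Steady-state GA over binary feature masks (tournament selection,
    uniform crossover, bit-flip mutation); stops at the evaluation budget."""
    d = X.shape[1]
    pop = rng.integers(0, 2, (pop_size, d))
    fits = np.array([fitness(c, X, y) for c in pop])
    evals = pop_size
    while evals < max_evals:
        # Binary tournament selection of two parents
        i, j = rng.integers(0, pop_size, 2), rng.integers(0, pop_size, 2)
        pa = pop[i[np.argmax(fits[i])]]
        pb = pop[j[np.argmax(fits[j])]]
        # Uniform crossover followed by bit-flip mutation
        child = np.where(rng.random(d) < 0.5, pa, pb)
        child ^= (rng.random(d) < p_mut).astype(child.dtype)
        f = fitness(child, X, y)
        evals += 1
        worst = np.argmin(fits)              # steady-state replacement
        if f > fits[worst]:
            pop[worst], fits[worst] = child, f
    return pop[np.argmax(fits)].astype(bool)

# Toy usage: 50 features, only the first 5 carry signal.
X = rng.normal(size=(300, 50))
y = (X[:, :5].sum(axis=1) > 0).astype(float)
mask = ga_feature_select(X, y, pop_size=50, max_evals=1_000)
print("selected features:", np.flatnonzero(mask))
```

Counting fitness evaluations rather than generations mirrors the paper's framing of evaluation budgets (20,000 to 40,000) as the stopping criterion, and the small subset-size penalty encodes the preference for a smaller but more significant feature set.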

  • Publication date: 2010-12