Abstract

E-mail is one of the cheapest and fastest means of communication. However, the growing volume of spam is a major problem for Internet users, so an efficient spam filtering method is imperative. Feature selection is one of the most important factors influencing classification accuracy. To improve the performance of spam prediction, this paper proposes a new fuzzy adaptive multi-population parallel genetic algorithm (FAMGA) for feature selection. A few studies have reported multi-swarm strategies for maintaining population diversity, but dynamic parameter setting has not been considered further. The proposed method is based on multiple subpopulations, each of which runs in an independent memory space. To control the subpopulations adaptively, we put forward two regulation strategies: population adjustment and subpopulation adjustment. In subpopulation adjustment, a controller adjusts the crossover rate of each subpopulation; in population adjustment, a controller adjusts the size of each subpopulation. Three publicly available benchmark corpora for spam filtering, PU1, Ling-Spam, and SpamAssassin, are used in our experiments. The experimental results show that the proposed method improves spam filtering performance and is significantly better than other feature selection methods. This indicates that the multi-population regulation strategy can find the optimal feature subset and prevent premature convergence of the population.
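The two regulation strategies described above can be illustrated with a minimal Python sketch. It is not the authors' FAMGA: the abstract does not give the fuzzy rules, so `adjust_crossover` and `adjust_sizes` are simple threshold and rank-based stand-ins, the fitness is a toy weighted sum rather than classifier accuracy on a spam corpus, and the subpopulations are evolved sequentially rather than in parallel. All function names, thresholds, and rates are assumptions made for illustration.

```python
import random

def fitness(mask, weights, penalty=0.1):
    # Toy stand-in for classifier accuracy: reward useful features, penalize subset size.
    return sum(w for w, bit in zip(weights, mask) if bit) - penalty * sum(mask)

def diversity(pop):
    # Mean per-feature bit variance p * (1 - p); ranges from 0 (converged) to 0.25.
    n, m = len(pop), len(pop[0])
    total = 0.0
    for j in range(m):
        p = sum(ind[j] for ind in pop) / n
        total += p * (1 - p)
    return total / m

def adjust_crossover(rate, div, low=0.05, high=0.15, step=0.05):
    # Subpopulation adjustment (assumed rule): more crossover when diversity is low.
    if div < low:
        rate += step
    elif div > high:
        rate -= step
    return min(0.95, max(0.5, rate))

def adjust_sizes(best_scores, total, min_size=4):
    # Population adjustment (assumed rule): split a fixed budget by fitness rank.
    k = len(best_scores)
    order = sorted(range(k), key=lambda i: best_scores[i])  # worst first
    rank_weight = {idx: r + 1 for r, idx in enumerate(order)}
    spare = total - k * min_size
    denom = sum(rank_weight.values())
    sizes = [min_size + spare * rank_weight[i] // denom for i in range(k)]
    sizes[order[-1]] += total - sum(sizes)  # rounding leftover goes to the best
    return sizes

def evolve(pop, weights, cx_rate, mut_rate, rng):
    # One generation: elitism, binary tournament, one-point crossover, bit-flip mutation.
    score = lambda m: fitness(m, weights)
    nxt = [max(pop, key=score)[:]]
    while len(nxt) < len(pop):
        a = max(rng.sample(pop, 2), key=score)
        b = max(rng.sample(pop, 2), key=score)
        child = a[:]
        if rng.random() < cx_rate:
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]
        nxt.append([bit ^ 1 if rng.random() < mut_rate else bit for bit in child])
    return nxt

def resize(pop, size, weights, rng):
    # Keep the fittest individuals when shrinking; pad with random masks when growing.
    pop = sorted(pop, key=lambda m: fitness(m, weights), reverse=True)[:size]
    while len(pop) < size:
        pop.append([rng.randint(0, 1) for _ in weights])
    return pop

def run(weights, n_sub=3, sub_size=10, generations=30, seed=0):
    rng = random.Random(seed)
    subs = [[[rng.randint(0, 1) for _ in weights] for _ in range(sub_size)]
            for _ in range(n_sub)]
    rates = [0.7] * n_sub
    for _ in range(generations):
        subs = [evolve(p, weights, r, 0.02, rng) for p, r in zip(subs, rates)]
        rates = [adjust_crossover(r, diversity(p)) for p, r in zip(subs, rates)]
        best = [max(fitness(m, weights) for m in p) for p in subs]
        sizes = adjust_sizes(best, n_sub * sub_size)
        subs = [resize(p, s, weights, rng) for p, s in zip(subs, sizes)]
    return max((m for p in subs for m in p), key=lambda m: fitness(m, weights))
```

Calling `run([0.9, -0.5, 0.8, -0.2, 0.4, -0.7])` returns a 0/1 mask over the six hypothetical features. In a real experiment the toy `fitness` would be replaced by the cross-validated accuracy of a spam classifier trained on the selected features.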

  • Publication date: 2011
