A Heuristic Feature Selection Approach for Text Categorization by Using Chaos Optimization and Genetic Algorithm

Authors: Chen, Hao*; Jiang, Wen; Li, Canbing; Li, Rui
Source: Mathematical Problems in Engineering, 2013, 2013: 524017.
DOI: 10.1155/2013/524017

Abstract

With the rapid growth of textual data in the era of Big Data, text classification has become one of the key techniques for handling and organizing text data. Feature selection is the most important step in automatic text categorization. To choose a subset of the available features by eliminating those unnecessary for the classification task, a novel text categorization algorithm called chaos genetic feature selection optimization is proposed. The proposed algorithm searches for optimal feature subsets, drawing on both empirical and theoretical work in machine learning, and presents a general framework for text categorization. Experimental results show that the proposed algorithm simplifies the feature selection process effectively and can obtain higher classification accuracy with a smaller feature set.
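
The abstract does not spell out the operators used, so the following is only a generic sketch of how chaos optimization and a genetic algorithm are commonly combined for feature selection. It is not the authors' implementation. It assumes a logistic map (r = 4) to seed the initial population of binary feature masks, tournament selection, one-point crossover, bit-flip mutation, and elitism. The function names, the parameters, and in particular `toy_fitness` are hypothetical. In a real setting the fitness would be the accuracy of a text classifier trained on the selected features.

```python
import random


def logistic_map(x0, n, r=4.0):
    """Return n successive values of the logistic map x <- r * x * (1 - x)."""
    xs, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def chaotic_population(pop_size, n_features, x0=0.3141):
    """Build binary feature masks by thresholding one chaotic sequence at 0.5."""
    seq = logistic_map(x0, pop_size * n_features)
    return [
        [1 if seq[i * n_features + j] > 0.5 else 0 for j in range(n_features)]
        for i in range(pop_size)
    ]


def toy_fitness(mask, informative, penalty=0.05):
    """Hypothetical stand-in for classifier accuracy (an assumption):
    reward informative features that are kept, penalize mask size."""
    hits = sum(1 for i in informative if mask[i])
    return hits / len(informative) - penalty * sum(mask) / len(mask)


def evolve(fitness, n_features, pop_size=20, generations=40, p_mut=0.02, seed=0):
    """Run a simple elitist genetic algorithm over binary feature masks."""
    rng = random.Random(seed)
    pop = chaotic_population(pop_size, n_features)
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection for each parent.
            a = max(rng.sample(pop, 2), key=fitness)
            b = max(rng.sample(pop, 2), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

A call such as `evolve(lambda m: toy_fitness(m, [0, 3, 7]), n_features=12)` returns the best mask found and the best fitness per generation. Because the top mask is always carried over, that fitness trace never decreases. The size penalty in `toy_fitness` is one simple way to express the abstract's goal of higher accuracy with a smaller feature set.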