Abstract

In the field of data mining, classification is a supervised learning task whose purpose is to induce models (classifiers) from a set of labeled training instances in order to predict the class of new, unlabeled instances. Data preparation is crucial to the data mining process; its aim is to improve the fitness of the training data so that learning algorithms can produce more effective classifiers. Two widely applied data preparation methods are feature selection and instance selection, both of which fall under the umbrella of data reduction. In this paper, we present new ant colony optimization (ACO) algorithms for data reduction, via both feature and instance selection, to improve the predictive quality of the constructed classification models. Empirical evaluations on 43 benchmark datasets with five well-known classification algorithms show that our ACO algorithms improve the predictive quality of the produced classifiers. We also compare the performance of our proposed ACO algorithms to CIW-NN, a state-of-the-art co-evolutionary nearest-neighbour classifier that performs instance selection, instance weighting, and feature weighting, using a Friedman test of statistical significance.

  • Publication date: 2016