摘要

In order to increase the classification accuracy, a new feature selection method, RFFIM-PCA, based on the random forest feature importance measure (RFFIM) and principal component analysis (PCA) for analyzing the near-infrared (NIR) spectra of tobacco, is presented in this paper. We applied the method to the classification of cigarettes' qualitative evaluation and also compared it with other methods. The result showed that RFFIM-PCA discriminates the high-dimensional data effectively and can be used to identify the cigarettes' quality. The feature selection filters the noises, while PCA eliminates the redundant features and reduces the dimensionalities as well. The experimental results showed that RFFIM-PCA successfully eliminated the noises and redundant features in high-dimensional data, leading to a promising improvement on the feature selection and classification accuracy.

全文