Abstract

High dimensionality is commonly encountered in data mining problems, and hence dimensionality reduction becomes an important task for improving the efficiency of learning algorithms. As a widely used dimensionality reduction technique, feature selection selects a feature subset guided by a certain criterion. In this paper, three unsupervised feature selection algorithms are proposed from the viewpoint of sparse graph embedding learning. First, exploiting the self-representation property of the given data, we view the data themselves as a dictionary, conduct sparse coding, and propose the sparsity preserving feature selection (SPFS) algorithm. Second, considering the locality preservation of neighborhoods in the data, we study a special case of the SPFS problem, namely the neighborhood preserving feature selection problem, and develop a corresponding algorithm. Third, we incorporate sparse coding and feature selection into one unified framework and propose a neighborhood embedding feature selection (NEFS) criterion. With the aid of nonnegative matrix factorization, the corresponding algorithm for NEFS is presented and its convergence is proved. Finally, the three proposed algorithms are validated on eight publicly available real-world datasets from a machine learning repository. Extensive experimental results demonstrate the superiority of the proposed algorithms over four state-of-the-art unsupervised feature selection methods.
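To make the "data as dictionary" idea concrete, the following is a toy sketch of self-representation sparse coding, where each sample is reconstructed as a sparse combination of the remaining samples. The regularization strength `alpha`, the use of scikit-learn's `Lasso` solver, and the random toy data are illustrative assumptions; this is not the paper's exact SPFS formulation.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 8))  # toy data: 30 samples, 8 features

def sparse_self_representation(X, alpha=0.1):
    """Return W where column i holds sparse coefficients that reconstruct
    sample x_i from the other samples (diagonal constrained to zero).

    Solves, for each i:  min_w ||x_i - X_{-i}^T w||^2 + alpha * ||w||_1
    (illustrative formulation, not the authors' exact objective).
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        # Dictionary columns are the other samples; target is sample x_i.
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(X[others].T, X[i])
        W[others, i] = lasso.coef_
    return W

W = sparse_self_representation(X)
# The L1 penalty drives most reconstruction coefficients exactly to zero,
# yielding the sparse graph whose structure a feature subset should preserve.
print("coefficient matrix:", W.shape, "zero fraction:", np.mean(W == 0))
```

The resulting sparse coefficient matrix `W` plays the role of a data-driven affinity graph; a feature selection criterion would then score features by how well they preserve this reconstruction structure.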