Abstract

Feature selection, which aims to select an optimal feature subset and thereby avoid the "curse of dimensionality," is an important research topic in many real-world applications. To select informative features from high-dimensional data, this paper proposes a novel unsupervised feature selection algorithm called Double Regularized Matrix Factorization Feature Selection (DRMFFS). DRMFFS builds on the matrix factorization framework for feature selection and extends it with two regularizations: graph regularization and inner product regularization. Our approach makes three major contributions. First, to preserve the underlying geometric structure of the feature space of the data, we introduce graph regularization to guide the learning of the feature selection matrix, making it more effective. Second, to account for the correlations among features, we impose an inner product regularization term on the matrix factorization objective function. As a result, the features selected by DRMFFS not only represent the original high-dimensional data well but also have low redundancy. Third, we design an efficient iterative update algorithm to solve the resulting optimization problem and prove its convergence. Experiments on six benchmark databases demonstrate that the proposed approach outperforms state-of-the-art approaches in terms of both classification and clustering performance.
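
The abstract does not state the objective function, so the following is only a minimal sketch of what a double-regularized matrix factorization objective might look like, not the paper's actual formulation. It assumes a reconstruction term of the form ||X - XWH||_F^2, a graph regularizer Tr(W^T L W) with L a Laplacian built over the features, and an inner product regularizer that sums |<w_i, w_j>| over distinct rows of W. The names `drmffs_style_objective`, `rank_features`, `alpha`, and `beta` are hypothetical and chosen for illustration.

```python
import numpy as np


def drmffs_style_objective(X, W, H, L, alpha=1.0, beta=1.0):
    """Evaluate an ASSUMED double-regularized objective (illustrative only).

    X: (n_samples, n_features) data matrix
    W: (n_features, k) nonnegative feature selection matrix
    H: (k, n_features) nonnegative coefficient matrix
    L: (n_features, n_features) graph Laplacian over the features
    alpha, beta: hypothetical trade-off weights for the two regularizers
    """
    # Reconstruction error when the data are rebuilt from the selected features.
    reconstruction = np.linalg.norm(X - X @ W @ H, "fro") ** 2
    # Graph regularization (assumed form): Tr(W^T L W).
    graph_term = np.trace(W.T @ L @ W)
    # Inner product regularization (assumed form): sum of |<w_i, w_j>|
    # over distinct rows i != j of W, which penalizes redundant features.
    gram = np.abs(W @ W.T)
    inner_term = gram.sum() - np.trace(gram)
    return reconstruction + alpha * graph_term + beta * inner_term


def rank_features(W, n_selected):
    """Pick the features whose rows of W have the largest Euclidean norm."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)[:n_selected]
```

Under these assumptions, a row of W with a large norm marks a feature that contributes strongly to the reconstruction, which is why the sketch ranks features by row norm. The iterative update rules and the convergence proof mentioned in the abstract are not reproduced here.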