Abstract

Feature generation techniques that rank the generated features by importance, such as principal component analysis (PCA), reduce the problem of feature subset selection to choosing how many features to retain. For databases with linearly inseparable classes, kernel PCA can be used as the feature generation method instead of linear PCA. However, determining how many features in the kernel space must be retained to preserve classifiability is difficult, since the data vectors are not available in explicit form in that space. In this paper, we propose a criterion and an algorithm for determining the number of required features in kernel PCA using only the elements of the kernel matrix. To demonstrate the effectiveness of the proposed criterion, the new algorithm is applied to the USPS handwritten digit, Yale Face and Caltech 101 databases. The robustness of the proposed algorithm to noise that corrupts the data samples is also investigated.
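As a rough illustration of working "using only the elements of the kernel matrix", the sketch below centers a kernel matrix, computes its eigenvalue spectrum, and picks a number of components with a cumulative-eigenvalue threshold. That threshold is a generic heuristic swapped in for illustration; it is not the criterion proposed in the paper, which the abstract does not specify. The RBF kernel, the function names, and the parameters `gamma` and `threshold` are all assumptions.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma):
    """Gaussian (RBF) kernel matrix; the kernel choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def center_kernel(K):
    """Center the data in feature space using only the entries of K."""
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    return K - one @ K - K @ one + one @ K @ one


def kernel_pca_spectrum(K):
    """Eigenvalues (descending) and eigenvectors of the centered kernel matrix."""
    eigvals, eigvecs = np.linalg.eigh(center_kernel(K))
    order = np.argsort(eigvals)[::-1]
    # Clip tiny negative eigenvalues caused by round-off.
    eigvals = np.clip(eigvals[order], 0.0, None)
    return eigvals, eigvecs[:, order]


def num_components_by_energy(eigvals, threshold=0.95):
    """Generic heuristic (NOT the paper's criterion): smallest k whose
    leading eigenvalues account for at least `threshold` of the total."""
    ratios = np.cumsum(eigvals) / np.sum(eigvals)
    k = int(np.searchsorted(ratios, threshold)) + 1
    return min(k, len(eigvals))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 4))          # synthetic data, for illustration only
    K = rbf_kernel_matrix(X, gamma=0.5)
    eigvals, _ = kernel_pca_spectrum(K)
    print(num_components_by_energy(eigvals, threshold=0.95))
```

Note that every step after `rbf_kernel_matrix` touches only `K`, never the explicit feature-space vectors, which is the constraint the abstract describes.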

  • Publication date: 2015-10
