Abstract

Latent Semantic Indexing (LSI) is a commonly used dimensionality reduction method in text categorization; however, as a linear reconstruction method, its goal is to obtain the optimal representative features rather than the optimal classification features. This paper proposes a novel method that incorporates category information into latent semantic indexing to obtain more discriminative features than standard LSI. Experimental results show that the proposed method achieves good performance on two benchmark data sets, especially when the dimensionality is greatly reduced.

Full text