Supervised two-step feature extraction for structured representation of text data

Hava Ondrej*; Skrbek Miroslav; Kordik Pavel

doi:10.1016/j.simpat.2012.11.003

Abstract

The training data matrix used to classify text documents into multiple categories has a large number of dimensions, while the number of manually classified training documents is relatively small. Suitable dimensionality reduction techniques are therefore required to develop a classifier. This article describes a two-step supervised feature extraction method that takes advantage of projections of terms into document and category spaces. We propose several enhancements that make the method more efficient and faster than the version presented in our earlier paper. We also introduce an adjustment score that makes it possible to correct defective targets and helps identify improper training examples that bias the extracted features.

Publication date: 2013-04
