Abstract

In the information retrieval community, the Centroid Classifier has been shown to be a simple yet effective method for text categorization. However, it often suffers from model misfit (or inductive bias) incurred by its modeling assumption. Various methods have been proposed to address this issue, such as Weight Adjustment, Voting, Refinement, and DragPushing. However, existing methods employ only one criterion, namely training-set error. Research in machine learning indicates that methods based solely on training-set error cannot guarantee the generalization capability of base classifiers on unseen examples. To overcome this problem, we propose a novel Model Adjustment algorithm that makes use of training-set errors as well as training-set margins. Furthermore, we prove that for a linearly separable problem, the proposed method converges to the optimal solution after a finite number of updates for any learning rate η (η > 0). The empirical assessment conducted on four benchmark collections indicates that the proposed method performs slightly better than the SVM classifier in prediction accuracy, and beats it in running time.
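The abstract does not give the update rule itself, so the following is only a minimal sketch of the general idea: a centroid classifier whose class centroids are iteratively adjusted using both misclassified training examples and examples with small margins. The learning rate `eta`, the margin threshold `tau`, and the drag/push update form are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np


def l2_normalize(X, eps=1e-12):
    """Row-normalize so that dot products equal cosine similarities."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)


def train_centroids(X, y, n_classes):
    """Plain centroid classifier: one averaged, normalized vector per class."""
    centroids = np.zeros((n_classes, X.shape[1]))
    for c in range(n_classes):
        centroids[c] = X[y == c].mean(axis=0)
    return l2_normalize(centroids)


def adjust_centroids(X, y, centroids, eta=0.1, tau=0.05, max_iter=20):
    """Hypothetical margin-aware adjustment: if an example is misclassified or
    its margin (true-class score minus best rival score) is below tau, drag the
    true centroid toward it and push the rival centroid away."""
    centroids = centroids.copy()
    for _ in range(max_iter):
        updated = False
        for i in range(X.shape[0]):
            s = centroids @ X[i]               # cosine similarity to each class
            true_c = y[i]
            rival = s.copy()
            rival[true_c] = -np.inf
            rival_c = int(np.argmax(rival))
            margin = s[true_c] - s[rival_c]    # negative if misclassified
            if margin < tau:
                centroids[true_c] += eta * X[i]   # drag toward the example
                centroids[rival_c] -= eta * X[i]  # push the rival away
                centroids = l2_normalize(centroids)
                updated = True
        if not updated:                        # no errors and no small margins
            break
    return centroids


def predict(X, centroids):
    return np.argmax(X @ centroids.T, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: two noisy clusters standing in for TF-IDF document vectors.
    X = l2_normalize(np.vstack([rng.normal(loc=m, scale=0.5, size=(50, 20))
                                for m in (0.0, 1.0)]))
    y = np.repeat([0, 1], 50)
    centroids = train_centroids(X, y, n_classes=2)
    centroids = adjust_centroids(X, y, centroids)
    print("training accuracy:", (predict(X, centroids) == y).mean())
```

The margin test `margin < tau` subsumes the pure error criterion (a misclassified example always has a negative margin), which is how a margin-based adjustment can go beyond training-set error alone.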

  • Publication date: 2011-8
  • Affiliations: Chinese Academy of Geological Sciences; Chinese Academy of Sciences