An ensemble self-training protein interaction article classifier

Chen Yifei*; Hou Ping; Manderick Bernard

doi:10.3233/BME-130935

Abstract

Knowledge of protein-protein interactions (PPIs) is essential for understanding the fundamental processes governing cell biology, and the mining and curation of PPI knowledge are critical for analyzing proteomics data. It is therefore desirable to classify articles automatically as PPI-related or not. Building such an interaction article classification system requires an annotated corpus, but in practice only a small number of articles can be labeled manually, while a large number of unlabeled articles are available. By combining ensemble learning with semi-supervised self-training, an ensemble self-training interaction article classifier called EST_IACer is designed to classify PPI-related articles from a small number of labeled articles and a large number of unlabeled articles. A feature weighting strategy based on biological background knowledge is extended to use category information from both labeled and unlabeled data. Moreover, a heuristic constraint is put forward for selecting optimal instances from the unlabeled data to improve performance further. Experimental results show that EST_IACer classifies PPI-related articles effectively and efficiently.
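The general idea of ensemble self-training can be illustrated with a minimal sketch. It is not the EST_IACer implementation: the abstract does not specify the base learners, the feature weighting, or the heuristic constraint, so everything below is an assumption made for illustration. Here the ensemble is a set of multinomial naive Bayes models that differ only in their smoothing parameter (`alphas`), and the selection rule (unanimous vote plus a mean-confidence `threshold`) is a hypothetical stand-in for the paper's constraint. All function names are invented for this example.

```python
import math
from collections import Counter


def train_nb(docs, labels, alpha):
    """Fit a multinomial naive Bayes model on tokenised documents (labels are 0 or 1)."""
    word_counts = {0: Counter(), 1: Counter()}
    for doc, y in zip(docs, labels):
        word_counts[y].update(doc)
    vocab_size = len(set(word_counts[0]) | set(word_counts[1]))
    return word_counts, Counter(labels), vocab_size, alpha


def predict_proba(model, doc):
    """Return P(label = 1 | doc) under a single ensemble member."""
    word_counts, class_counts, vocab_size, alpha = model
    n_docs = sum(class_counts.values())
    log_p = []
    for y in (0, 1):
        n_words = sum(word_counts[y].values())
        lp = math.log((class_counts[y] + alpha) / (n_docs + 2 * alpha))
        for w in doc:
            lp += math.log((word_counts[y][w] + alpha) / (n_words + alpha * vocab_size))
        log_p.append(lp)
    top = max(log_p)  # subtract the maximum before exponentiating, for stability
    e0, e1 = math.exp(log_p[0] - top), math.exp(log_p[1] - top)
    return e1 / (e0 + e1)


def train_ensemble(docs, labels, alphas):
    """Assumed ensemble: one naive Bayes member per smoothing value."""
    return [train_nb(docs, labels, a) for a in alphas]


def ensemble_proba(members, doc):
    """Average the members' probabilities for the positive class."""
    return sum(predict_proba(m, doc) for m in members) / len(members)


def ensemble_self_train(labeled, labels, unlabeled,
                        alphas=(0.5, 1.0, 2.0), threshold=0.75, max_rounds=10):
    """Grow the labeled set with confidently pseudo-labeled documents.

    Hypothetical selection rule: all members must agree on the label and
    the averaged confidence must reach `threshold`.
    """
    docs, ys, pool = list(labeled), list(labels), list(unlabeled)
    members = train_ensemble(docs, ys, alphas)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for doc in pool:
            probs = [predict_proba(m, doc) for m in members]
            mean = sum(probs) / len(probs)
            votes = [p >= 0.5 for p in probs]
            unanimous = all(votes) or not any(votes)
            if unanimous and max(mean, 1.0 - mean) >= threshold:
                docs.append(doc)
                ys.append(int(mean >= 0.5))
                added += 1
            else:
                remaining.append(doc)
        pool = remaining
        if added == 0:  # nothing confident enough was found, so stop
            break
        members = train_ensemble(docs, ys, alphas)  # retrain on the enlarged set
    return members, docs, ys


# Toy usage: label 1 = PPI-related, label 0 = not PPI-related.
labeled = [["protein", "binds", "kinase"], ["protein", "interacts", "complex"],
           ["patient", "clinical", "trial"], ["patient", "dose", "trial"]]
labels = [1, 1, 0, 0]
unlabeled = [["kinase", "binds", "receptor"], ["clinical", "dose", "placebo"],
             ["receptor", "placebo"]]
members, docs, ys = ensemble_self_train(labeled, labels, unlabeled)
print(len(docs), ys)
```

In this toy run the first two unlabeled documents are pseudo-labeled and added, while the third stays in the pool because the members have no evidence to separate the classes. Requiring agreement across members is one common way to limit the error propagation that plain self-training suffers from when a single classifier is confidently wrong.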

Publication date: 2014
Affiliation: Nanjing Audit University


