Sparse multiple instance learning as document classification

Yan, Shengye<sup>*</sup>; Zhu, Xiaodong; Liu, Guoqing; Wu, Jianxin

doi:10.1007/s11042-016-3567-z

Abstract

This work focuses on multiple instance learning (MIL) with sparse positive bags, which we call sparse MIL. A structural representation is presented to encode both instances and bags. This representation leads to a non-i.i.d. MIL algorithm, miStruct, which uses a structural similarity to compare bags. Furthermore, MIL with this representation is shown to be equivalent to a document classification problem. Document classification also suffers from the fact that only a few paragraphs or words are useful in revealing the category of a document. By using the TF-IDF representation, which has excellent empirical performance in document classification, the miDoc method is proposed. The proposed methods achieve significantly higher accuracy and AUC (area under the ROC curve) than the state of the art on a large number of sparse MIL problems, and the document classification analogy explains their efficacy on such problems.
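
To make the document classification analogy concrete, the following is a minimal sketch of TF-IDF weighting applied to bags treated as documents. It is not the authors' miDoc implementation. It assumes, hypothetically, that each instance has already been mapped to a discrete codeword (the "word"), which the abstract does not specify; the function name `tfidf_bags` and the toy data are illustrative only.

```python
import math
from collections import Counter


def tfidf_bags(bags):
    """Return one sparse TF-IDF dict per bag.

    Assumption (not from the source): each bag is a list of discrete
    codeword IDs, i.e. instances were quantized into "words" beforehand.
    """
    n_bags = len(bags)

    # Document frequency: number of bags that contain each codeword.
    df = Counter()
    for bag in bags:
        df.update(set(bag))

    vectors = []
    for bag in bags:
        counts = Counter(bag)
        total = len(bag)
        vec = {}
        for word, count in counts.items():
            tf = count / total                  # term frequency within the bag
            idf = math.log(n_bags / df[word])   # plain, unsmoothed IDF
            vec[word] = tf * idf
        vectors.append(vec)
    return vectors


# Toy data: codeword "c" appears in only one bag, "a" and "b" in all bags.
bags = [
    ["a", "a", "b", "c"],
    ["a", "b", "b"],
    ["a", "a", "a", "b"],
]
vectors = tfidf_bags(bags)
```

In this toy example the codewords shared by every bag receive zero weight, while the rare codeword "c" receives a positive weight in the first bag. This mirrors the intuition in the abstract: in a sparse positive bag, the few informative instances play the role of the few informative words in a document.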

Publication date: 2017-02
Affiliations: Nanjing University of Information Science and Technology; Nanjing University
