Abstract

Multiple instance (MI) learning aims to identify an underlying concept from collectively labeled data. A training sample consists of a set of unlabeled instances, known as a bag. The bag as a whole is labeled positive if at least one instance in it is positive, and negative otherwise. Given such training samples, the goal is to learn a description of the instance(s) common to the positive bags, i.e., the underlying concept responsible for the positive label. In this work, we introduce a learning scheme for MI concept learning based on the notion of partial entropy. Partial entropy accentuates intra-class information by focusing on the information contributed by the positive class in proportion to the total entropy; maximizing it equalizes the likelihoods of the intra-class outcomes of the positive class, which essentially reflects the intended concept. When coupled with a distance-based probabilistic model for MI learning, this is equivalent to seeking a concept estimate that equalizes the intra-class distances while the distance to negative bags is restrained. The result is a pattern that is similar to at least one instance from each positive bag and dissimilar from all instances in negative bags. The patterns generated by the optimization process correspond to prototypical concepts. Maximum partial entropy is conceptually simple, and experimental results on different MI datasets demonstrate its effectiveness in learning an explicit representation of the concept as well as its competitive performance on classification tasks.
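To make the idea concrete, the following is a minimal, hypothetical sketch of the kind of objective the abstract describes: positive-bag likelihoods under a simple distance-based (Gaussian-style) model are normalized and their entropy is maximized, which equalizes the likelihoods across positive bags, while closeness to negative instances is penalized. The toy data, function names, likelihood form, and penalty weight are illustrative assumptions; the paper's exact partial-entropy definition (stated as a proportion of the total entropy) and optimization procedure may differ.

```python
import numpy as np
from scipy.optimize import minimize

# Toy, hypothetical data: each bag is an array of 2-D instances.
# Every positive bag contains one instance near the "concept" point (5, 5).
positive_bags = [
    np.array([[0.1, 0.2], [5.0, 5.1]]),
    np.array([[4.9, 5.2], [9.0, 1.0]]),
    np.array([[5.1, 4.8], [2.0, 7.0]]),
]
negative_bags = [
    np.array([[0.0, 0.5], [1.0, 1.5]]),
    np.array([[8.0, 8.5], [2.5, 0.5]]),
]

def bag_likelihood(bag, t, scale=1.0):
    """Distance-based likelihood that a bag supports concept t,
    governed by the instance closest to t (Gaussian-style kernel)."""
    d2 = np.sum((bag - t) ** 2, axis=1)
    return float(np.max(np.exp(-d2 / scale)))

def neg_objective(t, lam=1.0):
    """Negated objective: entropy of the normalized positive-bag
    likelihoods minus a penalty for being close to any negative bag."""
    pos = np.array([bag_likelihood(b, t) for b in positive_bags])
    neg = np.array([bag_likelihood(b, t) for b in negative_bags])
    p = pos / (pos.sum() + 1e-12)                 # share of each positive bag
    entropy = -np.sum(p * np.log(p + 1e-12))      # maximal when shares are equal
    penalty = np.sum(neg)                         # restrain closeness to negatives
    return -(entropy - lam * penalty)

# Restart the local search from every positive instance and keep the best result.
starts = np.vstack(positive_bags)
best = min((minimize(neg_objective, x0) for x0 in starts), key=lambda r: r.fun)
print("estimated concept point:", best.x)
```

In this toy setting the recovered concept point should land near (5, 5), the region shared by all positive bags; the restarts from positive instances and the negative-bag penalty discourage the degenerate maxima the entropy term alone would admit far from the data.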

  • Publication date: 2017-6

Full text