摘要

In many real-world data mining tasks, the connotation of the target concept may change as time goes by. For example, the connotation of "learned knowledge" of a student today may be different from his/her "learned knowledge" tomorrow, since the "learned knowledge" of the student is expanding everyday. In order to learn a model capable of making accurate predictions, the evolution of the concept must be considered, and thus, a series of data sets collected at different time is needed. In many tasks, however, there is only a single data set instead of a series of data sets. In other words, only a single snapshot of the data along the time axis is available. In this paper, we formulate the Positive Class Expansion with single Snapshot (PCES) problem and discuss its difference with existing problem settings. To show that this new problem is addressable, we propose a framework which involves the incorporation of desirable biases based on user preferences. The resulting optimization problem is solved by the Stochastic Gradient Boosting with Double Target approach, which achieves encouraging performance on PCES problems in experiments.

全文