Abstract

This paper addresses a novel learning problem in which data from multiple domains are generated from distributions that may be arbitrarily different. Domain adaptation and transfer learning have previously been studied for two domains with different distributions, but under the assumption that the distributions are sufficiently related to each other. In many real-world applications, however, multiple domains may have arbitrarily different distributions. In this paper, a general framework called Multi-ATL is proposed for bridging knowledge across multiple domains with arbitrarily different distributions. The proposed framework consists of three key components: latent feature space extraction, co-training with multiple views, and active sample selection. The basic idea is to characterize each domain by an extracted latent feature space, and then apply a co-training algorithm to these latent spaces so that multiple views are used simultaneously for training. During co-training, active sample selection is used to determine how much knowledge should be transferred from a source domain to the target domain. Experimental results on one synthetic and two real-world datasets show that the proposed framework significantly improves classification accuracy (+13.5% on the synthetic dataset and +17.79% and +23.87% on the real-world datasets) with only a few (2-5) active selections.
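The following minimal sketch illustrates the workflow summarized above: one latent feature space per domain, co-training over those latent views, and a small active-selection budget for querying target labels. PCA stands in for the latent feature extraction, logistic regression for the per-view learners, and the disagreement-based query criterion, the `oracle` callable, and the `budget` parameter are all illustrative assumptions, not the authors' actual method.

```python
# A minimal, hypothetical sketch of the Multi-ATL workflow (not the paper's implementation).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression


def extract_latent_views(domains, n_components=2):
    """Characterize each domain by its own latent feature space (one 'view' per domain)."""
    return [PCA(n_components=n_components).fit(X) for X in domains]


def co_train_with_active_selection(views, X_src, y_src, X_tgt, oracle, budget=5):
    """Co-train one classifier per latent view; actively query a few target labels.

    `oracle` is an assumed callable that returns the true label of a queried target sample.
    """
    labeled_X, labeled_y = list(X_src), list(y_src)
    for _ in range(budget):
        # Train one classifier per latent view on the currently labeled pool.
        clfs = [LogisticRegression().fit(v.transform(labeled_X), labeled_y) for v in views]
        # Per-view class-probability predictions on the unlabeled target data.
        probs = [c.predict_proba(v.transform(X_tgt)) for c, v in zip(clfs, views)]
        # Active selection (illustrative criterion): query the target sample on which
        # the views disagree most, measured by the spread of per-view confidences.
        disagreement = np.ptp([p.max(axis=1) for p in probs], axis=0)
        idx = int(np.argmax(disagreement))
        labeled_X.append(X_tgt[idx])
        labeled_y.append(oracle(idx))  # ask for the true target label
    return clfs
```

In this sketch the active-selection budget plays the role described in the abstract: only a handful of queried target samples controls how much source knowledge is carried over to the target domain.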