摘要

In many machine learning algorithms, a major assumption is that the training and the test samples are in the same feature space and have the same distribution. However, for many real applications this assumption does not hold. In this paper, we survey the problem where the training samples and the test samples are from different distributions. This problem can be referred as domain adaptation. The training samples, always with labels, are obtained from what is called source domains, while the test samples, which usually have no labels or only a few labels, are obtained from what is called target domains. The source domains and the target domains are different but related to some extent; the learners can learn some information from the source domains for the learning of the target domains. We focus on the multisource domain adaptation problem where there is more than one source domain available together with only one target domain. A key issue is how to select good sources and samples for the adaptation. In this survey, we review some theoretical results and well developed algorithms for the multi-source domain adaptation problem. We also discuss some open problems which can be explored in future work.