Abstract

Most current spam filtering methods assume that the training data, drawn from a source domain, and the test data, drawn from a target domain, follow the same distribution. In many cases, however, this assumption does not hold and performance degrades sharply. In this paper we propose an adaptive transfer learning (ATL) algorithm for spam filtering. The algorithm consists of three key steps. First, we compute the similarity between the features particular to the target domain and the common features, using singular value decomposition. Second, we label the test data with a traditional classifier and add the most similar sample to the training set according to a prescribed threshold. Finally, we apply a traditional classifier to obtain the labels of the test data. We demonstrate the effectiveness of this approach with experimental results on real-world email filtering data.
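The second and third steps resemble a self-training loop, which might be sketched as follows. This is only an illustration under stated assumptions, not the ATL algorithm itself: the abstract does not name the classifier, the similarity measure, or the threshold, so the nearest-centroid classifier, the cosine similarity, the value 0.9, and all function names below are hypothetical. The SVD-based feature similarity of the first step is omitted entirely.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def centroids(X, y):
    """Mean feature vector per label; an assumed stand-in for the 'traditional classifier'."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, x):
    """Return (label, similarity) for the centroid closest to x."""
    return max(((label, cosine(x, c)) for label, c in model.items()),
               key=lambda pair: pair[1])


def adapt_and_classify(X_train, y_train, X_test, threshold=0.9, max_rounds=10):
    """Self-training sketch: pseudo-label test samples, then classify the test set."""
    X, y = list(X_train), list(y_train)
    remaining = list(range(len(X_test)))
    for _ in range(max_rounds):
        if not remaining:
            break
        model = centroids(X, y)
        scored = [(i, *predict(model, X_test[i])) for i in remaining]
        best_i, best_label, best_sim = max(scored, key=lambda t: t[2])
        if best_sim < threshold:
            break  # no remaining test sample is similar enough (assumed stopping rule)
        # Add the single most similar test sample, with its predicted label,
        # to the training set, then retrain on the next round.
        X.append(X_test[best_i])
        y.append(best_label)
        remaining.remove(best_i)
    model = centroids(X, y)  # final classifier on the augmented training set
    return [predict(model, x)[0] for x in X_test]
```

With toy two-feature vectors, `adapt_and_classify([[1.0, 0.0], [0.0, 1.0]], ["spam", "ham"], [[0.9, 0.1], [0.1, 0.9], [0.6, 0.4]])` absorbs the first two test samples into the training set and leaves the third, which falls below the threshold, to be labelled by the final classifier.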

  • Publication date: 2010
