Abstract

As spam grows more diverse, content-based spam filtering becomes increasingly difficult. To address this problem, this paper first analyzes the statistical features of email headers and then proposes a method that uses these features to improve a Bayesian anti-spam filter. The selected email-header features are represented as fingerprint vectors, which are then transformed into input tokens for the Bayesian filter. Experimental results show that this method makes effective use of the information embedded in email headers and improves the performance of Bayesian anti-spam filtering.
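
The idea of turning header statistics into filter tokens can be illustrated with a minimal sketch. It is not the paper's implementation: the three header features, the `hdr:` token format, and the names `header_tokens`, `message_tokens`, and `TokenBayes` are hypothetical choices made for illustration, and the classifier is a plain multinomial naive Bayes with Laplace smoothing rather than the specific filter the authors evaluated.

```python
import math
from collections import Counter
from email import message_from_string


def header_tokens(msg):
    """Encode a few header statistics as discrete tokens (assumed features)."""
    n_received = len(msg.get_all("Received", []))
    return [
        "hdr:received=" + str(min(n_received, 3)),      # capped relay-hop count
        "hdr:has_message_id=" + str(msg["Message-ID"] is not None),
        "hdr:has_date=" + str(msg["Date"] is not None),
    ]


def message_tokens(raw_message):
    """Combine ordinary body-word tokens with the header-derived tokens."""
    msg = message_from_string(raw_message)
    body = msg.get_payload()
    words = body.lower().split() if isinstance(body, str) else []
    return words + header_tokens(msg)


class TokenBayes:
    """Multinomial naive Bayes over tokens, with add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = Counter()

    def train(self, tokens, label):
        self.counts[label].update(tokens)
        self.docs[label] += 1

    def spam_probability(self, tokens):
        vocab_size = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        totals = {k: sum(c.values()) for k, c in self.counts.items()}
        # Start from the smoothed prior log-odds, then add per-token evidence.
        log_odds = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in tokens:
            p_spam = (self.counts["spam"][tok] + 1) / (totals["spam"] + vocab_size + 1)
            p_ham = (self.counts["ham"][tok] + 1) / (totals["ham"] + vocab_size + 1)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


# Toy training data, invented purely for this illustration.
SPAM_RAW = "From: a@example.com\nSubject: win\n\nclaim your prize now"
HAM_RAW = (
    "Received: from mx1.example.org\n"
    "Received: from mx2.example.org\n"
    "Message-ID: <1@example.org>\n"
    "Date: Mon, 1 Jan 2024 00:00:00 +0000\n"
    "Subject: notes\n\nmeeting notes attached"
)

clf = TokenBayes()
for _ in range(2):
    clf.train(message_tokens(SPAM_RAW), "spam")
    clf.train(message_tokens(HAM_RAW), "ham")
```

Because the header statistics enter as ordinary tokens, the filter's training and scoring code needs no changes; only the tokenizer is extended.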

Full Text