SpamED: A Spam E-Mail Detection Approach Based on Phrase Similarity

Authors: Pera Maria Soledad*; Ng Yiu Kai
Source: Journal of the American Society for Information Science and Technology, 2009, 60(2): 393-409.
DOI: 10.1002/asi.20962

Abstract

E-mail messages are unquestionably one of the most popular communication media these days. Not only are they fast and reliable, but they are also generally free. Unfortunately, a significant number of the e-mail messages that users receive on a daily basis are spam. This is annoying, since spam messages waste the user's time in reviewing and deleting them. In addition, spam messages consume resources such as storage, bandwidth, and computer-processing time. Many attempts have been made in the past to eradicate spam; however, none has proven highly effective. In this article, we propose a spam e-mail detection approach, called SpamED, which uses the similarity of phrases in messages to detect spam. Experiments verify not only that SpamED, using trigrams in e-mail messages, is capable of minimizing false positives and false negatives in spam detection, but also that it outperforms a number of existing e-mail filtering approaches, with a 96% accuracy rate.
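
The abstract does not say how SpamED scores phrase similarity, so the following is only a minimal illustrative sketch of the general idea of comparing messages by their word trigrams. It is not the authors' method: the Jaccard overlap measure, the function names (`word_trigrams`, `trigram_similarity`, `looks_like_spam`), and the 0.3 threshold are all assumptions introduced here for illustration.

```python
import re


def word_trigrams(text):
    """Return the set of word trigrams (3-word phrases) in a message, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two messages' word trigrams, in [0, 1].

    Assumption: Jaccard is a stand-in measure, not the one used by SpamED.
    """
    ta, tb = word_trigrams(a), word_trigrams(b)
    if not ta or not tb:
        return 0.0  # messages shorter than three words share no trigrams
    return len(ta & tb) / len(ta | tb)


def looks_like_spam(message, known_spam, threshold=0.3):
    """Flag a message if it is similar enough to any known spam message.

    The threshold value is hypothetical and would need tuning on real data.
    """
    return any(trigram_similarity(message, s) >= threshold for s in known_spam)
```

For example, `looks_like_spam("buy cheap pills now", ["buy cheap pills today"])` flags the message because the two share the trigram "buy cheap pills", while an unrelated message shares no trigrams and is not flagged.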

  • Publication date: 2009-2