A paragraph-inserted word salad filtering algorithm

Authors: Jeong Ok Ran; Kim Won*
Source: International Journal of Web and Grid Services, 2012, 8(1): 56-71.
DOI: 10.1504/IJWGS.2012.046730

Abstract

Social spam is a type of spam that targets the members of social websites by sending or posting unwanted ads or baiting them to visit particular websites. Word salad, in turn, is a type of social spam that aims at baiting people into visiting particular websites, such as blogs, personal profiles, or third-party applications built on social networking websites. A word salad is created by inserting either words or paragraphs into a normal document, where the inserted words or paragraphs have no relevance to the document. The purpose of a word salad is to fool search engines into assigning high ranks to the document. In this paper, we discuss an algorithm that filters (detects) paragraph-inserted word salads. The algorithm is based on the Singular Value Decomposition (SVD) method and, in experiments, shows up to 81.3% accuracy.
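
The abstract names SVD as the basis of the filter but gives no details of the procedure. The sketch below is therefore not the authors' algorithm. It is one plausible illustration, assuming a latent-semantic-analysis style approach: build a term-by-paragraph count matrix, take a truncated SVD, and flag paragraphs whose latent vectors have low average cosine similarity to the rest of the document. The function names, the rank `k`, and the `threshold` value are all hypothetical choices made for this example.

```python
import re
import numpy as np


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_paragraph_matrix(paragraphs):
    """Return a (terms x paragraphs) matrix of raw term counts."""
    vocab = sorted({w for p in paragraphs for w in tokenize(p)})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(paragraphs)))
    for j, p in enumerate(paragraphs):
        for w in tokenize(p):
            A[index[w], j] += 1.0
    return A


def paragraph_scores(paragraphs, k=2):
    """Average cosine similarity of each paragraph to all the others,
    measured in a rank-k latent space obtained from the SVD.
    The rank k is an assumed parameter, not taken from the paper."""
    n = len(paragraphs)
    if n < 2:
        return [1.0] * n
    A = term_paragraph_matrix(paragraphs)
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # One row per paragraph: its coordinates in the latent space.
    coords = (np.diag(s[:k]) @ Vt[:k]).T
    norms = np.linalg.norm(coords, axis=1, keepdims=True)
    unit = coords / np.where(norms == 0, 1.0, norms)
    sims = unit @ unit.T
    # Exclude each paragraph's similarity to itself from the average.
    return [float((sims[i].sum() - sims[i, i]) / (n - 1)) for i in range(n)]


def flag_inserted(paragraphs, k=2, threshold=0.5):
    """Indices of paragraphs scoring below the (assumed) threshold."""
    scores = paragraph_scores(paragraphs, k)
    return [i for i, score in enumerate(scores) if score < threshold]
```

Calling `flag_inserted` on a list of paragraph strings returns the positions of the paragraphs that look unrelated to the rest. A realistic filter would likely need term weighting, stop-word handling, and a threshold tuned on labeled data, none of which the abstract specifies.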

  • Publication date: 2012