A personalized spam page detection method using hidden Markov model

Authors: Yin, Guisheng; Zhang, Yanan*; Yuan, Weiwei; Cui, Xiaohui; Cui, Xiang; Cheng, Weijie; Dong, Jing; Wei, Jijie; Pan, Yue
Source: Journal of Convergence Information Technology, 2012, 7(12): 111-119.
DOI: 10.4156/jcit.vol7.issue12.14

Abstract

Spam pages mislead search engines by pushing themselves to the top of the search results, so users must spend more effort to find useful information. Most existing work only detects spam pages and tries to improve user satisfaction by eliminating them from the search engine's index. However, users may have different concerns for the same query term, and they expect personalized search results. In this paper, we propose a novel personalized spam page detection method that filters out spam pages for each user. It does so by predicting each user's satisfaction with web pages from their behaviors using a Hidden Markov Model (HMM), where the behaviors are obtained from the user's sessions. We compare our method with several typical methods. The experimental results show that our method distinguishes spam pages more effectively and recommends the web pages each user is most interested in.

Full text