A Collaborative Decentralized Approach to Web Search

Authors: Papagelis Athanasios*; Zaroliagis Christos
Source: IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 2012, 42(5): 1271-1290.
DOI:10.1109/TSMCA.2012.2187887

Abstract

Most explanations of user behavior while interacting with the Web are based on a top-down approach, where the entire Web, viewed as a vast collection of pages and interconnection links, is used to predict how users interact with it. A prominent example of this approach is the random-surfer model, the core ingredient behind Google's PageRank. This model exploits the linking structure of the Web to estimate the percentage of web surfers viewing any given page. Contrary to the top-down approach, a bottom-up approach starts from the user and incrementally builds the dynamics of the Web as the result of the users' interaction with it. The second approach has not been widely investigated, although it has numerous advantages over the top-down approach regarding (at least) personalization and decentralization of the required infrastructure for web tools. In this paper, we propose a bottom-up approach to study the web dynamics based on web-related data browsed, collected, tagged, and semi-organized by end users. Our approach has been materialized into a hybrid bottom-up search engine that produces search results based solely on user-provided web-related data and their sharing among users. We conduct an extensive experimental study to demonstrate the qualitative and quantitative characteristics of user-generated web-related data, their strengths and weaknesses, as well as to compare the search results of our bottom-up search engine with those of a traditional one. Our study shows that a bottom-up search engine starts from a core consisting of the most interesting part of the Web (according to user opinions) and incrementally (and measurably) improves its ranking, coverage, and accuracy. Finally, we discuss how our approach can be integrated with PageRank, resulting in a new page ranking algorithm that can uniquely combine link analysis with users' preferences.
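
For readers unfamiliar with the random-surfer model mentioned in the abstract, the following is a minimal sketch of it as a power iteration over a small link graph. It is not the algorithm proposed in the paper: the function name `pagerank`, the `links` dictionary, the `damping` value of 0.85, and the optional `preference` weights (a hypothetical stand-in for user preferences, used here only to bias the random jump) are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, preference=None, iterations=100):
    """Estimate the share of random surfers on each page by power iteration.

    links: dict mapping a page to the list of pages it links to.
    preference: optional dict of non-negative page weights (hypothetical
        stand-in for user preferences); a uniform jump is used when omitted.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    if preference is None:
        jump = {p: 1.0 / n for p in pages}
    else:
        total = sum(preference.get(p, 0.0) for p in pages)
        jump = {p: preference.get(p, 0.0) / total for p in pages}

    rank = dict(jump)  # start from the jump distribution
    for _ in range(iterations):
        # Rank sitting on pages with no out-links is redistributed via the jump vector.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping + damping * dangling) * jump[p] for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                # A surfer on p follows each out-link with equal probability.
                new_rank[q] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank


# Toy graph (assumed data, not taken from the paper).
toy_links = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
uniform_rank = pagerank(toy_links)
biased_rank = pagerank(toy_links, preference={"c": 1.0})
```

In this sketch, passing a `preference` dictionary shifts the random jump toward the preferred pages, which is one common way to combine link analysis with per-user weights; the paper's own integration with PageRank may differ.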

  • Publication date: 2012-9