摘要

With the remarkable proliferation of intelligent mobile devices and fast growing broadband wireless technology, social networking is undergoing explosive growth in recent years as more and more users access social networks via mobile platforms. It is often expensive or even impossible to deploy a large online social network (OSN) on a single server. A cost-effective approach is horizontal scaling, where the OSN is partitioned and deployed on a set of low-cost servers. In this research, we study the problem of minimum-cost balanced partitioning of OSNs. Our goal is to achieve the best partitioning by minimizing the total inter-server traffic cost and at the same time balancing the load among servers. Given its NP-hardness, we propose new techniques and explore efficient heuristics to address the problem, especially for extremely large OSNs with an enormous volume of social nodes, social connections, and social data. Our key contributions include a localized approach with O(delta)(2) thorn complexity to explicitly calculate the projected gain in inter-server traffic cost (named Server Change Benefit (SCB)). Built upon this technique, we devise two algorithms that offer online and offline solutions to achieving minimum-cost balanced partitioning of OSNs. The online algorithm is fast and highly efficient to process newly arrival individual nodes. The offline algorithm uses the current online result as a starting point. It further reduces inter-server traffic cost by applying relocation and swapping. It employs a merging process to group the nodes according to the social structure and swap the groups with similar size to further reduce the total inter-server traffic cost. We implement both algorithms and evaluate them based on a variety of real-world OSN datasets from Facebook, Arxiv, Gnutella, Amazon, and Twitter. The simulations demonstrate that the proposed algorithms can significantly reduce the execution time by an average of three folds and at the same time yield supreme performance (i.e., inter-server traffic cost) in comparison with existing solutions.

  • 出版日期2018-7-1