摘要

In distributed systems, data may be correlated due to accesses from clients and the correlation has some impact on date placement, and existing research works focus on independent data objects. In this paper, we address both the scalability and the stability of the data placement solutions in internet environment. We first show that replica allocation decisions can be made locally for each replica site in a tree network, with data access knowledge of its neighbors. We then develop a new replication cost model for correlated data objects in Internet environment. Based on the cost model and the algorithms in previous research, we develop a distributed optimal replica allocation algorithm (DOPR) for correlated data in internet environment. A distributed heuristic algorithm (DHPR) is then developed to efficiently make replica placement decisions. The algorithm obtains sub-optimal solutions for the correlated data model and yields significant performance gains. Experimental studies show that the distributed heuristic allocation algorithm significantly outperforms the general frequency-based replication schemes (in which the replication decision of each data object is made based on the number of accesses on that data object).

  • 出版日期2014-4