Abstract

Virtual machine (VM) images are frequently backed up in datacenters to ensure service reliability. However, the duplicated data in image backups takes up a large amount of storage space, so deduplication is often used in backup operations to save space by removing duplicated data. Because backup operations with deduplication are resource intensive and time consuming, reducing their duration has become a key issue in datacenter management. Contemporary deduplication backup strategies fall into three categories: deduplication after backup, deduplication before backup, and deduplication during backup. Since these strategies have different resource requirements and suit different scenarios, it is reasonable to combine them adaptively. This paper proposes an adaptive strategy for the deduplication backup of virtual machine images. We first profile the resource requirements of deduplication backup operations under the different strategies, and then use an object-oriented genetic algorithm to generate a plan that minimizes the time of backup operations. Experimental results demonstrate that we can accurately estimate the deduplication backup time, and that the algorithm reduces deduplication backup time by about twenty percent.

Full Text