Adapting grid applications to safety using fault-tolerant methods: Design, implementation and evaluations

Authors: Shi Xuanhua; Pazat Jean Louis; Rodriguez Eric; Jin Hai; Jiang Hongbo
Source: Future Generation Computer Systems, 2010, 26(2): 236-244.
DOI: 10.1016/j.future.2009.07.015

Abstract

In recent years, Grid applications have been prone to problems such as failures or malicious attacks during execution, owing to their distributed and large-scale nature. The application itself, however, has limited power to address these problems. This paper presents the design, implementation, and evaluation of an adaptive framework, Dynasa, which strives to handle security problems using adaptive fault tolerance (i.e., checkpointing and replication) during the execution of applications, according to the status of the Grid environment. We evaluate the framework experimentally on the Grid5000 testbed, and the results demonstrate that Dynasa enables the application itself to handle security problems efficiently. Starting the adaptive component takes less than 1 s, and an adaptive action takes less than 0.1 s, with a checkpoint interval of 20 s. Compared with a non-adaptive method, Dynasa achieves better performance in terms of execution time, network bandwidth consumed, and CPU load, resulting in up to 50% lower overhead.