Abstract

In this paper we analyze a retrial queue that can be used to model fault-tolerant systems with checkpointing and rollback recovery. We assume that the service time of each job is decomposed into N modules, and that a checkpoint is established at the end of each module. Checkpointing and rollback recovery consists of periodically saving the state of the system on a secure device so that, upon recovery from a system failure, the system can resume the computation from the most recent checkpoint rather than from the beginning. Upon the successful service completion of a job, the server activates a timer and remains awake. If the timer expires without a request arriving, the server departs for a vacation. Upon returning from the vacation, the server activates the timer again. Furthermore, both idle and vacation periods can be interrupted by the server in order to perform secondary jobs. Applications of this model can be found in power saving for mobile devices in a half-duplex communication system operating in a wireless environment, and in long-running software applications. We derive the stability condition and carry out a steady-state analysis. We also apply a mean value analysis to obtain useful performance measures, and prove that the model satisfies the stochastic decomposition property. Useful energy metrics are determined, and constrained optimization problems are formulated and used to obtain extensive numerical results.

  • Publication date: 2015-1