Abstract

This paper presents a novel method, the continuous-time Markov decision process (CTMDP), for addressing the uncertainties in the pursuit-evasion problem. The primary difference between the CTMDP and the Markov decision process (MDP) is that the former accounts for the influence of the transition time between states. A policy iteration method based on potential performance for solving the CTMDP is presented, together with its convergence. The results show that the MDP-based method is a special case of the CTMDP-based method in which the transition rate matrix is the identity. To compare the methods, a well-known pursuit-evasion problem involving two identical cars is solved as a benchmark. The CTMDP-based method provides a discretized solution that is close to the analytical solution obtained by the differential game method. It also shows strong robustness against changes in the transition probability compared with the traditional MDP-based method. To the best of our knowledge, this is the first attempt to validate the influence of the transition time between states in such a pursuit-evasion scenario, or in a similar application, solved by an MDP-related model. The CTMDP-based method offers a new approach to solving the pursuit-evasion problem and can be extended to similar optimization applications.