A Continuous-Time Markov Decision Process-Based Method With Application in a Pursuit-Evasion Example

Jia, Shengde; Wang, Xiangke*; Shen, Lincheng

doi:10.1109/TSMC.2015.2478875

Abstract

This paper presents a novel method, the continuous-time Markov decision process (CTMDP), to address the uncertainties in the pursuit-evasion problem. The primary difference between the CTMDP and the Markov decision process (MDP) is that the former takes into account the influence of the transition time between states. A policy iteration method based on potential performance for solving the CTMDP is presented, along with its convergence. The results obtained by the MDP-based method demonstrate that it is a special case of the CTMDP-based method involving the identity transition rate matrix. To compare the methods, a well-known pursuit-evasion problem involving two identical cars is solved as a benchmark. The CTMDP-based method provides a discretized solution that is close to the analytical solution obtained by the differential game method. It also shows strong robustness against changes in the transition probability compared with the traditional MDP-based method. To the best of our knowledge, this is the first attempt to validate the influence of the transition time between states in such a pursuit-evasion scenario, or in a similar application, solved by an MDP-related model. The CTMDP-based method offers a new approach to solving the pursuit-evasion problem and can be extended to similar optimization applications.

Publication date: 2016-9
Affiliation: National University of Defense Technology of the Chinese People's Liberation Army

