Abstract

Continuous-time Markov decision processes (CTMDPs) with finite state and action spaces have been studied for a long time. It is known that, under fairly general conditions, the reward gained over a finite horizon is maximized by a so-called piecewise constant policy, which changes only finitely often within a finite interval. Although this result has been available for more than 30 years, numerical approaches for computing the optimal policy and reward have been restricted to discretization methods, which are known to converge to the true solution as the discretization step goes to zero. In this paper, we present a new method that is based on uniformization of the CTMDP and computes a policy whose reward lies within a predefined precision ε of the optimum (an ε-optimal policy) in a numerically stable way using adaptive time steps.
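The abstract names uniformization as the core transformation. As background, the following is a minimal Python sketch of the classical uniformization construction for a CTMDP, not the paper's adaptive-step algorithm or its ε-guarantees: given per-action generator matrices Q[a], a common rate Λ at least as large as every exit rate yields per-action stochastic matrices P[a] = I + Q[a]/Λ of an embedded discrete-time MDP. The array shapes and names here are illustrative.

```python
import numpy as np

def uniformize_ctmdp(Q):
    """Uniformize a CTMDP given per-action generator matrices.

    Q has shape (num_actions, n, n); each Q[a] is a conservative
    generator (rows sum to 0, off-diagonal entries >= 0).  Returns
    the per-action transition matrices P[a] = I + Q[a] / Lam of the
    embedded discrete-time MDP and the uniformization rate Lam,
    chosen as the largest exit rate over all states and actions.
    """
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[-1]
    Lam = np.max(-Q[:, np.arange(n), np.arange(n)])  # largest exit rate
    P = np.eye(n) + Q / Lam                          # broadcasts over actions
    return P, Lam

# Illustrative 2-state, 2-action example (rates are made up):
Q = np.array([
    [[-3.0, 3.0], [1.0, -1.0]],   # generator under action 0
    [[-0.5, 0.5], [2.0, -2.0]],   # generator under action 1
])
P, Lam = uniformize_ctmdp(Q)
assert np.allclose(P.sum(axis=-1), 1.0)  # each P[a] is stochastic
```

With this transformation, transient quantities of the continuous-time process over a finite horizon can be expressed through the discrete-time chain with a Poisson-distributed number of jumps; the paper builds on this representation with adaptive time steps to control the error.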

  • Publication date: 2011-03