Abstract

Ergodic control for discrete-time controlled Markov chains with a locally compact state space and a compact action space is considered under suitable stability, irreducibility, and Feller continuity conditions. A flexible family of controls, called action time sharing (ATS) policies, associated with a given continuous stationary Markov control, is introduced. It is shown that the long-term average cost for such a control policy, for a broad range of one-stage cost functions, is the same as that for the associated stationary Markov policy. In addition, ATS policies are well suited for a range of estimation, information collection, and adaptive control goals. To illustrate the possibilities, we present two examples. The first demonstrates a construction of an ATS policy that leads to consistent estimators for unknown model parameters while producing the desired long-term average cost value. The second example considers a setting where the target stationary Markov control q is not known but there are sampling schemes available that allow for consistent estimation of q. We construct an ATS policy that uses dynamic estimators of q for control decisions and show that the associated cost coincides with that for the unknown Markov control q.
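
The following Python sketch is a rough, purely illustrative simulation of the equivalence claim: it compares the long-term average cost (1/n) Σ_{k<n} c(X_k, U_k) under a stationary Markov control q with that under an ATS-style policy that deviates from q only on a time set of zero density. The two-state chain, the costs, the control q, and the choice of perfect squares as the "time sharing" instants are assumptions made for this toy example; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-state, two-action chain (hypothetical numbers, not from the paper).
# P[a][s] = distribution of the next state given action a in state s.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.5, 0.5]]])  # action 1
c = np.array([[1.0, 4.0],                 # c[s, a]: one-stage cost
              [2.0, 0.5]])

# A stationary Markov control q: q[s] = probability of action 1 in state s.
q = np.array([0.3, 0.7])

def run(policy, n=200_000):
    """Simulate n steps and return the empirical long-term average cost
    (1/n) * sum_{k<n} c(X_k, U_k)."""
    s, total = 0, 0.0
    for k in range(n):
        a = policy(k, s)
        total += c[s, a]
        s = rng.choice(2, p=P[a][s])
    return total / n

def markov(k, s):
    # Sample the action from q at the current state.
    return int(rng.random() < q[s])

# Deviation instants of density zero: here, the perfect squares.
explore = {i * i for i in range(1000)}

def ats(k, s):
    # ATS-style policy: on a zero-density time set, try the other action
    # (e.g., to collect information); otherwise follow q.
    if k in explore:
        return 1 - markov(k, s)
    return markov(k, s)

print("stationary Markov:", run(markov))
print("ATS              :", run(ats))
```

Because the deviation instants have zero asymptotic density, the two printed averages agree up to Monte Carlo error, consistent with the abstract's claim that an ATS policy reproduces the long-term average cost of its associated stationary Markov control.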

  • Publication date: 2012
