A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion

Cavazos Cadena Rolando*; Montes de Oca Raul; Sladky Karel

doi:10.1007/s10957-013-0474-6

Source: Journal of Optimization Theory and Applications, 2014, 163(2): 674-684.

Abstract

This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy.

Publication date: 2014-11
