Abstract

We consider the linear programming approach to approximate dynamic programming with an average cost objective and a finite state space. First, using a Lagrangian form of the linear program (LP), the average cost error is shown to be a multiple of the best-fit differential cost error. This result is analogous to previous error bounds for a discounted cost objective. Second, bounds are derived for the average cost error and for the performance of the policy generated from the LP; these bounds involve the mixing time of the Markov decision process (MDP) under this policy or under the optimal policy. These results improve on a previous performance bound involving mixing times.
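For context, the following Python sketch sets up the standard average-cost approximate LP on a small randomly generated MDP, with the differential cost restricted to the span of a basis matrix Phi, and extracts a greedy policy from the solution. This is a minimal illustration under assumed notation, not the paper's Lagrangian formulation; the MDP sizes, basis choice, and use of scipy.optimize.linprog are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's exact formulation):
# approximate LP for average-cost ADP on a small random MDP, with the
# differential cost h restricted to the span of a basis matrix Phi.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_states, n_actions, n_basis = 20, 3, 4

# Random MDP: transition kernel P[a, x, y] and cost c[x, a].
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
cost = rng.random((n_states, n_actions))

# Basis matrix Phi (n_states x n_basis); include a constant feature.
Phi = np.hstack([np.ones((n_states, 1)), rng.random((n_states, n_basis - 1))])

# Variables z = (lambda, r). Exact average-cost LP:
#   max lambda  s.t.  lambda + h(x) <= c(x,a) + sum_y P(y|x,a) h(y)  for all (x,a),
# approximated by h = Phi r. In scipy's form we minimize -lambda subject to
# A_ub z <= b_ub, with one constraint row per state-action pair.
rows, rhs = [], []
for a in range(n_actions):
    # Coefficients on r: Phi(x) - sum_y P(y|x,a) Phi(y), stacked over states x.
    coef_r = Phi - P[a] @ Phi
    rows.append(np.hstack([np.ones((n_states, 1)), coef_r]))
    rhs.append(cost[:, a])
A_ub = np.vstack(rows)
b_ub = np.concatenate(rhs)

obj = np.zeros(1 + n_basis)
obj[0] = -1.0  # maximize lambda
res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * (1 + n_basis), method="highs")
assert res.status == 0

lam, r = res.x[0], res.x[1:]
h = Phi @ r  # approximate differential cost function
# Greedy policy w.r.t. h: argmin_a c(x,a) + sum_y P(y|x,a) h(y).
q = cost + np.stack([P[a] @ h for a in range(n_actions)], axis=1)
policy = q.argmin(axis=1)
print("ALP lower bound on optimal average cost:", lam)
print("Greedy policy:", policy)
```

Because the restricted feasible set is contained in that of the exact LP, the optimal lambda of this sketch lower-bounds the optimal average cost; the quality of the greedy policy depends on how well the span of Phi captures the differential cost, which is the kind of gap the paper's bounds quantify.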

  • Publication date: 2013-8
