Mixed Markov decision processes in a semi-Markov environment with discounted criterion

Authors: Hu, QY*; Wang, JL
Source: Journal of Mathematical Analysis and Applications, 1998, 219(1): 1-20.
DOI: 10.1006/jmaa.1997.5792

Abstract

This paper presents a new model: the mixed Markov decision process (MDP) in a semi-Markov environment with the discounted criterion. It describes a system that behaves like an MDP except that it is influenced by an environment modeled as a semi-Markov process. Following each state transition of the environment, the MDP model switches among a discrete-time MDP, a continuous-time MDP, and a semi-MDP. After presenting the model, we show the validity of the optimality equation and the existence of epsilon-optimal policies. Finally, the mixed MDP in a Markov environment is transformed into a discrete-time MDP.
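
For orientation only, the sketch below shows what solving the optimality equation looks like in the simplest setting the abstract mentions, an ordinary discrete-time MDP with the discounted criterion. It is not the paper's mixed model or its transformation. The function name `value_iteration`, the data layout of `P` and `R`, and the two-state example are all assumptions made for illustration. The code iterates the standard equation V(s) = max_a [ r(s, a) + beta * sum_t p(t | s, a) V(t) ] until the values stop changing.

```python
def value_iteration(P, R, beta, tol=1e-10, max_iter=10_000):
    """Approximate the discounted optimal value function of a finite MDP.

    Hypothetical layout, chosen for this sketch:
      P[s][a] is a list of (next_state, probability) pairs,
      R[s][a] is the one-step reward for action a in state s,
      beta is the discount factor with 0 <= beta < 1.
    """
    n = len(P)
    V = [0.0] * n
    for _ in range(max_iter):
        # One application of the optimality operator to the current estimate.
        V_new = [
            max(
                R[s][a] + beta * sum(p * V[t] for t, p in P[s][a])
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        # Stop once successive estimates agree in the sup norm.
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


# Illustrative two-state example: in each state, action 0 stays put
# and action 1 moves to the other state with zero reward.
P = [
    [[(0, 1.0)], [(1, 1.0)]],  # state 0
    [[(1, 1.0)], [(0, 1.0)]],  # state 1
]
R = [
    [1.0, 0.0],  # staying in state 0 pays 1
    [2.0, 0.0],  # staying in state 1 pays 2
]
V = value_iteration(P, R, beta=0.9)
```

In this toy example, staying in state 1 forever is worth 2 / (1 - 0.9) = 20, and from state 0 it is better to move to state 1 first (0.9 * 20 = 18) than to stay (1 / (1 - 0.9) = 10), so the computed values approach 18 and 20.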