Abstract

This paper studies nonstationary continuous-time Markov decision processes in a semi-Markov environment under the discounted criterion. The system itself is modeled by a countable-state nonstationary continuous-time Markov decision process with a nonhomogeneous transition rate family and reward rate function, but it is influenced by its environment, which is modeled by a semi-Markov process. With each change of the environment's state, (1) an instantaneous state transition of the system occurs; (2) an instantaneous reward is received; and (3) the parameters of the nonstationary continuous-time Markov decision process change. The model is formulated precisely, the optimality equation is established, and the existence of ε-optimal policies (ε > 0) is proved.