Abstract
<jats:p>Partially observable Markov decision processes (POMDPs) provide a natural and principled framework for sequential decision-making under uncertainty. However, large-scale POMDPs suffer from exponential growth in the number of belief points and in the size of the policy-tree space. We present a new point-based incremental pruning algorithm based on the piecewise linearity and convexity of the value function. Instead of reasoning about the whole belief space when pruning the cross-sums in POMDP policy construction, our algorithm uses belief points to perform approximate pruning by generating policy trees, and obtains the optimal policy at the belief states encountered in real time. The empirical results indicate that point-based incremental pruning for heuristic search methods can handle large POMDP domains efficiently.</jats:p>
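The idea of pruning cross-sums at belief points rather than over the whole belief space can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cross_sum`, `point_based_prune`, `incremental_cross_sum`) and the array layout (one alpha-vector per row, one belief point per row) are assumptions made for illustration. The sketch relies only on the piecewise linearity and convexity of the value function, so a vector is kept if it attains the maximum value at some sampled belief point.

```python
import numpy as np


def cross_sum(a_set, b_set):
    """Return every pairwise sum a + b of two alpha-vector sets (one vector per row)."""
    num_states = a_set.shape[1]
    return (a_set[:, None, :] + b_set[None, :, :]).reshape(-1, num_states)


def point_based_prune(vectors, beliefs):
    """Approximate pruning: keep only vectors that are maximal at some belief point.

    This replaces exact dominance checks over the whole belief simplex
    with a check at the sampled belief points only.
    """
    values = beliefs @ vectors.T                 # shape: (num_beliefs, num_vectors)
    best = np.unique(np.argmax(values, axis=1))  # index of the winning vector per belief
    return vectors[best]


def incremental_cross_sum(vector_sets, beliefs):
    """Build the cross-sum of several sets, pruning after every pairwise step."""
    result = point_based_prune(vector_sets[0], beliefs)
    for next_set in vector_sets[1:]:
        result = point_based_prune(cross_sum(result, next_set), beliefs)
    return result


# Hypothetical two-state example with three sampled belief points.
beliefs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
set_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.4, 0.4]])
set_b = np.array([[0.0, 0.0], [0.1, 0.1]])
pruned = incremental_cross_sum([set_a, set_b], beliefs)
```

Pruning after each pairwise cross-sum keeps the intermediate sets bounded by the number of belief points, which is what makes the incremental scheme cheaper than forming the full cross-sum first. The result is only guaranteed to be optimal at the sampled points, not across the entire belief space.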
- Publication date: 2014-02
- Affiliation: Shenzhen Polytechnic (深圳职业技术学院)