A Clustering-Based Graph Laplacian Framework for Value Function Approximation in Reinforcement Learning

Xu, Xin<sup>*</sup>; Huang, Zhenhua; Graves, Daniel; Pedrycz, Witold

doi:10.1109/TCYB.2014.2311578

Abstract

Feature representation and function approximation are major research topics in reinforcement learning (RL), because they are needed to handle sequential decision problems with large or continuous state spaces. This paper presents a clustering-based graph Laplacian framework for feature representation and value function approximation (VFA) in RL. Using clustering techniques, namely K-means or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA are then generated automatically by spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximate policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions, and that learning control performance improves for a variety of parameter settings.
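The pipeline summarized in the abstract (subsample states by clustering, build a graph over the cluster centers, take the low-order eigenvectors of the graph Laplacian as basis functions) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmeans` and `laplacian_basis`, the k-nearest-neighbor graph with Gaussian edge weights, the normalized Laplacian, and all parameter values are assumptions chosen for illustration. The sketch uses plain K-means only (not fuzzy C-means), synthetic uniformly sampled 2-D states in place of MDP trajectories, and stops before the policy iteration step.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns k cluster centers (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Fancy indexing returns a copy, so X itself is never modified.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def laplacian_basis(centers, n_neighbors=5, n_basis=8, sigma=0.5):
    """Eigenpairs of a normalized graph Laplacian built on the cluster centers.

    Assumed construction: symmetric k-NN graph with Gaussian weights.
    """
    k = len(centers)
    d2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    W = np.zeros((k, k))
    for i in range(k):
        # Index 0 of the sorted distances is the node itself, so skip it.
        nn = np.argsort(d2[i])[1:n_neighbors + 1]
        W[i, nn] = np.exp(-d2[i, nn] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(k) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # The smoothest eigenvectors (smallest eigenvalues) serve as basis functions.
    return eigvals[:n_basis], eigvecs[:, :n_basis]

# Synthetic stand-in for states collected from a continuous-state MDP.
rng = np.random.default_rng(1)
states = rng.uniform(-1.0, 1.0, size=(2000, 2))

centers = kmeans(states, k=50)           # 2000 samples reduced to 50 graph nodes
eigvals, Phi = laplacian_basis(centers)  # Phi[i, j]: basis j at center i
print(Phi.shape)
```

In this sketch the basis functions are defined only at the cluster centers. Evaluating them at arbitrary states would need an out-of-sample extension, which is omitted here.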

Publication date: 2014-12
Affiliation: National University of Defense Technology, Chinese People's Liberation Army

