摘要
This paper describes the Semi-Online Neural-Q-leaming (SONQL) algorithm designed for real-time learning of reactive robot behaviors. The Q-function is generalized by a multilayer neural network allowing the use of continuous states. The algorithm uses a database of the most recent learning samples to accelerate and improve the convergence. Each SONQL algorithm represents an independent, reactive and adaptive state-action mapping, which implements the function of a robot behavior for one degree of freedom (DOF). The generalization capability of the SONQL algorithm was demonstrated by computer simulation with the '' mountain-car '' benchmark. The SONQL was also investigated by experiment on a mobile robot for a target-following task. Experimental results show that the SONQL is promising for online robot learning.
- 出版日期2007-8-31
- 单位国家自然科学基金委员会