Cooperative learning with joint state value approximation for multi-agent systems

Chen Xin; Chen Gang; Cao Weihua; Wu Min<sup>*</sup>

doi:10.1007/s11768-013-1141-z

Abstract

This paper addresses the 'curse of dimensionality', which becomes intractable when reinforcement learning is scaled to multi-agent systems. The problem grows exponentially with the number of agents, resulting in large memory requirements and slow learning. For cooperative systems, which are common among multi-agent systems, this paper proposes a new multi-agent Q-learning algorithm that decomposes joint-state, joint-action learning into two learning processes: learning the individual action, and approximately learning the maximum value of the joint state. The latter process takes the other agents' actions into account to ensure that the joint action is optimal, and it supports the updating of the former. Simulation results show that the proposed algorithm learns the optimal joint behavior with less memory and in less learning time than friend-Q learning and independent learning.
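The abstract does not give the update rules, so the following is only a rough sketch of how such a decomposition could look, not the authors' algorithm. It assumes each agent keeps an individual-action table Q_i(s, a_i) over the joint state plus a joint-state value table V(s), and it uses an optimistic max-style update for V as an illustrative stand-in. All names (`DecomposedAgent`, `joint_action_table_size`, `decomposed_table_size`) and the parameters `alpha` and `gamma` are hypothetical.

```python
from collections import defaultdict


def joint_action_table_size(n_states, n_actions, n_agents):
    """Entries in one Q(s, a_1, ..., a_n) table over joint actions."""
    return n_states * n_actions ** n_agents


def decomposed_table_size(n_states, n_actions):
    """Entries per agent for Q_i(s, a_i) plus a joint-state value V(s)."""
    return n_states * n_actions + n_states


class DecomposedAgent:
    """Hypothetical agent with two tables; the update rule is an assumption."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate (assumed)
        self.gamma = gamma  # discount factor (assumed)
        self.q = defaultdict(float)  # (joint_state, own_action) -> value
        self.v = defaultdict(float)  # joint_state -> approximate max joint value

    def greedy_action(self, state):
        # Pick the own action with the highest individual-action value.
        return max(range(self.n_actions), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Bootstrap from the joint-state value of the next state.
        target = reward + self.gamma * self.v[next_state]
        # Move the individual-action value toward that target.
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        # Assumed optimistic rule: V(s) keeps the best target seen so far.
        self.v[state] = max(self.v[state], target)
```

Under these assumptions, a system with 100 joint states, 4 actions per agent, and 3 agents would need 6400 entries for a joint-action table but 500 entries per agent for the two decomposed tables, which illustrates the kind of memory saving the abstract describes.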

Publication date: 2013
Affiliation: Central South University


