Abstract

This paper presents a novel optimal control performance standard (CPS) control methodology for interconnected power systems, based on a whole-process R(λ)-learning algorithm. The R(λ)-learning algorithm is built on the average reward model. The objective of the proposed CPS control methodology coincides with that of automatic generation control (AGC), which pursues high CPS compliance over every ten-minute period. Moreover, the R(λ)-learning algorithm converges faster and achieves a higher CPS index value than the Q(λ)-learning algorithm, which is based on a discounted reward model. In addition, the improved controller based on the novel R(λ)-learning algorithm learns online throughout the whole process, and its pre-learning process is replaced by an imitation-learning process. The improved controller thereby overcomes a serious defect of the conventional reinforcement learning controller, which needs an accurate simulation model in order to converge during pre-learning, and it improves learning efficiency and applicability in power systems.
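
To make the contrast between the average reward model and the discounted reward model concrete, the following is a minimal tabular sketch of one R(λ)-style update step. It is an illustration under stated assumptions, not the controller described in this paper: the function name `r_lambda_step`, the step sizes `alpha` and `beta`, the trace decay `lam`, and the use of accumulating traces are all hypothetical choices, and the state, action, and reward definitions for CPS control are not modeled here.

```python
def r_lambda_step(R, e, rho, s, a, r, s_next, alpha=0.1, beta=0.01, lam=0.9):
    """One tabular R(lambda)-style update (illustrative sketch only).

    R, e : lists of lists indexed [state][action] (action values, traces)
    rho  : current estimate of the average reward per step
    Returns the updated rho; R and e are modified in place.
    """
    best_next = max(R[s_next])
    best_here = max(R[s])
    greedy = R[s][a] == best_here  # was the action taken a greedy one?

    # Average-adjusted temporal-difference error: the reward is measured
    # relative to rho, and no discount factor appears.
    delta = r - rho + best_next - R[s][a]

    # The average reward estimate is adjusted only after greedy actions.
    if greedy:
        rho += beta * (r - rho + best_next - best_here)

    # Accumulating eligibility trace for the visited state-action pair.
    e[s][a] += 1.0

    # Propagate the error along the traces, then decay them by lam.
    for i in range(len(R)):
        for j in range(len(R[i])):
            R[i][j] += alpha * delta * e[i][j]
            e[i][j] *= lam
    return rho
```

A discounted Q(λ) update would instead weight the next-state value by a discount factor and would not maintain `rho`. This sketch also omits resetting the traces after exploratory actions, which some variants apply.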

Full text