Abstract

In dynamic wireless sensor networks (WSNs), each sensor node should be able to schedule its own tasks in response to changes in its environment. This scheduling should be performed online and should balance the tradeoff between resource utilization and application performance. To address the frequent exchange of cooperative information required by existing cooperative learning algorithms, this paper proposes QS, a task scheduling algorithm for WSNs based on Q-learning and a shared value function. Specifically, a task model for target monitoring applications and a cooperative Q-learning model are established, and some basic elements of reinforcement learning, including the delayed reward and the state space, are defined. Moreover, based on how the value function changes, QS introduces a sending constraint and an expiration constraint on state values, which reduce how often cooperative information is exchanged while preserving the effect of cooperative learning. Experimental results on NS3 show that QS schedules tasks dynamically in response to current environmental changes. Compared with other cooperative learning algorithms, QS achieves better application performance with reasonable energy consumption, while allowing each sensor node to carry out its functions normally.