Abstract

A learning automaton (LA), which learns its optimal action by continuously interacting with an external environment, is widely encountered in artificial intelligence. A key issue in the evaluation of LA has always been how to trade off "accuracy" against "speed," which essentially comes down to parameter tuning. A recent trend in the design of LA is to make schemes parameter-free, thereby removing the considerable expense incurred by parameter tuning. Nevertheless, the existing parameter-free LA schemes generally rely on a Monte Carlo technique, which avoids the tuning process at the cost of additional computation. This paper presents a new statistics-based parameter-free LA scheme that overcomes the difficulties found in its counterparts; in particular, it innovatively dispenses with the dependence on Monte Carlo methods. More significantly, the learning mechanism, which ordinarily operates in stationary environments, is likewise extended to nonstationary environments. Simulations confirm the effectiveness and efficiency of the proposed algorithm, especially its low computational cost and its strong capability to track abrupt environmental changes.
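
To make the accuracy/speed trade-off mentioned above concrete, the following is a minimal sketch of a classical linear reward-inaction (L_RI) learning automaton interacting with a stationary P-model environment. It is not the parameter-free scheme proposed in this paper; it only illustrates how a tunable learning rate governs the trade-off that parameter tuning must resolve. The names (`reward_probs`, `learning_rate`, the example environment) are illustrative assumptions, not taken from the paper.

```python
import random

def l_ri_automaton(reward_probs, learning_rate=0.05, steps=20000, seed=0):
    """Classical linear reward-inaction (L_RI) learning automaton.

    reward_probs[i] is the environment's (unknown to the automaton)
    probability of rewarding action i.  The action-probability vector p
    is updated only on rewarded steps; penalties leave it unchanged.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r  # start from a uniform action-probability vector

    for _ in range(steps):
        # Sample an action according to the current probability vector.
        action = rng.choices(range(r), weights=p)[0]
        rewarded = rng.random() < reward_probs[action]

        if rewarded:
            # Reward: shift probability mass toward the chosen action.
            p = [(1.0 - learning_rate) * pj for pj in p]
            p[action] += learning_rate
        # Penalty: "inaction", i.e. p is left untouched.

    return p

if __name__ == "__main__":
    # Hypothetical two-action environment: action 1 is optimal (0.8 > 0.6).
    env = [0.6, 0.8]
    # A small learning rate converges accurately but slowly; a large one
    # converges quickly but may lock onto the wrong action -- the
    # accuracy-vs-speed trade-off that parameter tuning must balance.
    for lam in (0.01, 0.2):
        p = l_ri_automaton(env, learning_rate=lam)
        print(f"learning_rate={lam}: final action probabilities {p}")
```

A parameter-free scheme, by contrast, aims to achieve comparable convergence without the user having to choose a value such as `learning_rate` at all, which is the motivation the abstract describes.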