Abstract

Learning classifier systems (LCSs) are evolutionary learning mechanisms that combine genetic algorithms (GAs) with the reinforcement learning paradigm. LCSs evolve state-action-reward mappings in order to propose the best action for each environmental state and maximize the achieved reward. In early versions of LCSs, state-action pairs were mapped to a constant real-valued reward. Thus, to model a fairly complex environment, an LCS had to evolve multiple classifiers for identical state-action pairs in order to represent different levels of the environmental reward. Recently, an extension to the well-known accuracy-based learning classifier system (XCS) was developed that maps state-action pairs to a linear reward function. This extension, called XCSF, can evolve a more compact population than the original XCS. However, further studies have shown that XCSF is unable to develop proper mappings when its input parameters come from particular intervals. In this paper, we propose a new extension to XCSF that maps state-action pairs to a non-linear reward function. Experimental results show that this extension outperforms other XCSF extensions with respect to both the compactness of the evolved population and the accuracy of the final approximation, for any desired range of the input parameters.
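To make the kind of approximation described above concrete, the following Python sketch contrasts a linear XCSF-style predictor with a simple non-linear (polynomial) variant. In XCSF, each classifier computes its reward prediction as a weighted sum of the input augmented with a constant term x0, and refines its weights with a normalized delta rule (NLMS). The class names, parameter values, and the quadratic feature expansion below are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import numpy as np

class LinearClassifier:
    """Minimal XCSF-style reward approximator (illustrative sketch).

    Prediction is a linear function of the input plus a constant
    offset term x0; weights are refined with the normalized least
    mean squares (NLMS) delta rule used by XCSF.
    """

    def __init__(self, n_inputs, eta=0.2, x0=1.0):
        self.eta = eta  # learning rate (assumed value)
        self.x0 = x0    # constant term appended to every input
        self.w = np.zeros(n_inputs + 1)

    def _features(self, x):
        # Prepend x0 so w[0] acts as the offset weight.
        return np.concatenate(([self.x0], np.asarray(x, dtype=float)))

    def predict(self, x):
        return self.w @ self._features(x)

    def update(self, x, target):
        phi = self._features(x)
        # NLMS: step size normalized by the squared feature norm,
        # which is always positive because x0 is non-zero.
        error = target - self.w @ phi
        self.w += (self.eta / (phi @ phi)) * error * phi

class QuadraticClassifier(LinearClassifier):
    """Hypothetical non-linear variant: augment the input with
    squared terms so the same NLMS update fits a second-order
    polynomial of the state instead of a linear function."""

    def __init__(self, n_inputs, eta=0.2, x0=1.0):
        super().__init__(2 * n_inputs, eta, x0)

    def _features(self, x):
        x = np.asarray(x, dtype=float)
        return np.concatenate(([self.x0], x, x ** 2))

# Toy usage: fit the reward surface f(x) = x^2 on [0, 1]. A single
# linear classifier can only approximate it piecewise, whereas the
# quadratic variant can represent it exactly.
rng = np.random.default_rng(0)
cl = QuadraticClassifier(n_inputs=1)
for _ in range(5000):
    x = rng.random(1)
    cl.update(x, float(x[0] ** 2))
print(cl.predict(np.array([0.5])))  # approaches 0.25
```

The intuition this sketch captures is the paper's motivation: when a single classifier can represent a richer (non-linear) slice of the reward surface, fewer classifiers are needed to cover the environment, which is why a non-linear extension can yield a more compact population.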

  • Publication date: 2008-7