Abstract

Among learning classifier systems, the extended classifier system (XCS) stands out. However, the original XCS has no memory mechanism and can learn an optimal policy only in Markovian environments, where the optimal action is determined solely by the current sensory input. In practice, most environments are only partially observable through the agent's sensors, and they belong to the most general class of environments: non-Markovian environments. In these environments, XCS either fails completely or develops only a suboptimal policy, because it is memoryless. In this paper, we develop a new learning classifier system based on XCS, named 'XCSMM', which adds an internal message to XCS as an internal memory and extends each classifier with a memory condition that senses this internal memory. The memory mechanism of XCSMM is simple and clear, and is easy to understand and implement. Four different sets of complex maze problems were used to test XCSMM. Experimental results show that XCSMM is able to evolve optimal or suboptimal solutions in most non-Markovian environments.
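
As a rough illustration of the idea of a classifier with an additional memory condition, the following minimal Python sketch matches a rule against both the current sensory input and an internal message. It is not the XCSMM implementation: the ternary alphabet {0, 1, #} is the usual XCS convention, while the names `MemoryClassifier`, `memory_condition`, `match_set`, the bit-string encoding, and the one-bit internal message are all assumptions made for this example. How the internal message is written or updated is not covered here.

```python
from dataclasses import dataclass

WILDCARD = "#"  # matches either bit, as in the usual XCS ternary alphabet


def matches(pattern: str, bits: str) -> bool:
    """Return True if every position of pattern is '#' or equals the bit."""
    return len(pattern) == len(bits) and all(
        p == WILDCARD or p == b for p, b in zip(pattern, bits)
    )


@dataclass
class MemoryClassifier:
    condition: str         # matched against the current sensory input
    memory_condition: str  # matched against the internal message (hypothetical name)
    action: int

    def is_match(self, sensory_input: str, internal_message: str) -> bool:
        # A rule fires only if both the environment and the memory agree.
        return matches(self.condition, sensory_input) and matches(
            self.memory_condition, internal_message
        )


# Two rules with the same environment condition but different memory
# conditions: the internal message disambiguates states that look alike.
rules = [
    MemoryClassifier(condition="01#", memory_condition="0", action=1),
    MemoryClassifier(condition="01#", memory_condition="1", action=2),
]


def match_set(sensory_input: str, internal_message: str) -> list:
    """Collect the rules that match the input and the internal message."""
    return [r for r in rules if r.is_match(sensory_input, internal_message)]
```

In this sketch, the same sensory input `"010"` leads to different actions depending on the internal message, which is the kind of distinction a memoryless classifier cannot make.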

Full text