Abstract

With advances in data acquisition and storage techniques, machine learning increasingly confronts more and larger datasets. To reduce storage and computation time, and to improve generalization accuracy by removing noise, we propose RePo, a novel instance-based learning algorithm based on the Relative Position view. We treat training set reduction as the problem of selecting which instances to delete, and we develop two intuitive definitions of replaceable structure. These lead to simple and effective principles for handling points: retain the border points, delete the noisy ones, and reduce the internal ones. By generating new prototypes and deleting noise and close border points, RePo is highly effective at storage reduction. We compare RePo with nine other traditional and typical reduction techniques on 16 classification tasks. The results show that RePo outperforms the others in storage requirement while keeping generalization accuracy close to that of the best method.