Active learning through density clustering

Authors: Wang, Min; Min, Fan*; Zhang, Zhi-Heng; Wu, Yan-Xue
Source: Expert Systems with Applications, 2017, 85: 305-317.
DOI: 10.1016/j.eswa.2017.05.046

Abstract

Active learning is used for classification when labeling data is costly, and the main challenge is to identify the critical instances that should be labeled. Clustering-based approaches take advantage of the structure of the data to select representative instances. In this paper, we developed the active learning through density peak clustering (ALEC) algorithm with three new features. First, a master tree was built to express the relationships among the nodes and assist the growth of the cluster tree. Second, a deterministic instance selection strategy was designed using a new importance measure. Third, tri-partitioning was employed to determine the action to be taken on each instance during iterative clustering, labeling, and classifying. Experiments were performed on 14 datasets to compare ALEC against state-of-the-art active learning algorithms. Results demonstrated that the new algorithm achieved higher classification accuracy with the same number of labeled instances.
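
The density peak quantities behind the master tree and the instance selection can be illustrated with a minimal sketch. It assumes the standard density peak clustering definitions: a local density rho, a distance delta to the nearest instance of higher density, and that instance taken as the "master". The abstract does not define its importance measure, so the product rho * delta used here is an assumption. The function name `density_peaks` and the cutoff parameter `dc` are hypothetical. Tri-partitioning and the iterative labeling loop are not shown.

```python
import math


def density_peaks(points, dc):
    """Return (rho, delta, master, importance), one entry per point.

    rho        : number of other points closer than the cutoff dc
    delta      : distance to the nearest point ranked higher by density
    master     : index of that point (-1 for the densest point, the root)
    importance : rho * delta (an ASSUMED stand-in for the paper's measure)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]

    # Rank by descending density; ties are broken by index so that the
    # result is deterministic.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n
    master = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no master; by convention its delta is
            # its largest distance to any point.
            delta[i] = max(dist[i])
            continue
        # The master is the nearest point among those ranked earlier.
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        master[i] = j

    importance = [rho[i] * delta[i] for i in range(n)]
    return rho, delta, master, importance


if __name__ == "__main__":
    # Toy data (hypothetical): two well-separated groups.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
    rho, delta, master, importance = density_peaks(pts, dc=1.5)
    ranked = sorted(range(len(pts)), key=lambda i: -importance[i])
    print("master tree (child -> master):", master)
    print("instances by importance:", ranked)
```

On this toy dataset, the two highest-importance instances fall one in each group, which is the kind of representative a clustering-based selector would query first. Following the `master` links from any instance leads to the root of the tree, the densest point.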