Active seed selection for constrained clustering

Viet Vu Vu*; Labroche Nicolas

doi:10.3233/IDA-150499

Abstract

Active learning for semi-supervised clustering allows algorithms to ask a domain expert for side information in the form of instance-level constraints, for example a set of labeled instances called seeds. The problem is to select the queries to the expert that are likely to improve either the relevance or the quality of the proposed clustering. However, these active methods suffer from several limitations: (i) they are generally tailored to a single clustering paradigm or to clusters of a particular shape and size, (ii) they may be counter-productive if the seeds are not selected appropriately, and (iii) they have to work efficiently with minimal expert supervision. In this paper, we propose a new active seed selection algorithm that relies on a k-nearest neighbors structure to locate dense potential clusters and to efficiently query and propagate expert information. Our approach makes no assumption about the underlying data distribution and can be paired with any clustering algorithm. Comparative experiments conducted on real data sets show the efficiency of this new approach compared to existing ones.
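
The abstract does not give the algorithm's details, so the following is only a minimal illustrative sketch of the general idea (k-nearest-neighbor density, expert queries, label propagation), not the authors' method. The function names `knn_lists` and `select_seeds`, the mean k-NN distance used as a density proxy, the greedy query order, and the rule of propagating a queried label to the k nearest neighbors are all assumptions made for illustration.

```python
import math


def knn_lists(points, k):
    """For each point, return its k nearest neighbours as (distance, index) pairs."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append(dists[:k])
    return neighbours


def select_seeds(points, k, budget, oracle):
    """Greedy seed selection (hypothetical scheme, not the paper's algorithm).

    oracle(i) stands in for the domain expert and returns the label of point i.
    Returns the list of queried indices and a dict of known labels.
    """
    neighbours = knn_lists(points, k)
    # Assumed density proxy: a smaller mean k-NN distance means a denser region.
    mean_dist = [sum(d for d, _ in nb) / len(nb) for nb in neighbours]
    order = sorted(range(len(points)), key=lambda i: mean_dist[i])

    labels = {}
    queried = []
    for i in order:
        if len(queried) >= budget:
            break
        if i in labels:
            continue  # already covered by a propagated label, so no query needed
        label = oracle(i)
        queried.append(i)
        labels[i] = label
        # Assumed propagation rule: unlabeled neighbours inherit the seed's label.
        for _, j in neighbours[i]:
            labels.setdefault(j, label)
    return queried, labels
```

The resulting `labels` dictionary could then be passed as seeds to any seed-based clustering algorithm. Propagating labels purely by neighborhood is a simplification that can mislabel points near cluster boundaries.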

Publication date: 2017

