A novel hypothesis-margin based approach for feature selection with side pairwise constraints

Yang Ming; Song Jing

doi:10.1016/j.neucom.2010.08.006

Abstract

Feature selection is an important problem for pattern classification systems. Supervised feature selection methods generally perform better than unsupervised ones. However, almost all existing supervised methods use class labels as supervision information, and very little work has been done on other forms of supervision such as pairwise constraints, which specify whether a pair of data samples belongs to the same class (must-link constraints) or to different classes (cannot-link constraints). In practice, pairwise constraints can be obtained easily by specifying whether some pairs of examples belong to the same class or not. For this reason, a filter method for feature selection with pairwise constraints, called Constraint Score, was proposed. Unfortunately, Constraint Score does not consider the case where only cannot-link constraints are given. Also, the conclusion drawn for the Constraint Score algorithm, that 'must-link constraints are more important than cannot-link constraints', needs further verification, since cannot-link constraints seem more important than must-link constraints from the viewpoint of the hypothesis-margin, or margin. In addition, like other existing supervised feature selection methods, the existing hypothesis-margin based approach for feature selection, called Simba, also uses class labels as supervision information. In this paper, to further study feature selection with pairwise constraints, we introduce a novel hypothesis-margin based approach for feature selection with side pairwise constraints, called Simba-sc, which uses only cannot-link constraints as supervision information. We compare our algorithm with the well-known Constraint Score, Fisher Score and Laplacian Score algorithms. Experiments are carried out on 6 UCI data sets using three different classifiers. Experimental results show that, with a few cannot-link constraints, Simba-sc achieves similar or even higher performance than Fisher Score with full class labels on all training data, and performance better than or comparable to that of Constraint Score.

Publication date: 2010-10
Affiliation: Nanjing Normal University
